Video Understanding Data Enters Industrial-Scale Supply
Apple Proves Human Judgment Irreplaceable
This week scanned 86 HF orgs · 50 GitHub orgs · 71 blogs · 125 X accounts
Allen AI ships 29 datasets in one week as video multimodal data enters systematic supply [P0]; Qwen talent turbulence collides with its commercial expansion [P0]; OpenAI's commercial expansion and safety controversies escalate in parallel [P1]. Top data demand signal this week: video understanding and tracking data.
Key Findings
This week's 5 findings with high commercial value
Allen AI released 29 datasets under the Molmo2 brand this week, nearly all focused on the video understanding task pipeline:
- molmo2-single-object-track (single object tracking, 2/24)
- molmo2-reasonvos (reasoning video object segmentation, 2/27)
- molmo2-burst (burst detection, 2/23)
- molmo2-mevis/mevis-valid (motion expression video segmentation)
- molmo2-ref-davis17/ref-yt-vos (reference-guided tracking)
- molmo2-revos/vicas/moca/lv-vis (multi-scenario video object segmentation)
- molmo2-hardcodes (hard-coded samples, 2/25)
- molmo2-academic-video-points (tracking-point labels for academic videos, 2/17)
- Molmo2-VideoPoint (video localization data, 360 downloads)
- Molmo2-VideoLocalizedNarratives/CaptionHf/VideoMME/TGIF/TVQA/NewsVideoQA (video narrative and QA series)

Allen AI also released Dolci-Think-SFT-32B (reasoning SFT data, 1,464 downloads), Dolci-Instruct-SFT-Tool-Use-SA (tool-use SFT data), code_fresh_0825_1225 (25M-token code data, 42 languages), SimpleToM (theory-of-mind evaluation), and asta-user-interactions (user interaction data for scientific tools). On GitHub, the molmo2 repository (197 stars) and the molmospaces robotics ecosystem (152 stars, +15) continued to grow.
The hottest post on Reddit's r/LocalLLaMA this week was "Junyang Lin has left Qwen" (799 votes, 3/3); the departure of a core Qwen R&D member sparked widespread community discussion. Meanwhile, the Qwen 3.5 Small series (0.8B-9B) launched on Product Hunt (3/3). Downloads of Qwen3.5-35B-A3B surged from 21K last week to 680K, while the FP8 version hit 330K, 122B-A10B reached 150K, and 27B-FP8 reached 159K. The Qwen ecosystem kept expanding with Qwen3Guard (real-time safety filtering), Qwen-Image-Edit (image editing), Qwen-MT (multilingual translation), and GSPO (scalable RL training). Reddit posts on Qwen3.5-9B abliterated (108 votes) and Qwen3.5-9B Uncensored (30 votes) show that the community has begun systematically modifying the small Qwen models. The Tianchi IEEE AICAS 2026 edge VLM deployment challenge remained in progress.
OpenAI announced three strategic partnerships this week: a strategic cooperation with Amazon (Frontier platform on AWS), a partnership renewal statement with Microsoft, and a signed Department of Defense contract. GPT-5.3 Instant and its system card were released together (3/3), positioned as "smoother everyday conversation." The DoD contract triggered an intense community reaction: the LessWrong post "A Tale of Three Contracts" analyzed in depth how Anthropic was flagged as a supply chain risk; "Mass Surveillance w/ LLMs is the Default Outcome" examined the DoW contract's implications; and the Reddit post "DoW vs Anthropic saga proves closed-source safety is a fraud" (64 votes) demanded open safety evaluations. Anthropic's response to Defense Secretary Pete Hegseth's statement also drew attention. On GitHub, codex stands at 61,868 stars (+670) and openai-agents-python at 19,132 stars.
Together AI released CoderForge-Preview (2/20, 8,413 downloads, 118 likes), currently the largest open-source, test-verified dataset for coding Agents. Fine-tuning Qwen-3 32B on it raised SWE-Bench Verified pass@1 from 23.0% to 59.4%, ranking first among open-data models and second among open-weight models ≤32B. A concurrent Reddit post, "Benchmarked 94 LLM endpoints for jan 2026" (54 votes), shows that open-source models have closed to within 5 points of closed-source models on quality. Mistral released Devstral 2 and Vibe CLI, strengthening its coding Agent toolchain. SWE-rebench V2 (HF Papers) proposed scalable methods for collecting cross-language SWE tasks.
Apple Machine Learning Research published "On the Impossibility of Separating Intelligence from Judgment: The Computational Intractability of Filtering for AI Alignment," which argues from computational complexity theory that alignment filtering is theoretically inseparable from intelligence itself: harmful outputs cannot be perfectly filtered without affecting model intelligence. Apple also published work on hallucination span detection with reasoning, EMBridge (gesture EMG cross-modal transfer), UI component variant instantiation, and LLM-enhanced App Store search. Google released Gemini 3.1 Flash-Lite (the fastest, lowest-cost model in the Gemini 3 series) and the Nano Banana 2 image generation model. The HN post "Open-Source Article 12 Logging for EU AI Act" (35 votes) shows that AI compliance tooling is going open-source.
Demand Signals
Training data demand inferred from model releases
Download Movers
Datasets with the largest download changes this week
| Dataset | Downloads | Weekly Growth |
|---|---|---|
| nvidia/Nemotron-Terminal-Corpus | 744 | +18,500.0% |
| nvidia/HiLiftAeroML | 1,011 | +73.7% |
| google/WaxalNLP | 13,506 | +36.7% |
| allenai/asta-summary-citation-counts | 439 | +13.7% |
| microsoft/SYNUR | 122 | +0.8% |
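
The growth column can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming "Weekly Growth" means the percentage change against the previous week's download count; the report does not state its formula, and the function names here are hypothetical. The previous-week counts are not given in the source, so they are back-computed from the table rather than quoted.

```python
def weekly_growth_pct(previous: int, current: int) -> float:
    """Percentage change in downloads from the previous week to this week."""
    if previous <= 0:
        raise ValueError("previous-week downloads must be positive")
    return (current - previous) / previous * 100.0


def implied_previous(current: int, growth_pct: float) -> float:
    """Previous-week count implied by a current count and a growth percentage."""
    return current / (1.0 + growth_pct / 100.0)


# (dataset, downloads this week, reported weekly growth in %), copied from the table.
movers = [
    ("nvidia/Nemotron-Terminal-Corpus", 744, 18500.0),
    ("nvidia/HiLiftAeroML", 1011, 73.7),
    ("google/WaxalNLP", 13506, 36.7),
    ("allenai/asta-summary-citation-counts", 439, 13.7),
    ("microsoft/SYNUR", 122, 0.8),
]

for name, current, growth in movers:
    # Assumed definition: growth = (current - previous) / previous * 100.
    previous = implied_previous(current, growth)
    print(f"{name}: roughly {previous:.0f} downloads the week before")
```

Under this assumed definition, the top mover's very large percentage corresponds to a very small base the week before, so the absolute download counts are worth reading alongside the growth column.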
Deep Dive — DataRecipe
This week's 3 high-value datasets reverse-analyzed (auto-generated by DataRecipe)
Analyzed 3 datasets this week · 99.6% human effort
Auto-generated by AI Dataset Radar · Updated weekly