Code Agent Race Heats Up
Robotics Data Infrastructure Accelerates
This week scanned 86 HF orgs · 50 GitHub orgs · 71 blogs · 125 X accounts
Code Agent competition intensified, NVIDIA shipped Cosmos-Policy, Numb3rs, and Isaac GR00T, and demand for document understanding data surged. Strongest data demand signal this week: Code Agent data.
Key Findings
This week's 5 findings with high commercial value
Alibaba's Qwen team released Qwen3-Coder-Next (80B MoE, 3B active), purpose-built for coding agents and local development. It scored 44.3 on SWE-Bench Pro and has Day-0 support from both vLLM and SGLang. Together AI has already launched inference services.
NVIDIA released two robotics simulation datasets, RoboCasa-Cosmos-Policy and LIBERO-Cosmos-Policy, alongside the Isaac GR00T N1.6 foundation model (GitHub ⭐6143). It also released the Numb3rs speech text normalization dataset. With 5 datasets and 35 models, NVIDIA was the most active lab this week.
deepseek-ai/DeepSeek-OCR-2 reached 661,725 downloads and 712 likes within one week, becoming the most downloaded Chinese model this week. Meanwhile, Zhipu released GLM-OCR (covered by SGLang), and Mistral released OCR 3.
Seven RLHF-related papers appeared this week, covering French preference data collection (compar:IA), democratized preference alignment (DemPO), rubric improvements, GenRM reasoning quality (R-Align), LLM judge debiasing (FairJudge), DPO over-optimization safeguards (PEPO), and video flow matching (Euphonium). Qwen released the RationaleRM dataset (2026-02-02), which proposes Rationale Consistency as a new evaluation dimension.
stepfun-ai/Step-3.5-Flash reached 228,406 downloads and was released alongside the competitive programming benchmark CF-Div2-Stepfun. Step3-VL-10B (82,755 downloads) focuses on robotic vision-language interaction.
Demand Signals
Training data demand inferred from model releases
Download Movers
Datasets with the largest download changes this week
| Dataset | Downloads | Weekly Growth |
|---|---|---|
| nvidia/Numb3rs | 232 | +139.2% |
| amazon/doc_split | 1,566 | +25.9% |
| Qwen/RationaleRM | 754 | +16.9% |
| nvidia/LIBERO-Cosmos-Policy | 2,173 | +7.0% |
| google/WaxalNLP | 7,277 | +1.9% |
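The report does not say how Weekly Growth is computed, and it does not list last week's download counts. As a rough sketch, assuming growth is the plain week-over-week percent change in downloads, the figures in the table can be sanity-checked by back-calculating an implied prior-week count. The function names `previous_downloads` and `weekly_growth_pct` are illustrative only and are not part of AI Dataset Radar.

```python
def previous_downloads(current: int, growth_pct: float) -> int:
    """Estimate last week's count from this week's count and growth.

    Assumes growth_pct = (current - previous) / previous * 100.
    """
    return round(current / (1 + growth_pct / 100))


def weekly_growth_pct(previous: int, current: int) -> float:
    """Week-over-week percent change, rounded to one decimal place."""
    return round((current - previous) / previous * 100, 1)


# (downloads this week, reported weekly growth in percent), copied from the table
movers = {
    "nvidia/Numb3rs": (232, 139.2),
    "amazon/doc_split": (1566, 25.9),
    "Qwen/RationaleRM": (754, 16.9),
    "nvidia/LIBERO-Cosmos-Policy": (2173, 7.0),
    "google/WaxalNLP": (7277, 1.9),
}

for name, (current, growth) in movers.items():
    implied_prev = previous_downloads(current, growth)  # estimate, not reported data
    recomputed = weekly_growth_pct(implied_prev, current)
    print(f"{name}: implied previous ~{implied_prev}, recomputed growth {recomputed}%")
```

Under that assumption, the large percentage for nvidia/Numb3rs corresponds to a small absolute change (an implied prior week of roughly 97 downloads), which is worth keeping in mind when reading the ranking: it is ordered by relative growth, not by download volume.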
Deep Dive — DataRecipe
This week's 3 high-value datasets reverse-analyzed (auto-generated by DataRecipe)
Data Structure
Risk Assessment
Analyzed 3 datasets this week · 83.9% human effort · all Hard difficulty
Auto-generated by AI Dataset Radar · Updated weekly