Code Agent Race Heats Up
Robotics Data Infrastructure Accelerates
This week's scan covered 86 HF orgs · 50 GitHub orgs · 71 blogs · 125 X accounts
Code agent competition is heating up, NVIDIA shipped Cosmos-Policy, Numb3rs, and Isaac GR00T, and demand for document understanding data is surging. Top data demand signal this week: Code Agent Data.
Key Findings
Five findings with high commercial value this week
Alibaba's Qwen team released Qwen3-Coder-Next (80B MoE, 3B active), designed for coding agents and local development. It scores 44.3 on SWE-Bench Pro and has Day-0 support from both vLLM and SGLang. Together AI has launched an inference service for it.
NVIDIA released two robotics simulation datasets, RoboCasa-Cosmos-Policy and LIBERO-Cosmos-Policy, alongside the Isaac GR00T N1.6 foundation model (6,143 GitHub stars). It also released Numb3rs, a speech text normalization dataset. With 5 datasets and 35 models, NVIDIA was the most active lab this week.
deepseek-ai/DeepSeek-OCR-2 reached 661,725 downloads and 712 likes within one week, making it the week's most downloaded Chinese model. Concurrently, Zhipu (智谱) released GLM-OCR (as reported by SGLang), and Mistral released OCR 3.
Seven RLHF-related papers appeared this week, covering French preference data collection (compar:IA), democratized preference alignment (DemPO), Rubric improvements, GenRM reasoning quality (R-Align), LLM judge debiasing (FairJudge), DPO over-optimization defense (PEPO), and video flow matching (Euphonium). Qwen released the RationaleRM dataset (2026-02-02), proposing a new Rationale Consistency evaluation dimension.
stepfun-ai/Step-3.5-Flash reached 228,406 downloads, and the same organization released the competitive programming benchmark CF-Div2-Stepfun. Step3-VL-10B (82,755 downloads) focuses on robotics vision-language interaction.
Demand Signals
Training data demand inferred from model releases
Download Movers
Datasets with the largest download changes this week
| Dataset | Downloads | Weekly Growth |
|---|---|---|
| nvidia/Numb3rs | 232 | +139.2% |
| amazon/doc_split | 1,566 | +25.9% |
| Qwen/RationaleRM | 754 | +16.9% |
| nvidia/LIBERO-Cosmos-Policy | 2,173 | +7.0% |
| google/WaxalNLP | 7,277 | +1.9% |
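The report does not define how Weekly Growth is computed. As a minimal sketch, assuming it is the usual relative change, (current - previous) / previous, the prior-week count implied by each row can be backed out from the two published columns. The helper name `implied_previous` is hypothetical, and the results are estimates derived from the table, not figures published by the report.

```python
def implied_previous(downloads: int, weekly_growth_pct: float) -> float:
    """Back out the prior-week count.

    Assumes growth = (current - previous) / previous, which the
    report does not state explicitly.
    """
    return downloads / (1 + weekly_growth_pct / 100)


# Rows copied from the Download Movers table: (dataset, downloads, growth %).
movers = [
    ("nvidia/Numb3rs", 232, 139.2),
    ("amazon/doc_split", 1_566, 25.9),
    ("Qwen/RationaleRM", 754, 16.9),
    ("nvidia/LIBERO-Cosmos-Policy", 2_173, 7.0),
    ("google/WaxalNLP", 7_277, 1.9),
]

for name, downloads, growth in movers:
    previous = implied_previous(downloads, growth)
    print(f"{name}: ~{previous:.0f} -> {downloads} ({growth:+.1f}%)")
```

Under this assumption, a large percentage on a small base (as with nvidia/Numb3rs) corresponds to far fewer additional downloads than a small percentage on a large base, so the growth column is best read together with the absolute counts.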
Deep Dive — DataRecipe
This week's 3 high-value datasets reverse-analyzed (auto-generated by DataRecipe)
3 datasets analyzed this week · 83.9% human labor share · All Hard difficulty
Auto-generated by AI Dataset Radar · Updated weekly