Radar Brief Week 15, 2026 · 2026-02-26 — 2026-03-05

Allen AI Withdraws 29 Video Tracking Datasets
AI Data Intelligence Weekly

This week scanned 86 HF orgs · 50 GitHub orgs · 71 blogs · 125 X accounts

0
Valuable Datasets
0
Related Papers
0
Blog Posts
0
Active Repos
One-line Summary

Allen AI withdraws 29 video tracking datasets, signaling video understanding data shortage [P0], coding agent trajectory data becomes scarce resource as TogetherAI withdraws CoderForge-Preview dataset [P0], Chinese embodied intelligence dataset BAAI/ToucHD series withdrawn, tactile data emerges as new frontier [P1]. This week's strongest data demand signal: Video Understanding/Tracking Data.

Key Findings

This week's 5 high commercial value findings

P0 Allen AI Withdraws 29 Video Tracking Datasets, Signaling Video Understanding Data Shortage [P0]

Allen AI suddenly withdrew all 29 video datasets from the Molmo2 series on March 5, including core video understanding benchmarks like VideoLocalizedNarratives, VideoMME, and TVQA. These datasets were originally used to train video tracking and understanding capabilities for their multimodal models. Concurrently, NVIDIA added embodied intelligence repositories like Isaac-GR00T (6,321 stars), showing the industry is competing for video-action alignment data.

Commercial significance → Video understanding data has become a strategic resource for AI companies, no longer freely open-sourced. This type of data requires extensive human judgment to label object trajectories, action intentions, and scene understanding. Knowlyr should immediately establish video labeling capabilities, particularly for video-action alignment data for robot training.
P0 Coding Agent Trajectory Data Becomes Scarce Resource, TogetherAI Withdraws CoderForge-Preview Dataset [P0]

TogetherAI withdrew the CoderForge-Preview dataset on March 5, which contained high-quality coding agent execution trajectories. Concurrently, OpenAI released the codex repository (63,080 stars), and Anthropic's claude-code reached 73,813 stars. The paper "A Rubric-Supervised Critic from Sparse Real-World Outcomes" (2026-03-04) proposes learning evaluation models from sparse human interactions.

Commercial significance → Coding agent execution trajectory data is extremely valuable, requiring senior developers to judge code quality and debugging path rationality. This "process data" is more valuable than final results. Knowlyr should develop specialized code review and trajectory labeling tools.
P1 Chinese Embodied Intelligence Dataset BAAI/ToucHD Series Withdrawn, Tactile Data Emerges as New Frontier [P1]

Beijing Academy of Artificial Intelligence (BAAI) withdrew three robot tactile datasets - ToucHD-Force, ToucHD-Mani, and ToucHD-Sim on March 5, 2026. These datasets originally contained force feedback and tactile information from robot manipulation. NVIDIA concurrently released PhysicalAI-Robotics-NuRec and Arena-GR1-Manipulation datasets, showing tactile modality becoming a key bottleneck in embodied intelligence.

Commercial significance → Tactile data collection requires specialized equipment and precise labeling by human experts to judge force, material properties, and operation success. This is a data type that pure algorithms cannot synthesize, creating unique value space for human judgment.
P1 Evaluation Benchmarks Become Key to AI Safety Compliance, Multiple Evaluation Datasets Face Access Restrictions [P1]

EleutherAI withdrew djinn-problems-v0.9 and rh-misalignment-control-sft datasets. NVIDIA's SPEED-Bench and Microsoft's TestExplora evaluation benchmarks were simultaneously withdrawn. The paper "QEDBENCH: Quantifying the Alignment Gap" (2026-02-24) shows academia is establishing stricter model alignment evaluation standards.

Commercial significance → As AI regulation strengthens, evaluation datasets become critical compliance resources. Human experts are needed to judge whether model outputs meet safety and ethical standards, making evaluation data a high-value service.
P2 Synthetic Data Generation Enters "Controllability" Era, JANUS Framework Addresses Four Major Challenges [P2]

The paper "JANUS: Structured Bidirectional Generation" (2026-03-04) proposes a framework that simultaneously addresses Fidelity, Control (logical constraint control), Reliability (uncertainty estimation), and Efficiency (computational efficiency). SuperAnnotate released MCP Server tools supporting AI agents to directly connect labeling projects.

Commercial significance → Even with advances in synthetic data technology, human judgment is still needed to verify the quality and logical consistency of synthetic data, especially in high-risk application scenarios.

Demand Signals

Infer training data demands from model releases

Data Type Intensity Trend Related Signals
Video Understanding/Tracking Data
Extremely High → Continuing
Allen AI withdraws 29 Molmo2 video datasets, NVIDIA Isaac-GR00T gains 6.3K stars
Coding Agent Trajectories
Extremely High ↑ New
TogetherAI withdraws CoderForge-Preview, OpenAI codex reaches 63K stars
Robot Tactile Data
High ↑ New
BAAI withdraws ToucHD series, NVIDIA releases multiple physical AI datasets
Model Alignment Evaluation
High ↑ New
EleutherAI · NVIDIA · Microsoft simultaneously withdraw evaluation benchmarks
Professional Domain Reasoning
High ↑ New
UniSkill paper matches university courses with professional capabilities, DeepResearch-9K released
CAD Design Instructions
Medium ↑ New
Pointer-CAD paper unifies B-Rep and command sequences
Audio-Visual Collaboration
Medium ↑ New
Crab+ paper proposes explicit collaborative scene understanding model
Medical Dialogue Privacy
Medium ↑ New
PrivMedChat paper explores end-to-end differential privacy RLHF
Multimodal Visual Reasoning Data ↓ Dropped Present in previous issue, absent this issue
Coding Agent Data ↓ Dropped Present in previous issue, absent this issue
Safety Assessment/Alignment Data ↓ Dropped Present in previous issue, absent this issue
RLHF/Preference Alignment Data ↓ Dropped Present in previous issue, absent this issue
Agent Tool/Planning Data ↓ Dropped Present in previous issue, absent this issue
Robot/Tactile Data ↓ Dropped Present in previous issue, absent this issue
Synthetic Data Methodology ↓ Dropped Present in previous issue, absent this issue
EU Compliance Assessment Data ↓ Dropped Present in previous issue, absent this issue

Deep Dive — DataRecipe

This week's 3 high-value datasets reverse-analyzed (auto-generated by DataRecipe)

togethercomputer/CoderForge-Preview
300 samples · 7 fields · Hard
6.0/10
🟢 Recommended for replication

Data Structure

trajectory_id finish_reason image messages reward tools license

Risk Assessment

Medium Risk Labeling quality may fluctuate → Establish strict QA processes and set quality thresholds
Low Risk Data may become outdated over time → Establish continuous update mechanisms
allenai/Dolci-Think-SFT-32B
300 samples · 3 fields · Hard
6.0/10
🟢 Recommended for replication

Data Structure

messages id source

Risk Assessment

Medium Risk Labeling quality may fluctuate → Establish strict QA processes and set quality thresholds
Low Risk Data may become outdated over time → Establish continuous update mechanisms
google/MapTrace
300 samples · 3 fields · Medium
6.5/10
🟢 Recommended for replication

Data Structure

image input label

Risk Assessment

Medium Risk Labeling quality may fluctuate → Establish strict QA processes and set quality thresholds
Low Risk Data may become outdated over time → Establish continuous update mechanisms

This week analyzed 3 datasets · 99.6% human involvement

Want to discuss this issue?

Kai
Kai Founder & CEO
苏文
苏文 AI Documentation & Release Engineer
陆明哲
陆明哲 AI Product Manager

Auto-generated by AI Dataset Radar · Updated weekly

AI Dataset Radar →