Radar Brief Week 19, 2026 · 2026-04-03 — 2026-04-10

Allen AI Turns In-the-Wild 3D Detection into a Full Stack
Gemma 4 Reshapes the Open Agent Ecosystem

This week we scanned 86 Hugging Face orgs · 50 GitHub orgs · 71 blogs · 125 X accounts

One-line Summary

Allen AI shipped a complete model + data + benchmark + mobile-deployment stack for in-the-wild 3D detection [P0]; Google DeepMind launched the full Gemma 4 family on 2026-03-11, and by 2026-04-08 the open ecosystem was already rebuilding downstream data stacks around it [P0]; Anthropic led Project Glasswing, a new cross-industry AI safety alliance, and updated its Claude API release notes on 2026-04-09 [P1]. Strongest data-demand signal this week: in-the-wild 3D detection and stereo-depth data.

Key Findings

This week's five highest-commercial-value findings

P0 From 2026-04-04 to 2026-04-07, Allen AI turned WildDet3D into a full in-the-wild 3D detection stack: model + data + benchmark + mobile [P0]

Starting on 2026-04-04, Allen AI released the WildDet3D series in rapid succession: on 2026-04-04 it launched the `allenai/WildDet3D` model (open-vocabulary monocular 3D detection, currently 34 downloads and 12 likes); on 2026-04-05 it released `allenai/WildDet3D-Data` (a training set with human-reviewed 3D bounding box annotations); on 2026-04-06 it released `allenai/WildDet3D-Stereo4D-Bench` (a stereo-depth ground-truth benchmark generated from Stereo4D videos); and on 2026-04-07 alone it shipped `allenai/WildDet3D-visualization-source` (already at 4,177 downloads), `allenai/WildDet3D-Bench` (2,470 validation images, 9,256 annotations, 785 classes, with a hidden test evaluation set), and the `allenai/WildDet3D-iPhone` model. Allen AI's official blog simultaneously published “Introducing WildDet3D: Open-world 3D detection from a single image.”

Business implications → Allen AI is fully repeating the MolmoWeb playbook: not releasing a single dataset or a single model, but a complete stack of "open model + human-verified training data + public val + hidden test + mobile variant." For data service companies, the truly scarce nodes in this production line are human review for 3D bounding boxes across 785 in-the-wild classes, monocular-to-stereo depth alignment, and annotation feedback loops from real iPhone scenes. Monocular 3D detection used to be largely confined to autonomous driving; Allen AI is now pushing it into the general-purpose "any RGB image → 3D" setting, which should materially lift demand for judgment-intensive 3D annotation.
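Benchmarks like WildDet3D-Bench score predictions by 3D overlap against ground-truth boxes. Real 3D-detection metrics typically use oriented boxes, but the core quantity can be sketched for the simpler axis-aligned case; this is an illustration of the idea, not the benchmark's actual metric:

```python
def box_volume(box):
    """Volume of an axis-aligned 3D box given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

def iou_3d(a, b):
    """Intersection-over-union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])
        hi = min(a[axis + 3], b[axis + 3])
        if hi <= lo:          # no overlap along this axis
            return 0.0
        inter *= hi - lo
    return inter / (box_volume(a) + box_volume(b) - inter)

# Two unit-offset 2x2x2 cubes overlap in a 1x1x1 corner: IoU = 1 / 15
print(iou_3d((0, 0, 0, 2, 2, 2), (1, 1, 1, 3, 3, 3)))
```

The judgment-intensive part of the annotation work is exactly what this sketch hides: for oriented boxes in the wild, the overlap computation and the ground truth itself both depend on human-reviewed orientation and extent.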
P0 Google DeepMind launched the full Gemma 4 family on 2026-03-11, and by 2026-04-08 the open ecosystem was already rebuilding downstream data stacks around it [P0]

`google/gemma-4-31B-it` now has 1,589,761 downloads and 1,595 likes; `google/gemma-4-26B-A4B-it` has 1,269,031 downloads and 575 likes; and `google/gemma-4-E4B-it` and `google/gemma-4-E2B-it` have 961,135 and 646,063 downloads respectively. The whole family ships through image-text-to-text or any-to-any pipelines. DeepMind's blog post, “Gemma 4: Byte for byte, the most capable open models,” emphasizes Gemma 4's fit for reasoning and agentic workflows. NVIDIA simultaneously published “From RTX to Spark: NVIDIA Accelerates Gemma 4 for Local Agentic AI,” positioning Gemma 4 as an early carrier for RTX-native Agents, while Hugging Face published its own companion post, “Welcome Gemma 4.”

Business implications → Gemma 4 is the biggest open release of the week, but the real data signal is not the model itself. The downstream impact is what matters: any team trying to run SFT, preference alignment, or Agent capability injection on Gemma 4 immediately faces the question of where to source matching multimodal, reasoning, and tool-use corpora. That is the wedge for Knowlyr and expert data services: not pretraining corpora, but post-training evaluation sets, tool-call trajectories, and multimodal preference samples at Gemma 4 scale.
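The post-training artifact named above, tool-call trajectories, is usually stored as one JSON record per line with a structural invariant: every tool call must be answered by a matching tool result. A minimal sketch follows; every field name is illustrative, not a Gemma 4 or vendor schema:

```python
import json

# Hypothetical JSONL record for one tool-use trajectory (illustrative schema).
record = {
    "messages": [
        {"role": "user", "content": "What is 23 * 19?"},
        {"role": "assistant",
         "tool_call": {"name": "calculator", "arguments": {"expr": "23*19"}}},
        {"role": "tool", "name": "calculator", "content": "437"},
        {"role": "assistant", "content": "23 * 19 = 437."},
    ],
    "preference": 1.0,  # trajectory-level label from human review (illustrative)
}

def tool_calls_resolved(rec):
    """Structural check: each tool call is followed by a tool result of the same name."""
    msgs = rec["messages"]
    for i, msg in enumerate(msgs):
        call = msg.get("tool_call")
        if call:
            nxt = msgs[i + 1] if i + 1 < len(msgs) else None
            if nxt is None or nxt.get("role") != "tool" or nxt.get("name") != call["name"]:
                return False
    return True

line = json.dumps(record, ensure_ascii=False)  # one line of a JSONL corpus
print(tool_calls_resolved(json.loads(line)))   # True
```

Checks like this are cheap to automate; the expensive part, and the service opportunity, is the human judgment behind the trajectory-level preference label.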
P1 Anthropic led Project Glasswing to form a cross-industry AI safety alliance and updated Claude API release notes on 2026-04-09 [P1]

Anthropic announced Project Glasswing on its news page, with participants including AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, making it one of the most comprehensive AI safety alliances to date. During the same period, Anthropic Research published “Emotion concepts and their function in a large language model.” The `Anthropic/EconomicIndex` dataset has now reached 13,125 downloads and 498 likes, continuing to grow since the previous issue, and the Claude API release notes received a new update on 2026-04-09.

Business implications → Big companies are turning "AI safety + usage-data governance" from isolated research efforts into infrastructure-level coalition building. For data service firms, that means the bar for red teaming, user-behavior auditing, and compliance annotation will become both higher and more centralized. Economic Index-style datasets—real traces of how AI is being allocated into actual economic activity—are increasingly becoming a standalone asset class.
P1 NVIDIA Physical AI and LeRobot continue to lead embodied-data demand, with downloads accelerating again this week [P1]

`nvidia/PhysicalAI-Robotics-Open-H-Embodiment` now has 72,898 downloads and 18 likes, up 42.7% from 51,101 on 2026-03-28; `nvidia/PhysicalAI-Autonomous-Vehicles` has reached 1,006,425 downloads and 826 likes; and `nvidia/SEED-Timeline-Annotations` (BONES-SEED human-motion temporal annotations) was released in parallel. The LeRobot ecosystem also shipped `lerobot/droid_1.0.1`, `lerobot/openarms-hardware-modifications`, and `OpenDriveLab/WorldEngine` this week. The NVIDIA Robotics Blog used National Robotics Week to publish “Latest Physical AI Research, Breakthroughs and Resources,” and AGIBOT disclosed its GO-2 foundation model and Genie Sim 3.0 simulation platform in The Robot Report.

Business implications → Since 2026-02, the robotics-data line has shown no signs of cooling off, with real embodied-data weekly growth still running in the 30%–50% range. That means synthetic data and simulation have not replaced real demonstrations; instead, they are raising the unit value of "real demonstrations + verifiable action boundaries + continuous temporal annotations." For Knowlyr, robot teleoperation, demonstration segmentation and review, and sim-to-real validation samples remain the embodied-data bets most worth making.
P1 This week's papers point to one conclusion: reward models are being forced to evolve, and controllable synthetic data is becoming the new path [P1]

This week, arXiv and Hugging Face Papers saw a concentrated wave of reward-model papers: on 2026-04-08, `ConsistRM: Improving Generative Reward Models via Consistency-Aware Self-Training` and `ReflectRM: Boosting Generative Reward Models via Self-Reflection within a Unified Judgment Framework`; on 2026-04-07, `VL-MDR: Dynamic Dimension Selection and Aggregation for Interpretable Vision-Language Reward Modeling`; on 2026-04-06, `SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning`; and on 2026-04-09, `ProMedical: Hierarchical Fine-Grained Criteria Modeling for Medical LLM Alignment via Explicit Injection` and `Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling`. The two most important additions, both on 2026-04-09, were `Synthetic Data for any Differentiable Target`, which introduces Dataset Policy Gradient (DPG) and treats synthetic-data generation as an optimizable differentiable target, and `Structured Distillation of Web Agent Capabilities Enables Generalization`.

Business implications → Academia delivered two tightly aligned signals this week. First, reward models are no longer just scalar scorers; they are moving toward self-consistency, multi-dimensional decomposition, and interpretability. Second, synthetic data is starting to be treated as a controllable differentiable target rather than something produced by prompt trial and error. These trends cut in opposite directions for data companies: reward-model upgrades should raise the value of high-dimensional human judgment, while DPG-style methods will keep compressing the marginal value of commodity synthetic data. Knowlyr should continue using "human judgment + verifiable feedback" to own the last mile that synthetic data cannot reliably cover.
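The DPG framing of optimizing a data generator against a downstream objective can be illustrated with a toy score-function (REINFORCE) estimator. This is a one-parameter sketch of the general idea under invented assumptions (a Gaussian generator, a quadratic downstream reward), not the paper's algorithm:

```python
import random

random.seed(0)

# Toy "dataset policy": synthetic samples x ~ N(mu, 1). The downstream
# objective rewards samples that sit near TARGET. Generator, reward,
# and hyperparameters are all illustrative stand-ins.
TARGET = 3.0
mu, lr, BATCH = 0.0, 0.05, 64

for _ in range(600):
    xs = [random.gauss(mu, 1.0) for _ in range(BATCH)]   # generate a synthetic batch
    rs = [-(x - TARGET) ** 2 for x in xs]                # per-sample downstream reward
    baseline = sum(rs) / BATCH                           # variance-reduction baseline
    # Score-function gradient: d/dmu log N(x; mu, 1) = (x - mu)
    grad = sum((r - baseline) * (x - mu) for x, r in zip(xs, rs)) / BATCH
    mu += lr * grad                                      # gradient ascent on the generator

# mu has drifted from 0.0 toward TARGET = 3.0
```

The point of the sketch is the loop structure: the generator's parameters are updated by the downstream signal, so data generation becomes an optimization problem rather than prompt trial and error, which is exactly why commodity synthetic data gets compressed.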

Demand Signals

Infer training data demands from model releases

Data Type · Intensity · Trend · Related Signals
In-the-wild 3D detection and stereo-depth data · Extremely high · ↑ New · Allen AI released 5+ WildDet3D assets within one week, separating val and test while emphasizing human review
Embodied teleoperation and robot demonstration data · Extremely high · ↑ New · Open-H-Embodiment reached 72,898 downloads while the LeRobot ecosystem expanded in parallel
Long-video and temporal action annotation · Extremely high · ↑ New · SEED-Timeline-Annotations and BONES-SEED continue expanding; Meta gistbench highlights long-horizon user understanding
Agentic coding / terminal Agent trajectories · Extremely high · ↑ New · GLM-5 reached open-source SOTA on Terminal Bench 2.0; Arcee Trinity-Large-Thinking emphasizes tool calling
Multi-dimensional reward-model training data · Extremely high · ↑ New · ConsistRM / ReflectRM / VL-MDR / ProMedical / SenseAI all landed in the same week
Controllable synthetic-data generation recipes · High · ↑ New · DPG (`Synthetic Data for any Differentiable Target`) treats synthetic data as an optimizable differentiable objective
Multimodal Agent capability-distillation data · High · ↑ New · `Structured Distillation of Web Agent Capabilities Enables Generalization`
Enterprise multilingual speech-foundation data · High · ↑ New · VoxCPM 2 supports 30 languages plus 9 major dialect groups; Mistral Voxtral TTS; Deepgram integrates Together
Economic Index-style real usage traces · Medium · ↑ New · Anthropic EconomicIndex reached 13,125 downloads and continues to behave like a standalone asset
Medical/financial domain RLHF gold annotations · Medium · ↑ New · ProMedical + SenseAI point toward high-value vertical human-in-the-loop data services
Web action trajectory data · ↓ Dropped · Present in previous issue, absent this issue
GUI grounding / screen parsing data · ↓ Dropped · Present in previous issue, absent this issue
Computer Use continuous video demonstrations · ↓ Dropped · Present in previous issue, absent this issue
Code Agent / terminal post-training corpora · ↓ Dropped · Present in previous issue, absent this issue
Robot teleoperation and embodied demonstration data · ↓ Dropped · Present in previous issue, absent this issue
Long-video reasoning and long-horizon multimodal data · ↓ Dropped · Present in previous issue, absent this issue
Implicit preference and real-world usage feedback data · ↓ Dropped · Present in previous issue, absent this issue
Multi-domain mixed SFT data · ↓ Dropped · Present in previous issue, absent this issue
Visual Agent benchmark · ↓ Dropped · Present in previous issue, absent this issue
Speech-to-execution pipeline data · ↓ Dropped · Present in previous issue, absent this issue

Download Movers

Datasets with the largest download changes this week

Dataset · Downloads · Weekly Change
allenai/MolmoWeb-SyntheticTrajs · 1,159 · +155.3%
allenai/MolmoWeb-HumanTrajs · 769 · +92.7%
nvidia/PhysicalAI-Robotics-Open-H-Embodiment · 72,898 · +42.7%
Anthropic/EconomicIndex · 13,125 · +11.9%
google/WaxalNLP · 11,831 · -14.9%
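The change column follows the standard week-over-week formula; a quick sanity check against the Open-H-Embodiment counts quoted in the findings above (51,101 on 2026-03-28, 72,898 this week):

```python
def weekly_growth_pct(current, previous):
    """Week-over-week percent change in downloads."""
    return (current - previous) / previous * 100

# Open-H-Embodiment: 51,101 downloads last week -> 72,898 this week
print(round(weekly_growth_pct(72_898, 51_101), 1))  # 42.7
```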

Want to discuss this issue?

Kai · Founder & CEO
苏文 (Su Wen) · AI Documentation & Release Engineer
陆明哲 (Lu Mingzhe) · AI Product Manager

Auto-generated by AI Dataset Radar · Updated weekly
