ddiddi (Dhruv Diddi)

upvoted 2 collections 4 months ago

VLA Datasets

Collection

5 items • Updated Sep 24, 2025 • 2

Code Reasoning

Collection

7 items • Updated Sep 15, 2025 • 5

upvoted an article 4 months ago

Article

Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

Jun 11, 2025 • 122

upvoted a paper 6 months ago

Preserving Privacy, Increasing Accessibility, and Reducing Cost: An On-Device Artificial Intelligence Model for Medical Transcription and Note Generation

Paper • 2507.03033 • Published Jul 3, 2025 • 9

upvoted an article 7 months ago

Article

LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?

May 11, 2025 • 88

upvoted 4 collections 7 months ago

upvoted 3 papers 8 months ago

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5, 2025 • 80

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Paper • 2505.01441 • Published Apr 28, 2025 • 39

Real-World Gaps in AI Governance Research

Paper • 2505.00174 • Published Apr 30, 2025 • 12

upvoted a paper 9 months ago

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7, 2025 • 110

upvoted an article 10 months ago

Article

Transformers.js v3: WebGPU Support, New Models & Tasks, and More…

Oct 22, 2024 • 80

upvoted a collection 10 months ago

Gemma 3 Release

Collection

28 items • Updated Aug 11, 2025 • 576

upvoted 5 papers 10 months ago

LocAgent: Graph-Guided LLM Agents for Code Localization

Paper • 2503.09089 • Published Mar 12, 2025 • 13

Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol

Paper • 2503.05860 • Published Mar 7, 2025 • 11

AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Paper • 2503.08417 • Published Mar 11, 2025 • 8

"Principal Components" Enable A New Language of Images

Paper • 2503.08685 • Published Mar 11, 2025 • 12

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published Mar 10, 2025 • 101

Foundation Text-Generation Models Below 360M Parameters

Frequently Used Spaces

Leaderboards

MedGemma Release
