NVIDIA GR00T N1 humanoid robotics foundation models for embodied AI, manipulation, and robot learning
Karsten Kuhnke PRO
mindchain
AI & ML interests
Industry-Grade Humanoid Synthetic Motion Data Generation, Mechanistic Interpretability Data Generation, Sparse Autoencoders, Edge IoT, Gemma Scope 2, RLHF, Edge AI, AlpaSim, Alpamayo-R1, Cosmos, Isaac Sim, Isaac Lab, GR00T N1.6, Unreal Engine
Recent Activity
replied to their post about 1 hour ago
Claude Code Self & Continual Learning
Hey everyone! 👋
30 GitHub Stars in 4 Days - Thank You!
I'm really grateful for the positive response to the Claude Reflect System. In just 4 days, 30 developers have shown interest by starring the project. Thank you so much!
What Is Claude Reflect?
Correct once, never again. Claude Reflect helps Claude Code remember your corrections and preferences across sessions. Instead of repeating the same feedback, the system learns and applies it automatically.
Main Features:
🧠 Learning System
- Detects corrections and preferences from conversations
- Stores them permanently in skill files
- Applies learnings in future sessions
🔒 Safety First
- Automatic backups before changes
- YAML validation
- Git version control
⚡ Two Modes
- Manual: Run /reflect when you want
- Auto: Reflects automatically at session end
How It Works
If you correct Claude to use pytest instead of unittest, this preference gets saved. Next time, Claude will remember and use pytest automatically. It's that simple.
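To make the pytest example concrete, here is a minimal sketch of the detect, store, and apply loop. It is not the project's actual implementation: the regex, the function names, the JSON store, and the `.bak` backup file are all assumptions made for illustration (the real system writes to skill files and validates YAML).

```python
import json
import re
import shutil
from pathlib import Path

# Hypothetical pattern: matches phrases like "use pytest instead of unittest".
CORRECTION = re.compile(r"\buse (\w+) instead of (\w+)", re.IGNORECASE)


def detect_correction(message: str):
    """Return (preferred, rejected) if the message looks like a correction, else None."""
    match = CORRECTION.search(message)
    return (match.group(1), match.group(2)) if match else None


def save_preference(store: Path, preferred: str, rejected: str) -> None:
    """Persist a preference, copying any existing store to a .bak file first."""
    prefs = {}
    if store.exists():
        prefs = json.loads(store.read_text())
        # Backup before changes (assumed naming scheme: <file>.bak).
        shutil.copy2(store, store.with_name(store.name + ".bak"))
    prefs[rejected] = preferred
    store.write_text(json.dumps(prefs, indent=2))


def apply_preferences(store: Path, tool: str) -> str:
    """Return the remembered replacement for a tool, or the tool itself."""
    prefs = json.loads(store.read_text()) if store.exists() else {}
    return prefs.get(tool, tool)
```

In this sketch, a later session would call `apply_preferences(store, "unittest")` and get back `"pytest"` once the correction has been saved.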
Getting Started
1. Clone the repository
2. Install dependencies
3. Activate the skill
4. Try it out!
The python-project-creator example shows how the system learns from your feedback.
Give It a Try
https://github.com/haddock-development/claude-reflect-system
Feel free to check it out, give feedback, or contribute. Every bit of input helps improve the project!
Thank you so much for your support!
---
#ClaudeCode #AI #MachineLearning #ContinualLearning #OpenSource #Developer #Coding #Python #Productivity #DevTools #GitHub #SoftwareDevelopment #Programming #AIAssistant #DeveloperTools #CodeQuality #Tech
Feel free to try it yourself.
https://github.com/haddock-development/claude-reflect-system
updated a collection about 5 hours ago
Graphics AI - Visual Computing & Image Synthesis
Organizations
NVIDIA Nemotron PII - Privacy & Data Protection Dataset
NVIDIA Nemotron PII dataset for personally identifiable information detection and privacy-aware NLP
Unitree G1 Dex1 - Humanoid Robot Dexterity Datasets
Unitree G1 humanoid robot Dex1 dexterity datasets with mounted camera for manipulation learning
Unitree G1 BrainCo - Grasping & Manipulation Data
Unitree G1 humanoid robot BrainCo datasets for grasping, object manipulation, and dexterous hand training
- unitreerobotics/G1_Brainco_GraspOreo_Dataset
  Viewer • Updated • 235k • 622
- unitreerobotics/G1_Brainco_GraspRubiksCube_Dataset
  Viewer • Updated • 221k • 632 • 1
- unitreerobotics/G1_Brainco_PickApple_Dataset
  Viewer • Updated • 154k • 153
- unitreerobotics/G1_Brainco_PickCharger_Dataset
  Viewer • Updated • 217k • 613
Hugging Face - LeRobot - Pi0 (Old Version)
Hugging Face LeRobot Pi0 legacy version - archived robotics model for reference and compatibility
Hugging Face - LeRobot - Open X-Embodiment
Open X-Embodiment robotics datasets: cross-platform robot learning for DROID, Kuka, TACO, JACO
LeRobot Pi0 - HuggingFace Robotics Foundation Model
Hugging Face LeRobot Pi0 foundation model for robotics: manipulation, navigation, and embodied AI
LeRobot XVLA - Cross-Embodiment Vision-Language-Action
Hugging Face LeRobot XVLA cross-embodiment vision-language-action models for universal robot control
Hyper Graph Reasoning - Knowledge Graphs for AI Agents
Higher-order knowledge representations and hypergraph reasoning for agentic AI and scientific discovery
NVIDIA Nemotron Orchestrator - Multi-Model Routing
NVIDIA Nemotron Orchestrator 8B for multi-model coordination, task routing, and agentic workflows
Meta RoBERTa - Pretrained NLP & Text Classification
Meta RoBERTa pretrained language models for NLP tasks: classification, NER, sentiment analysis
- FacebookAI/roberta-base
  Fill-Mask • 0.1B • Updated • 9.72M • 551
- FacebookAI/xlm-roberta-large-finetuned-conll03-german
  Token Classification • Updated • 4.68k • 14
- FacebookAI/xlm-roberta-large-finetuned-conll02-spanish
  Fill-Mask • 0.6B • Updated • 38 • 2
- FacebookAI/xlm-roberta-large
  Fill-Mask • 0.6B • Updated • 3.49M • 487
NVIDIA Physical AI - Autonomous Vehicles & Robotics
NVIDIA Physical AI models for robotics, embodied intelligence, and real-world interaction
Qwen3 VL Reranker - Multimodal RAG Ranking Models
Qwen3 Vision-Language reranker models for RAG pipelines and multimodal document retrieval
Facebook/Meta - Research Plan Dataset
Meta Research Plan datasets for AI research planning, scientific reasoning, and agent workflows
NVIDIA Clara Medical - Healthcare & Clinical NLP
NVIDIA Clara medical AI models for healthcare, clinical NLP, and medical imaging analysis
NVIDIA Clara Molecular - Drug Discovery & Chemistry
NVIDIA Clara molecular models for drug discovery, molecular property prediction, and computational chemistry
Nvidia Nemotron RAG - Reranking
NVIDIA Nemotron reranking models for RAG pipelines, search result optimization, and document ranking
NVIDIA Alpamayo-R1 - Reasoning & Physical AI Models
NVIDIA Alpamayo-R1 reasoning model for autonomous driving: chain-of-thought reasoning and action prediction for long-tail scenarios
- nvidia/Alpamayo-R1-10B
  Robotics • 11B • Updated • 23.9k • 302
- nvidia/PhysicalAI-Autonomous-Vehicles
  Updated • 155k • 693
- nvidia/PhysicalAI-Autonomous-Vehicles-NuRec
  Updated • 9.51k • 110
- Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
  Paper • 2511.00088 • Published • 3
NVIDIA Nemotron Speech - ASR & Text-to-Speech
NVIDIA Nemotron speech models for ASR, text-to-speech, and voice AI applications
- nvidia/nemotron-speech-streaming-en-0.6b
  Automatic Speech Recognition • Updated • 4.94k • 388
- nvidia/parakeet-tdt-0.6b-v3
  Automatic Speech Recognition • Updated • 71.6k • 552
- nvidia/parakeet_realtime_eou_120m-v1
  Updated • 588 • 106
- nvidia/multitalker-parakeet-streaming-0.6b-v1
  Automatic Speech Recognition • Updated • 624 • 63
OpenAI GPT-OSS - Steering Vectors & SAE Research
Open-source GPT models with steering vectors for controllable generation and behavior modification
NVIDIA Cosmos Reason 2 - World Model Reasoning
NVIDIA Cosmos Reason 2 models for world model reasoning, physics simulation, and causal understanding
NVIDIA Cosmos 2 - Cosmos-Predict 2.5
NVIDIA Cosmos-Predict 2.5 models for world simulation, future frame prediction, and physical AI
Edge & Smartphone - On-Device Mobile AI Models
On-device AI models optimized for smartphone deployment: mobile LLMs, edge inference, and efficient architectures
NVIDIA Nemotron Safety - AI Alignment Datasets
NVIDIA Nemotron safety datasets for AI alignment, content moderation, and responsible AI training
- nvidia/Nemotron-AIQ-Agentic-Safety-Dataset-1.0
  Viewer • Updated • 10.8k • 1.26k • 10
- nvidia/Nemotron-Content-Safety-Reasoning-Dataset
  Preview • Updated • 72 • 5
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
  Viewer • Updated • 33.4k • 2.99k • 72
- nvidia/Nemotron-Content-Safety-Audio-Dataset
  Viewer • Updated • 1.93k • 1.34k • 3
NVIDIA Nemotron VLM - Vision-Language Training Data
NVIDIA Nemotron vision-language datasets for multimodal training, image understanding, and VLM finetuning
Deep Thinking - Extended Chain-of-Thought Reasoning
Deep thinking and reasoning models for extended chain-of-thought, deliberative alignment, and complex problem solving
Small Coders - Lightweight Code Generation Models
Lightweight code generation models for edge deployment, IDE integration, and fast code completion
YOLO - Real-Time Object Detection Models
YOLO object detection models for real-time computer vision, autonomous systems, and video analytics
Affordable Coding APIs - Cost-Effective LLM Endpoints
Cost-effective coding API providers and affordable LLM endpoints for development and prototyping
RLM - Neuro-Symbolic Architecture - Reasoning Traces
Inference Wrapper - Models: Root LLM (The Architect) + Python REPL (The Engine) + Sub LLMs (The Workers) spawned by queries
NVIDIA Nemotron Post-Training - RLHF & SFT Data
NVIDIA Nemotron post-training datasets for RLHF, instruction tuning, and alignment fine-tuning
NVIDIA Nemotron Pre-Training - Foundation Model Data
NVIDIA Nemotron pre-training datasets for large language model training and foundation model development
Topological Transformer - Deepseek
: Manifold-Constrained Hyper-Connections
Deep Research - Autonomous AI Literature Review
Deep research AI agents and models for autonomous literature review, scientific reasoning, and knowledge synthesis
PP-StructureV3 - Document Analysis & Table OCR
PaddlePaddle PP-StructureV3 for document analysis, table recognition, and intelligent document processing
Circuit Sparsity - Neural Network Interpretability
Circuit sparsity research for neural network pruning, mechanistic interpretability, and efficient model compression
Text to Motion - Human Animation & Gesture AI
Text-to-motion generation models for human animation, gesture synthesis, and motion capture AI
TTS - Text-to-Speech & Voice Synthesis Models
Text-to-Speech models for voice synthesis, neural TTS, and natural language audio generation
Audio Segmenting - Meta SAM 3 Audio
Audio segmentation models based on Meta SAM architecture for sound separation and audio understanding
NVIDIA Nemotron V3 - Post-Training Datasets
Mamba/Transformers Hybrid Combo
Open Source AI - Fully Open Weights & Training Data
Fully open-source AI models with permissive licenses for commercial use and research
Image to 3D - Single-Image 3D Reconstruction
Image-to-3D generation models for single-image 3D reconstruction, mesh generation, and 3D asset creation
Deep Research Agents - Specialized Search & Reasoning
Specialized deep research models for domain-specific scientific reasoning and literature analysis
IBM Granite - Enterprise AI & Code Generation
IBM Granite foundation models for enterprise AI, code generation, and multilingual NLP tasks
Small OCR - Lightweight Text Recognition for Edge
Lightweight OCR models for edge deployment, mobile text recognition, and efficient document processing
Hierarchical RL - Multi-Level Decision Making
Hierarchical reinforcement learning models for multi-level decision making and complex task decomposition
Bread & Butter - Top Production-Ready LLMs 2025
Top LLMs 2025: ZAI GLM-4.7 (358B) & Moonshot Kimi-K2-Thinking. Next-gen reasoning, code, multilingual. State-of-the-art performance. Production-ready.
Haddock Custom Sparse Autodecoders
Custom JumpReLU Sparse Autoencoders for mechanistic interpretability. T5Gemma-2 SAEs across all layers. AI safety & interpretability research.
Nvidia Nemo-Gym
NVIDIA Nemotron RL datasets for AI agent training. Web search, workplace tasks, instruction following, structured outputs. RLHF & alignment research.
- nvidia/Nemotron-RL-knowledge-web_search-mcqa
  Viewer • Updated • 2.93k • 267 • 7
- nvidia/Nemotron-RL-agent-workplace_assistant
  Viewer • Updated • 1.8k • 239 • 12
- nvidia/Nemotron-RL-instruction_following
  Preview • Updated • 165 • 9
- nvidia/Nemotron-RL-instruction_following-structured_outputs
  Viewer • Updated • 9.95k • 207 • 25
Trained
Custom-trained models by mindchain: reward models & SAEs. Haddock Reward Mini (8B), function-specific SAEs. AI interpretability & alignment research.
Auto Decoders
JumpReLU SAEs for Gemma 3 interpretability. EleutherAI models, DeepSeek-R1, Pythia SAEs. Mechanistic interpretability & AI safety research.
Google FunctionGemma (Gemma 3)
Function calling: Google FunctionGemma-270m-IT & Mobile Actions dataset (9.65k). Efficient tool use in small LMs. AI agent development.
Google TranslateGemma - 55 Language Translation Models
Google TranslateGemma multilingual translation models supporting 55 languages for neural machine translation and cross-lingual NLP
Unitree Z1 Arm - Dual Dexterity Manipulation Data
Unitree Z1 robotic arm datasets for manipulation learning, grasp planning, and arm control training
- unitreerobotics/Z1_Dual_Dex1_CleanupPencils_Dataset
  Viewer • Updated • 133k • 374 • 2
- unitreerobotics/Z1_Dual_Dex1_FoldClothes_Dataset
  Viewer • Updated • 293k • 286 • 2
- unitreerobotics/Z1_Dual_Dex1_PourCoffee_Dataset
  Viewer • Updated • 443k • 640 • 1
- unitreerobotics/Z1_Dual_Dex1_StackBox_Dataset
  Viewer • Updated • 117k • 387 • 3
Unitree Robotics - G1_Dex3_datasets
Unitree G1 humanoid robot Dex3 advanced dexterity datasets for fine-grained manipulation tasks
- unitreerobotics/G1_Dex3_BlockStacking_Dataset
  Viewer • Updated • 281k • 1.02k • 2
- unitreerobotics/G1_Dex3_CameraPackaging_Dataset
  Viewer • Updated • 256k • 746
- unitreerobotics/G1_Dex3_GraspSquare_Dataset
  Viewer • Updated • 281k • 1.02k
- unitreerobotics/G1_Dex3_ObjectPlacement_Dataset
  Viewer • Updated • 98.3k • 802 • 4
Unitree UnifoLM WMA - World Model Agent for Robotics
Unitree UnifoLM World Model Agent for robot learning, action prediction, and embodied AI planning
LeRobot Pi0.5 - Robotics Foundation Model v0.5
Hugging Face LeRobot Pi0.5 intermediate robotics model with improved action generation capabilities
LeRobot SmolVLA - Compact Vision-Language-Action
Hugging Face LeRobot SmolVLA compact vision-language-action model for efficient robot control
Hugging Face - LeRobot - Behavior 1K
Hugging Face LeRobot Behavior-1K large-scale robotics benchmark for diverse manipulation tasks
Atlas RL - Intelligent Architecture Reinforcement Learning
Atlas Reinforcement Learning - How to put the Intelligence into the Architecture?!
Dual RTX 6000 Build - 96GB VRAM Optimized LLMs
Optimized LLMs for dual NVIDIA RTX 6000 GPU setup - 96GB VRAM configurations for local inference
LeRobot Pi0Fast - Real-Time Robotics Inference
Hugging Face LeRobot Pi0Fast optimized robotics models for real-time inference and fast action generation
Google Embedding Gemma - Text Embeddings for RAG
Google Embedding Gemma models for semantic search, RAG applications, and text embeddings
Nvidia Thor + Raspberry + Oak 4D Dual Build
NVIDIA Thor SoC with Raspberry Pi and OAK-4D stereo camera for edge robotics and embodied AI
Qwen3 VL Embeddings - Multimodal Vector Search
Qwen3 Vision-Language embedding models for multimodal RAG, semantic search, and vector databases
NVIDIA Nemotron Content Safety - Toxicity Detection
NVIDIA Llama Nemotron content safety models for toxicity detection and safe AI deployment
NVIDIA Clara Biology - Genomics & Protein AI
NVIDIA Clara biology models for genomics, protein structure, and computational biology research
NVIDIA Clara Medical - Clinical AI & Radiology
NVIDIA Clara medical AI for clinical NLP, radiology analysis, and healthcare decision support
NVIDIA Nemotron Embeddings - RAG & Vector Search
NVIDIA Nemotron embedding models for RAG, semantic search, and vector database applications
- nvidia/llama-nemotron-embed-vl-1b-v2
  Feature Extraction • 2B • Updated • 1.98k • 14
- nvidia/llama-nemotron-embed-1b-v2
  Feature Extraction • 1B • Updated • 20.8k • 30
- nvidia/llama-embed-nemotron-8b
  Feature Extraction • 8B • Updated • 470k • 120
- nvidia/NV-Embed-v2
  Feature Extraction • 8B • Updated • 24.5k • 498
DiT - Diffusion Transformer for Video & Audio Gen
Diffusion Transformer models for multimodal video and audio generation, synthesis, and editing
- Lightricks/LTX-2
  Image-to-Video • Updated • 1.36M • 1.07k
- Masked Audio Generation using a Single Non-Autoregressive Transformer
  Paper • 2401.04577 • Published • 44
- LTX-2: Efficient Joint Audio-Visual Foundation Model
  Paper • 2601.03233 • Published • 120
- YOLO-World: Real-Time Open-Vocabulary Object Detection
  Paper • 2401.17270 • Published • 43
NVIDIA Nemotron Cascade - Multi-Stage LLM Inference
NVIDIA Nemotron Cascade for multi-stage inference, model routing, and efficient LLM deployment
- nvidia/Nemotron-Cascade-8B
  Text Generation • 8B • Updated • 5.96k • 55
- nvidia/Nemotron-Cascade-8B-Thinking
  Text Generation • 8B • Updated • 1.11k • 34
- nvidia/Nemotron-Cascade-14B-Thinking
  Text Generation • 15B • Updated • 2.67k • 65
- nvidia/Nemotron-Cascade-8B-Intermediate-ckpts
  Text Generation • Updated • 10
Google Gemma 3 LiteRT - Mobile & Edge Optimized
Google Gemma 3 LiteRT models optimized for TensorFlow Lite runtime and mobile edge deployment
- google/functiongemma-270m-it
  Text Generation • 0.3B • Updated • 89.8k • 817
- litert-community/embeddinggemma-300m
  Sentence Similarity • Updated • 1.26k • 31
- google/gemma-3n-E2B-it-litert-lm
  Text Generation • Updated • 19.9k • 274
- google/gemma-3n-E4B-it-litert-lm
  Text Generation • Updated • 20.7k • 282
NVIDIA Cosmos Transfer 2.5 - Style & Domain Transfer
NVIDIA Cosmos Transfer 2.5 models for domain adaptation, style transfer, and video generation
Robotics - Foundation Models for Embodied AI
Robotics foundation models, datasets, and research for embodied AI, manipulation, and autonomous systems
NVIDIA NeMo Gym - RL Agent Training Datasets
NVIDIA Nemotron reinforcement learning datasets from NeMo Gym for agent training and RLHF
- nvidia/Nemotron-RL-knowledge-web_search-mcqa
  Viewer • Updated • 2.93k • 267 • 7
- nvidia/Nemotron-RL-agent-workplace_assistant
  Viewer • Updated • 1.8k • 239 • 12
- nvidia/Nemotron-RL-instruction_following
  Preview • Updated • 165 • 9
- nvidia/Nemotron-RL-instruction_following-structured_outputs
  Viewer • Updated • 9.95k • 207 • 25
NVIDIA Nemotron RAG Datasets - Retrieval Training
NVIDIA Nemotron RAG datasets for retrieval-augmented generation, document QA, and knowledge grounding
Google Gemma 3N - Mobile Multimodal Edition
Google Gemma 3N mobile multimodal models for on-device vision-language tasks and efficient edge deployment
Small Thinking - Compact Reasoning Models for Edge
Compact reasoning models for efficient chain-of-thought inference on resource-constrained devices
Self-Correcting Delta Transformer - Adaptive LLMs
Self-Correcting Delta Transformer - DDL provides the Hardware mechanism (The Erazor), NL solves the software problem.
Meta SAM - Segment Anything Models (Image & Audio)
Meta SAM Segment Anything models for zero-shot image segmentation, object detection, and visual understanding
Edge LLMs - Ultra-Compact High-Performance Models
Ultra-compact LLMs for edge deployment: sub-1B parameter models with strong performance for IoT and mobile
NVIDIA Nemotron Personas - Regional Character Data
NVIDIA Nemotron persona datasets for character AI, personality modeling, and conversational agent training
NVIDIA Nemotron Reward - RLHF & Alignment Models
LLM as a judge
- nvidia/Llama-3.3-Nemotron-70B-Reward-Principle
  Text Generation • 71B • Updated • 94 • 6
- nvidia/Qwen3-Nemotron-32B-GenRM-Principle
  Text Generation • 33B • Updated • 312 • 11
- nvidia/Qwen3-Nemotron-32B-RLBFF
  Text Generation • 33B • Updated • 56 • 27
- nvidia/Qwen3-Nemotron-8B-BRRM
  Text Generation • Updated • 137 • 8
Embeddings - Semantic Search & RAG Vector Models
Text and multimodal embedding models for semantic search, RAG pipelines, and vector similarity applications
Edge Translation - On-Device Multilingual NLP
Edge-optimized translation models for on-device multilingual NLP and low-latency language translation
Qwen Long Reasoning - Extended Context CoT Models
Qwen long-context reasoning models for extended chain-of-thought, complex problem solving, and mathematical reasoning
OCR Models - Optical Character Recognition & Text Extraction
Optical Character Recognition models for text extraction, document digitization, and scene text detection
- PaddlePaddle/PaddleOCR-VL
  Image-Text-to-Text • 1.0B • Updated • 12.5k • 1.49k
- baidu/ERNIE-4.5-0.3B-Paddle
  Text Generation • 0.4B • Updated • 76 • 19
- baidu/ERNIE-4.5-21B-A3B-Paddle
  Text Generation • 22B • Updated • 63 • 13
- ibm-granite/granite-docling-258M
  Image-Text-to-Text • 0.3B • Updated • 212k • 1.09k
IQuest LoopCoder - Iterative Code Generation Models
IQuest LoopCoder models for iterative code generation, self-refinement, and automated debugging
ASR Models - Automatic Speech Recognition & Transcription
Automatic Speech Recognition models for transcription, voice AI, and multilingual speech-to-text
Mobile App AI - On-Device Agents & Function Calling
Mobile app engine models for on-device AI, app development automation, and mobile-first ML
Hybrid Attention - Efficient Transformer Architectures
Hybrid attention models combining local and global attention for efficient long-context processing
Datasets Pretraining - Nemotron V3
Mamba/Transformers Hybrid Combo
Byte Level Models - Tokenizer-Free Language Models
Byte-level language models for tokenizer-free NLP, multilingual text, and raw byte processing
Video Analysis - Action Recognition & Understanding
Video analysis models for action recognition, temporal understanding, and video content classification
Diffusion LLMs - Non-Autoregressive Text Generation
Diffusion-based language models for text generation, discrete diffusion, and non-autoregressive NLP
Video Generation - Text-to-Video & AI Synthesis
Video generation models for text-to-video, image-to-video, and AI video synthesis
Graphics AI - Visual Computing & Image Synthesis
Graphics and visual computing models for rendering, image synthesis, and computer graphics AI
Meta VL-JEPA - Vision-Language Prediction Models
Meta VL-JEPA Vision-Language Joint Embedding Predictive Architecture for video understanding
Google Gemma Scope 2 - Neuronpedia
Google Gemma Scope 2: JumpReLU SAEs for Gemma 3 interpretability. 270M PT/IT, 1B PT variants. Neuronpedia integration. Mechanistic analysis.
Google Gemma - Quantized
Quantized Gemma 3 models: QAT for efficient deployment. Gemma-3-27B-IT Q4. Low memory, fast inference. Edge & production-ready LLMs.
Reward Models
NVIDIA Nemotron reward models: 340B, 8B BRRM, 70B/32B principle-based. RLHF training, preference learning, AI alignment research.
Google T5 Gemma 2
T5Gemma-2 encoder-decoder models: 270M, 1B, 4B sizes. Text-to-text, summarization, translation. Google's architecture for structured generation.
- google/t5gemma-2-270m-270m
  Image-Text-to-Text • 0.8B • Updated • 20.3k • 168
- google/t5gemma-2-4b-4b
  Image-Text-to-Text • 9B • Updated • 11.8k • 135
- google/t5gemma-2-1b-1b
  Image-Text-to-Text • 2B • Updated • 14.3k • 67
- T5Gemma 2: Seeing, Reading, and Understanding Longer
  Paper • 2512.14856 • Published • 1
Nvidia - Nemotron - Mamba/Transformers Hybrid Combo
Hybrid Mamba + Transformer architectures. NVIDIA Nemotron-3-Nano-30B-A3B (32B). BF16 & GGUF. Efficient long-context & in-context learning.
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
  Text Generation • 32B • Updated • 325k • 572
- bartowski/nvidia_Nemotron-3-Nano-30B-A3B-GGUF
  Text Generation • 32B • Updated • 8.83k • 9
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
  Text Generation • 32B • Updated • 25.3k • 90
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
  Text Generation • 32B • Updated • 797k • 248
NVIDIA GR00T N1 - Humanoid Robotics Foundation Models
NVIDIA GR00T N1 humanoid robotics foundation models for embodied AI, manipulation, and robot learning
Google TranslateGemma - 55 Language Translation Models
Google TranslateGemma multilingual translation models supporting 55 languages for neural machine translation and cross-lingual NLP
NVIDIA Nemotron PII - Privacy & Data Protection Dataset
NVIDIA Nemotron PII dataset for personally identifiable information detection and privacy-aware NLP
Unitree Z1 Arm - Dual Dexterity Manipulation Data
Unitree Z1 robotic arm datasets for manipulation learning, grasp planning, and arm control training
-
unitreerobotics/Z1_Dual_Dex1_CleanupPencils_Dataset
Viewer • Updated • 133k • 374 • 2 -
unitreerobotics/Z1_Dual_Dex1_FoldClothes_Dataset
Viewer • Updated • 293k • 286 • 2 -
unitreerobotics/Z1_Dual_Dex1_PourCoffee_Dataset
Viewer • Updated • 443k • 640 • 1 -
unitreerobotics/Z1_Dual_Dex1_StackBox_Dataset
Viewer • Updated • 117k • 387 • 3
Unitree G1 Dex1 - Humanoid Robot Dexterity Datasets
Unitree G1 humanoid robot Dex1 dexterity datasets with mounted camera for manipulation learning
Unitree Robotics - G1_Dex3_datasets
Unitree G1 humanoid robot Dex3 advanced dexterity datasets for fine-grained manipulation tasks
-
unitreerobotics/G1_Dex3_BlockStacking_Dataset
Viewer • Updated • 281k • 1.02k • 2 -
unitreerobotics/G1_Dex3_CameraPackaging_Dataset
Viewer • Updated • 256k • 746 -
unitreerobotics/G1_Dex3_GraspSquare_Dataset
Viewer • Updated • 281k • 1.02k -
unitreerobotics/G1_Dex3_ObjectPlacement_Dataset
Viewer • Updated • 98.3k • 802 • 4
Unitree G1 BrainCo - Grasping & Manipulation Data
Unitree G1 humanoid robot BrainCo datasets for grasping, object manipulation, and dexterous hand training
-
unitreerobotics/G1_Brainco_GraspOreo_Dataset
Viewer • Updated • 235k • 622 -
unitreerobotics/G1_Brainco_GraspRubiksCube_Dataset
Viewer • Updated • 221k • 632 • 1 -
unitreerobotics/G1_Brainco_PickApple_Dataset
Viewer • Updated • 154k • 153 -
unitreerobotics/G1_Brainco_PickCharger_Dataset
Viewer • Updated • 217k • 613
Unitree UnifoLM WMA - World Model Agent for Robotics
Unitree UnifoLM World Model Agent for robot learning, action prediction, and embodied AI planning
Hugging Face - LeRobot - Pi0 (Old Version)
Hugging Face LeRobot Pi0 legacy version - archived robotics model for reference and compatibility
LeRobot Pi0.5 - Robotics Foundation Model v0.5
Hugging Face LeRobot Pi0.5 intermediate robotics model with improved action generation capabilities
Hugging Face - LeRobot - Open X-Embodiment
Open X-Embodiment robotics datasets: cross-platform robot learning for DROID, Kuka, TACO, JACO
LeRobot SmolVLA - Compact Vision-Language-Action
Hugging Face LeRobot SmolVLA compact vision-language-action model for efficient robot control
LeRobot Pi0 - HuggingFace Robotics Foundation Model
Hugging Face LeRobot Pi0 foundation model for robotics: manipulation, navigation, and embodied AI
Hugging Face - LeRobot - Behavior 1K
Hugging Face LeRobot Behavior-1K large-scale robotics benchmark for diverse manipulation tasks
LeRobot XVLA - Cross-Embodiment Vision-Language-Action
Hugging Face LeRobot XVLA cross-embodiment vision-language-action models for universal robot control
Atlas RL - Intelligent Architecture Reinforcement Learning
Atlas Reinforcement Learning - How to put the Intelligence into the Architecture?!
Hyper Graph Reasoning - Knowledge Graphs for AI Agents
Higher-order knowledge representations and hypergraph reasoning for agentic AI and scientific discovery
Dual RTX 6000 Build - 96GB VRAM Optimized LLMs
Optimized LLMs for dual NVIDIA RTX 6000 GPU setup - 96GB VRAM configurations for local inference
NVIDIA Nemotron Orchestrator - Multi-Model Routing
NVIDIA Nemotron Orchestrator 8B for multi-model coordination, task routing, and agentic workflows
LeRobot Pi0Fast - Real-Time Robotics Inference
Hugging Face LeRobot Pi0Fast optimized robotics models for real-time inference and fast action generation
Meta RoBERTa - Pretrained NLP & Text Classification
Meta RoBERTa pretrained language models for NLP tasks: classification, NER, sentiment analysis
-
FacebookAI/roberta-base
Fill-Mask • 0.1B • Updated • 9.72M • • 551 -
FacebookAI/xlm-roberta-large-finetuned-conll03-german
Token Classification • Updated • 4.68k • • 14 -
FacebookAI/xlm-roberta-large-finetuned-conll02-spanish
Fill-Mask • 0.6B • Updated • 38 • 2 -
FacebookAI/xlm-roberta-large
Fill-Mask • 0.6B • Updated • 3.49M • • 487
Google Embedding Gemma - Text Embeddings for RAG
Google Embedding Gemma models for semantic search, RAG applications, and text embeddings
NVIDIA Physical AI - Autonomous Vehicles & Robotics
NVIDIA Physical AI models for robotics, embodied intelligence, and real-world interaction
Nvidia Thor + Rasberry + Oak 4D Dual Build
NVIDIA Thor SoC with Raspberry Pi and OAK-4D stereo camera for edge robotics and embodied AI
Qwen3 VL Reranker - Multimodal RAG Ranking Models
Qwen3 Vision-Language reranker models for RAG pipelines and multimodal document retrieval
Qwen3 VL Embeddings - Multimodal Vector Search
Qwen3 Vision-Language embedding models for multimodal RAG, semantic search, and vector databases
Facebook/Meta - Research Plan Dataset
Meta Research Plan datasets for AI research planning, scientific reasoning, and agent workflows
NVIDIA Nemotron Content Safety - Toxicity Detection
NVIDIA Llama Nemotron content safety models for toxicity detection and safe AI deployment
NVIDIA Clara Medical - Healthcare & Clinical NLP
NVIDIA Clara medical AI models for healthcare, clinical NLP, and medical imaging analysis
NVIDIA Clara Biology - Genomics & Protein AI
NVIDIA Clara biology models for genomics, protein structure, and computational biology research
NVIDIA Clara Molecular - Drug Discovery & Chemistry
NVIDIA Clara molecular models for drug discovery, molecular property prediction, and computational chemistry
NVIDIA Clara Medical - Clinical AI & Radiology
NVIDIA Clara medical AI for clinical NLP, radiology analysis, and healthcare decision support
Nvidia Nemotron RAG - Reranking
NVIDIA Nemotron reranking models for RAG pipelines, search result optimization, and document ranking
NVIDIA Nemotron Embeddings - RAG & Vector Search
NVIDIA Nemotron embedding models for RAG, semantic search, and vector database applications
-
nvidia/llama-nemotron-embed-vl-1b-v2
Feature Extraction • 2B • Updated • 1.98k • 14 -
nvidia/llama-nemotron-embed-1b-v2
Feature Extraction • 1B • Updated • 20.8k • 30 -
nvidia/llama-embed-nemotron-8b
Feature Extraction • 8B • Updated • 470k • 120 -
nvidia/NV-Embed-v2
Feature Extraction • 8B • Updated • 24.5k • 498
NVIDIA Alpamayo-R1 - Reasoning & Physical AI Models
NVIDIA Alpamayo-R1 reasoning model for complex problem solving, mathematical reasoning, and chain-of-thought
-
nvidia/Alpamayo-R1-10B
Robotics • 11B • Updated • 23.9k • 302 -
nvidia/PhysicalAI-Autonomous-Vehicles
Updated • 155k • 693 -
nvidia/PhysicalAI-Autonomous-Vehicles-NuRec
Updated • 9.51k • 110 -
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
Paper • 2511.00088 • Published • 3
DiT - Diffusion Transformer for Video & Audio Gen
Diffusion Transformer models for multimodal video and audio generation, synthesis, and editing
-
Lightricks/LTX-2
Image-to-Video • Updated • 1.36M • • 1.07k -
Masked Audio Generation using a Single Non-Autoregressive Transformer
Paper • 2401.04577 • Published • 44 -
LTX-2: Efficient Joint Audio-Visual Foundation Model
Paper • 2601.03233 • Published • 120 -
YOLO-World: Real-Time Open-Vocabulary Object Detection
Paper • 2401.17270 • Published • 43
NVIDIA Nemotron Speech - ASR & Text-to-Speech
NVIDIA Nemotron speech models for ASR, text-to-speech, and voice AI applications
-
nvidia/nemotron-speech-streaming-en-0.6b
Automatic Speech Recognition • Updated • 4.94k • 388 -
nvidia/parakeet-tdt-0.6b-v3
Automatic Speech Recognition • Updated • 71.6k • 552 -
nvidia/parakeet_realtime_eou_120m-v1
Updated • 588 • 106 -
nvidia/multitalker-parakeet-streaming-0.6b-v1
Automatic Speech Recognition • Updated • 624 • 63
NVIDIA Nemotron Cascade - Multi-Stage LLM Inference
NVIDIA Nemotron Cascade for multi-stage inference, model routing, and efficient LLM deployment
-
nvidia/Nemotron-Cascade-8B
Text Generation • 8B • Updated • 5.96k • 55 -
nvidia/Nemotron-Cascade-8B-Thinking
Text Generation • 8B • Updated • 1.11k • 34 -
nvidia/Nemotron-Cascade-14B-Thinking
Text Generation • 15B • Updated • 2.67k • 65 -
nvidia/Nemotron-Cascade-8B-Intermediate-ckpts
Text Generation • Updated • 10
OpenAI GPT-OSS - Steering Vectors & SAE Research
Open-source GPT models with steering vectors for controllable generation and behavior modification
Google Gemma 3 LiteRT - Mobile & Edge Optimized
Google Gemma 3 LiteRT models optimized for TensorFlow Lite runtime and mobile edge deployment
-
google/functiongemma-270m-it
Text Generation • 0.3B • Updated • 89.8k • 817 -
litert-community/embeddinggemma-300m
Sentence Similarity • Updated • 1.26k • 31 -
google/gemma-3n-E2B-it-litert-lm
Text Generation • Updated • 19.9k • 274 -
google/gemma-3n-E4B-it-litert-lm
Text Generation • Updated • 20.7k • 282
NVIDIA Cosmos Reason 2 - World Model Reasoning
NVIDIA Cosmos Reason 2 models for world model reasoning, physics simulation, and causal understanding
NVIDIA Cosmos Transfer 2.5 - Style & Domain Transfer
NVIDIA Cosmos Transfer 2.5 models for domain adaptation, style transfer, and video generation
NVIDIA Cosmos 2 - Cosmos-Predict 2.5
NVIDIA Cosmos Predict 2.5 models for world simulation, future frame prediction, and physical AI
Robotics - Foundation Models for Embodied AI
Robotics foundation models, datasets, and research for embodied AI, manipulation, and autonomous systems
Edge & Smartphone - On-Device Mobile AI Models
On-device AI models optimized for smartphone deployment: mobile LLMs, edge inference, and efficient architectures
NVIDIA NeMo Gym - RL Agent Training Datasets
NVIDIA Nemotron reinforcement learning datasets from NeMo Gym for agent training and RLHF
-
nvidia/Nemotron-RL-knowledge-web_search-mcqa
Viewer • Updated • 2.93k • 267 • 7 -
nvidia/Nemotron-RL-agent-workplace_assistant
Viewer • Updated • 1.8k • 239 • 12 -
nvidia/Nemotron-RL-instruction_following
Preview • Updated • 165 • 9 -
nvidia/Nemotron-RL-instruction_following-structured_outputs
Viewer • Updated • 9.95k • 207 • 25
NVIDIA Nemotron Safety - AI Alignment Datasets
NVIDIA Nemotron safety datasets for AI alignment, content moderation, and responsible AI training
-
nvidia/Nemotron-AIQ-Agentic-Safety-Dataset-1.0
Viewer • Updated • 10.8k • 1.26k • 10 -
nvidia/Nemotron-Content-Safety-Reasoning-Dataset
Preview • Updated • 72 • 5 -
nvidia/Aegis-AI-Content-Safety-Dataset-2.0
Viewer • Updated • 33.4k • 2.99k • 72 -
nvidia/Nemotron-Content-Safety-Audio-Dataset
Viewer • Updated • 1.93k • 1.34k • 3
NVIDIA Nemotron RAG Datasets - Retrieval Training
NVIDIA Nemotron RAG datasets for retrieval-augmented generation, document QA, and knowledge grounding
NVIDIA Nemotron VLM - Vision-Language Training Data
NVIDIA Nemotron vision-language datasets for multimodal training, image understanding, and VLM finetuning
Google Gemma 3N - Mobile Multimodal Edition
Google Gemma 3N mobile multimodal models for on-device vision-language tasks and efficient edge deployment
Deep Thinking - Extended Chain-of-Thought Reasoning
Deep thinking and reasoning models for extended chain-of-thought, deliberative alignment, and complex problem solving
Small Thinking - Compact Reasoning Models for Edge
Compact reasoning models for efficient chain-of-thought inference on resource-constrained devices
Small Coders - Lightweight Code Generation Models
Lightweight code generation models for edge deployment, IDE integration, and fast code completion
Self-Correcting Delta Transformer - Adaptive LLMs
Self-Correcting Delta Transformer - DDL provides the hardware mechanism (The Erazor); NL solves the software problem.
YOLO - Real-Time Object Detection Models
YOLO object detection models for real-time computer vision, autonomous systems, and video analytics
Meta SAM - Segment Anything Models (Image & Audio)
Meta SAM Segment Anything models for zero-shot image segmentation, object detection, and visual understanding
Affordable Coding APIs - Cost-Effective LLM Endpoints
Cost-effective coding API providers and affordable LLM endpoints for development and prototyping
Edge LLMs - Ultra-Compact High-Performance Models
Ultra-compact LLMs for edge deployment: sub-1B parameter models with strong performance for IoT and mobile
RLM - Neuro-Symbolic Architecture - Reasoning Traces
Inference Wrapper - Models: Root LLM (The Architect) + Python REPL (The Engine) + Sub LLMs (The Workers) spawned by queries
NVIDIA Nemotron Personas - Regional Character Data
NVIDIA Nemotron persona datasets for character AI, personality modeling, and conversational agent training
NVIDIA Nemotron Post-Training - RLHF & SFT Data
NVIDIA Nemotron post-training datasets for RLHF, instruction tuning, and alignment fine-tuning
NVIDIA Nemotron Reward - RLHF & Alignment Models
LLM as a judge
-
nvidia/Llama-3.3-Nemotron-70B-Reward-Principle
Text Generation • 71B • Updated • 94 • 6 -
nvidia/Qwen3-Nemotron-32B-GenRM-Principle
Text Generation • 33B • Updated • 312 • 11 -
nvidia/Qwen3-Nemotron-32B-RLBFF
Text Generation • 33B • Updated • 56 • 27 -
nvidia/Qwen3-Nemotron-8B-BRRM
Text Generation • Updated • 137 • 8
NVIDIA Nemotron Pre-Training - Foundation Model Data
NVIDIA Nemotron pre-training datasets for large language model training and foundation model development
Embeddings - Semantic Search & RAG Vector Models
Text and multimodal embedding models for semantic search, RAG pipelines, and vector similarity applications
Topological Transformer - DeepSeek
Manifold-Constrained Hyper-Connections
Edge Translation - On-Device Multilingual NLP
Edge-optimized translation models for on-device multilingual NLP and low-latency language translation
Deep Research - Autonomous AI Literature Review
Deep research AI agents and models for autonomous literature review, scientific reasoning, and knowledge synthesis
Qwen Long Reasoning - Extended Context CoT Models
Qwen long-context reasoning models for extended chain-of-thought, complex problem solving, and mathematical reasoning
PP-StructureV3 - Document Analysis & Table OCR
PaddlePaddle PP-StructureV3 for document analysis, table recognition, and intelligent document processing
OCR Models - Optical Character Recognition & Text Extraction
Optical Character Recognition models for text extraction, document digitization, and scene text detection
-
PaddlePaddle/PaddleOCR-VL
Image-Text-to-Text • 1.0B • Updated • 12.5k • 1.49k -
baidu/ERNIE-4.5-0.3B-Paddle
Text Generation • 0.4B • Updated • 76 • 19 -
baidu/ERNIE-4.5-21B-A3B-Paddle
Text Generation • 22B • Updated • 63 • 13 -
ibm-granite/granite-docling-258M
Image-Text-to-Text • 0.3B • Updated • 212k • 1.09k
Circuit Sparsity - Neural Network Interpretability
Circuit sparsity research for neural network pruning, mechanistic interpretability, and efficient model compression
IQuest LoopCoder - Iterative Code Generation Models
IQuest LoopCoder models for iterative code generation, self-refinement, and automated debugging
Text to Motion - Human Animation & Gesture AI
Text-to-motion generation models for human animation, gesture synthesis, and motion capture AI
ASR Models - Automatic Speech Recognition & Transcription
Automatic Speech Recognition models for transcription, voice AI, and multilingual speech-to-text
TTS - Text-to-Speech & Voice Synthesis Models
Text-to-Speech models for voice synthesis, neural TTS, and natural language audio generation
Mobile App AI - On-Device Agents & Function Calling
Mobile app engine models for on-device AI, app development automation, and mobile-first ML
Audio Segmenting - Meta SAM 3 Audio
Audio segmentation models based on Meta SAM architecture for sound separation and audio understanding
Hybrid Attention - Efficient Transformer Architectures
Hybrid attention models combining local and global attention for efficient long-context processing
NVIDIA Nemotron V3 - Post-Training Datasets
Hybrid Mamba/Transformer combination
Datasets Pretraining - Nemotron V3
Hybrid Mamba/Transformer combination
Open Source AI - Fully Open Weights & Training Data
Fully open-source AI models with permissive licenses for commercial use and research
Byte Level Models - Tokenizer-Free Language Models
Byte-level language models for tokenizer-free NLP, multilingual text, and raw byte processing
Image to 3D - Single-Image 3D Reconstruction
Image-to-3D generation models for single-image 3D reconstruction, mesh generation, and 3D asset creation
Video Analysis - Action Recognition & Understanding
Video analysis models for action recognition, temporal understanding, and video content classification
Deep Research Agents - Specialized Search & Reasoning
Specialized deep research models for domain-specific scientific reasoning and literature analysis
Diffusion LLMs - Non-Autoregressive Text Generation
Diffusion-based language models for text generation, discrete diffusion, and non-autoregressive NLP
IBM Granite - Enterprise AI & Code Generation
IBM Granite foundation models for enterprise AI, code generation, and multilingual NLP tasks
Video Generation - Text-to-Video & AI Synthesis
Video generation models for text-to-video, image-to-video, and AI video synthesis
Small OCR - Lightweight Text Recognition for Edge
Lightweight OCR models for edge deployment, mobile text recognition, and efficient document processing
Graphics AI - Visual Computing & Image Synthesis
Graphics and visual computing models for rendering, image synthesis, and computer graphics AI
Hierarchical RL - Multi-Level Decision Making
Hierarchical reinforcement learning models for multi-level decision making and complex task decomposition
Meta VL-JEPA - Vision-Language Prediction Models
Meta VL-JEPA Vision-Language Joint Embedding Predictive Architecture for video understanding
Bread & Butter - Top Production-Ready LLMs 2025
Top LLMs 2025: ZAI GLM-4.7 (358B) & Moonshot Kimi-K2-Thinking. Next-gen reasoning, code, multilingual. State-of-the-art performance. Production-ready.
Google Gemma Scope 2 - Neuronpedia
Google Gemma Scope 2: JumpReLU SAEs for Gemma 3 interpretability. 270M PT/IT, 1B PT variants. Neuronpedia integration. Mechanistic analysis.
Haddock Custom Sparse Autoencoders
Custom JumpReLU Sparse Autoencoders for mechanistic interpretability. T5Gemma-2 SAEs across all layers. AI safety & interpretability research.
Google Gemma - Quantized
Quantized Gemma 3 models: QAT for efficient deployment. Gemma-3-27B-IT Q4. Low memory, fast inference. Edge & production-ready LLMs.
NVIDIA NeMo Gym
NVIDIA Nemotron RL datasets for AI agent training. Web search, workplace tasks, instruction following, structured outputs. RLHF & alignment research.
-
nvidia/Nemotron-RL-knowledge-web_search-mcqa
Viewer • Updated • 2.93k • 267 • 7 -
nvidia/Nemotron-RL-agent-workplace_assistant
Viewer • Updated • 1.8k • 239 • 12 -
nvidia/Nemotron-RL-instruction_following
Preview • Updated • 165 • 9 -
nvidia/Nemotron-RL-instruction_following-structured_outputs
Viewer • Updated • 9.95k • 207 • 25
Reward Models
NVIDIA Nemotron reward models: 340B, 8B BRRM, 70B/32B principle-based. RLHF training, preference learning, AI alignment research.
Trained
Custom-trained models by mindchain: reward models & SAEs. Haddock Reward Mini (8B), function-specific SAEs. AI interpretability & alignment research.
Google T5 Gemma 2
T5Gemma-2 encoder-decoder models: 270M, 1B, 4B sizes. Text-to-text, summarization, translation. Google's architecture for structured generation.
-
google/t5gemma-2-270m-270m
Image-Text-to-Text • 0.8B • Updated • 20.3k • 168 -
google/t5gemma-2-4b-4b
Image-Text-to-Text • 9B • Updated • 11.8k • 135 -
google/t5gemma-2-1b-1b
Image-Text-to-Text • 2B • Updated • 14.3k • 67 -
T5Gemma 2: Seeing, Reading, and Understanding Longer
Paper • 2512.14856 • Published • 1
Autoencoders
JumpReLU SAEs for Gemma 3 interpretability. EleutherAI models, DeepSeek-R1, Pythia SAEs. Mechanistic interpretability & AI safety research.
Nvidia - Nemotron - Mamba/Transformer Hybrid
Hybrid Mamba + Transformer architectures. NVIDIA Nemotron-3-Nano-30B-A3B (32B). BF16 & GGUF. Efficient long-context & in-context learning.
-
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Text Generation • 32B • Updated • 325k • 572 -
bartowski/nvidia_Nemotron-3-Nano-30B-A3B-GGUF
Text Generation • 32B • Updated • 8.83k • 9 -
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
Text Generation • 32B • Updated • 25.3k • 90 -
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Text Generation • 32B • Updated • 797k • 248
Google FunctionGemma (Gemma 3)
Function calling: Google FunctionGemma-270m-IT & Mobile Actions dataset (9.65k). Efficient tool use in small LMs. AI agent development.