Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2504.03553

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7 • 26
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published Apr 7 • 42
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published Apr 4 • 18
Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published Apr 3 • 16

Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
Benchmarking LLMs' Swarm intelligence

Paper • 2505.04364 • Published May 7 • 20
Multi-Agent System for Comprehensive Soccer Understanding

Paper • 2505.03735 • Published May 6 • 25
LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13 • 17
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Paper • 2503.10630 • Published Mar 13 • 6
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10 • 88

收集的感兴趣的AI

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20 • 192
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 104
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 91
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published Feb 20 • 29

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10

LangModels-Advances-2025

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models

Paper • 2504.04823 • Published Apr 7 • 31
Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 200

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 141

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 84
When an LLM is apprehensive about its answers -- and when its uncertainty is justified

Paper • 2503.01688 • Published Mar 3 • 21
Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 56
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 49

zjunlp/KnowSelf-Llama3.1-8B-ALFWorld

8B • Updated Feb 22 • 1
zjunlp/KnowSelf-Llama3.1-8B-WebShop

8B • Updated Feb 22 • 1
zjunlp/KnowSelf-Gemma2-2B-ALFWorld

3B • Updated Feb 22
zjunlp/KnowSelf-Gemma2-2B-WebShop

3B • Updated Apr 4 • 3

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7 • 26
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published Apr 7 • 42
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published Apr 4 • 18
Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published Apr 3 • 16

LangModels-Advances-2025

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models

Paper • 2504.04823 • Published Apr 7 • 31
Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 200

Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
Benchmarking LLMs' Swarm intelligence

Paper • 2505.04364 • Published May 7 • 20
Multi-Agent System for Comprehensive Soccer Understanding

Paper • 2505.03735 • Published May 6 • 25
LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 141

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13 • 17
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Paper • 2503.10630 • Published Mar 13 • 6
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10 • 88

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 84
When an LLM is apprehensive about its answers -- and when its uncertainty is justified

Paper • 2503.01688 • Published Mar 3 • 21
Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 56
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 49

收集的感兴趣的AI

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20 • 192
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 104
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 91
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published Feb 20 • 29

zjunlp/KnowSelf-Llama3.1-8B-ALFWorld

8B • Updated Feb 22 • 1
zjunlp/KnowSelf-Llama3.1-8B-WebShop

8B • Updated Feb 22 • 1
zjunlp/KnowSelf-Gemma2-2B-ALFWorld

3B • Updated Feb 22
zjunlp/KnowSelf-Gemma2-2B-WebShop

3B • Updated Apr 4 • 3

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs