-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 420 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
Paper • 2409.12576 • Published • 16 -
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper • 2408.04619 • Published • 172
Collections
Discover the best community collections!
Collections including paper arxiv:2409.12917
-
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 84 -
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
Paper • 2503.01688 • Published • 21 -
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper • 2503.00808 • Published • 56 -
Chain of Draft: Thinking Faster by Writing Less
Paper • 2502.18600 • Published • 49
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 144 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • 2409.02795 • Published • 72 -
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 92
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
Paper • 2502.09411 • Published • 22 -
GAIR/lima
Viewer • Updated • 1.33k • 585 • 445 -
HuggingFaceM4/FineVision
Viewer • Updated • 24.2M • 246k • 415
-
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Paper • 2405.06682 • Published • 3 -
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2 -
Rethinking Chain-of-Thought from the Perspective of Self-Training
Paper • 2412.10827 • Published -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 420 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
Paper • 2409.12576 • Published • 16 -
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper • 2408.04619 • Published • 172
-
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Paper • 2405.06682 • Published • 3 -
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2 -
Rethinking Chain-of-Thought from the Perspective of Self-Training
Paper • 2412.10827 • Published -
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 5
-
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 84 -
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
Paper • 2503.01688 • Published • 21 -
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper • 2503.00808 • Published • 56 -
Chain of Draft: Thinking Faster by Writing Less
Paper • 2502.18600 • Published • 49
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 144 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • 2409.02795 • Published • 72 -
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 92
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 140 -
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
Paper • 2502.09411 • Published • 22 -
GAIR/lima
Viewer • Updated • 1.33k • 585 • 445 -
HuggingFaceM4/FineVision
Viewer • Updated • 24.2M • 246k • 415