Collections
Collections including paper arxiv:2409.12917
-
INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
Paper • 2306.04757 • Published • 6 -
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
Paper • 2308.01240 • Published • 2 -
Can Large Language Models Understand Real-World Complex Instructions?
Paper • 2309.09150 • Published • 2 -
Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection
Paper • 2308.10819 • Published
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 141 -
Recursive Introspection: Teaching Language Model Agents How to Self-Improve
Paper • 2407.18219 • Published • 3 -
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Paper • 2408.16293 • Published • 27 -
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models
Paper • 2409.04787 • Published • 1
-
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper • 2408.06195 • Published • 73 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 141 -
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 63 -
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Paper • 2405.06682 • Published • 3
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 141 -
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
Paper • 2409.18943 • Published • 29 -
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Paper • 2411.16594 • Published • 41 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38
-
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 141 -
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Paper • 2409.12191 • Published • 78 -
Expect the Unexpected: FailSafe Long Context QA for Finance
Paper • 2502.06329 • Published • 133 -
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 68