- GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning
  Paper • 2510.14942 • Published • 2
- Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models
  Paper • 2508.15202 • Published • 4
- DeepCritic: Deliberate Critique with Large Language Models
  Paper • 2505.00662 • Published • 54
Collections
Collections including paper arxiv:2505.00662
- Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
  Paper • 2505.00234 • Published • 26
- DeepCritic: Deliberate Critique with Large Language Models
  Paper • 2505.00662 • Published • 54
- A Survey of Interactive Generative Video
  Paper • 2504.21853 • Published • 46
- OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
  Paper • 2506.02397 • Published • 35
- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 11.1k • 1.21k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 420 • 15
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 62
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 141
- VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
  Paper • 2504.05118 • Published • 26
- SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
  Paper • 2504.08600 • Published • 31
- A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
  Paper • 2504.11343 • Published • 19
- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 23
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 13
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- DeepCritic: Deliberate Critique with Large Language Models
  Paper • 2505.00662 • Published • 54
- A Survey of Interactive Generative Video
  Paper • 2504.21853 • Published • 46
- KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
  Paper • 2505.00497 • Published • 17
- Keysync Demo
  📈 Generate synchronized video from audio and video inputs • 33
- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 9
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 43
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 84
- How to Synthesize Text Data without Model Collapse?
  Paper • 2412.14689 • Published • 52
- SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
  Paper • 2412.12094 • Published • 11
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
  Paper • 2306.07691 • Published • 12
- iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
  Paper • 2203.02395 • Published • 1
- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 84
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25