AnIdealRing

SmartDazi

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

Paper • 2606.12397 • Published 15 days ago • 87

upvoted a paper 15 days ago

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Paper • 2606.09730 • Published 17 days ago • 52

upvoted a paper 20 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 22 days ago • 25

upvoted a paper 28 days ago

LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?

Paper • 2605.28721 • Published 29 days ago • 17

upvoted 2 papers 3 months ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 141

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published Mar 15 • 23

liked a dataset 3 months ago

LulaCola/AgentProcessBench

Viewer • Updated Mar 18 • 1k • 227 • 15

upvoted 3 papers 4 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

Paper • 2602.12125 • Published Feb 12 • 68

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Paper • 2602.04649 • Published Feb 4 • 13

upvoted 3 papers 5 months ago

updated 2 models 5 months ago

openbmb/AgentCPM-Explore-GGUF

4B • Updated Jan 17 • 288 • 27

openbmb/AgentCPM-Explore

Text Generation • 4B • Updated Jan 18 • 499 • 415

upvoted a paper 6 months ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published Dec 11, 2025 • 47

upvoted a paper 7 months ago

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 80

upvoted a paper 8 months ago

LaSeR: Reinforcement Learning with Last-Token Self-Rewarding

Paper • 2510.14943 • Published Oct 16, 2025 • 40

upvoted 2 papers 11 months ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7, 2025 • 133

WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models

Paper • 2411.05451 • Published Nov 8, 2024 • 2