AlphaResearch: Accelerating New Algorithm Discovery with Language Models
Abstract
AlphaResearch, an autonomous research agent, discovers new algorithms in open-ended problems with a dual research environment, achieving competitive performance against human researchers in a benchmark competition.
Large language models have made significant progress in complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch, an autonomous research agent designed to discover new algorithms on open-ended problems. To synergize the feasibility and innovation of the discovery process, we construct a novel dual research environment by combining the execution-based verify and simulated real-world peer review environment. AlphaResearch discovers new algorithm by iteratively running the following steps: (1) propose new ideas (2) verify the ideas in the dual research environment (3) optimize the research proposals for better performance. To promote a transparent evaluation process, we construct AlphaResearchComp, a new evaluation benchmark that includes an eight open-ended algorithmic problems competition, with each problem carefully curated and verified through executable pipelines, objective metrics, and reproducibility checks. AlphaResearch gets a 2/8 win rate in head-to-head comparison with human researchers, demonstrate the possibility of accelerating algorithm discovery with LLMs. Notably, the algorithm discovered by AlphaResearch on the ``packing circles'' problem achieves the best-of-known performance, surpassing the results of human researchers and strong baselines from recent work (e.g., AlphaEvolve). Additionally, we conduct a comprehensive analysis of the remaining challenges of the 6/8 failure cases, providing valuable insights for future research.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research (2025)
- RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback (2025)
- FML-bench: A Benchmark for Automatic ML Research Agents Highlighting the Importance of Exploration Breadth (2025)
- Idea2Plan: Exploring AI-Powered Research Planning (2025)
- TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs (2025)
- THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning (2025)
- Deep Ideation: Designing LLM Agents to Generate Novel Research Ideas on Scientific Concept Network (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper