Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team Paper • 2506.14234 • Published Jun 17 • 41
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging Paper • 2502.05664 • Published Feb 8 • 24
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models Paper • 2501.00316 • Published Dec 31, 2024 • 23