HaluMem: Evaluating Hallucinations in Memory Systems of Agents Paper • 2511.03506 • Published 24 days ago • 91 • 3
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving Paper • 2505.04528 • Published May 7 • 12
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 85 • 2
Grimoire is All You Need for Enhancing Large Language Models Paper • 2401.03385 • Published Jan 7, 2024 • 5
xFinder: Robust and Pinpoint Answer Extraction for Large Language Models Paper • 2405.11874 • Published May 20, 2024 • 7
HRDE: Retrieval-Augmented Large Language Models for Chinese Health Rumor Detection and Explainability Paper • 2407.00668 • Published Jun 30, 2024 • 3
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10 • 68
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 100