MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 192
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published 11 days ago • 131
A Family of Pretrained Transformer Language Models for Russian Paper • 2309.10931 • Published Sep 19, 2023 • 5
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark Paper • 2010.15925 • Published Oct 29, 2020
Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models Paper • 2202.07791 • Published Feb 15, 2022
Findings of the RuATD Shared Task 2022 on Artificial Text Detection in Russian Paper • 2206.01583 • Published Jun 3, 2022 • 1
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory Paper • 2210.05769 • Published Oct 11, 2022
MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models Paper • 2310.12426 • Published Oct 19, 2023 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 34
WikiOmnia: generative QA corpus on the whole Russian Wikipedia Paper • 2204.08009 • Published Apr 17, 2022