BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published 24 days ago • 34
Article BigCodeArena: Judging code generations end to end with code executions By bigcode • 27 days ago • 16
⚔️ BigCodeArena Collection Unveiling More Reliable Human Preferences in Code Generation via Execution • 8 items • Updated 20 days ago • 4
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo Paper • 2508.18370 • Published Aug 25 • 3
Article Blazing-Fast Code Editing via Multi-Layer Speculation By ganler and 3 others • Feb 15 • 17
Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning Paper • 2410.03103 • Published Oct 4, 2024 • 9
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 40
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 48
🌸 BigCodeBench Collection Benchmarking Code Generation with Diverse Function Calls and Complex Instructions https://bigcode-bench.github.io/ • 8 items • Updated Nov 12, 2024 • 4
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18, 2024 • 52
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 42
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models Paper • 2401.00788 • Published Jan 1, 2024 • 23
Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names? Paper • 2309.07804 • Published Sep 14, 2023 • 2