ATLAS: Learning to Optimally Memorize the Context at Test Time Paper • 2505.23735 • Published May 29 • 22
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 19 days ago • 57
Discovering Preference Optimization Algorithms with and for Large Language Models Paper • 2406.08414 • Published Jun 12, 2024 • 16