Article mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL • Sep 11 • 25
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 22 items • Updated 13 days ago • 119
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29 • 92
MatMulfree LM Collection Pre-trained models for MatMulfree LM. • 4 items • Updated Jun 10, 2024 • 26
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 64
Transformers meet Neural Algorithmic Reasoners Paper • 2406.09308 • Published Jun 13, 2024 • 44
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4, 2024 • 41
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU Paper • 2403.06504 • Published Mar 11, 2024 • 55
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12, 2024 • 77
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31, 2024 • 22
AppAgent: Multimodal Agents as Smartphone Users Paper • 2312.13771 • Published Dec 21, 2023 • 54
LongNet: Scaling Transformers to 1,000,000,000 Tokens Paper • 2307.02486 • Published Jul 5, 2023 • 81
LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models Paper • 2308.16137 • Published Aug 30, 2023 • 40