Latent Reasoning in LLMs as a Vocabulary-Space Superposition Paper • 2510.15522 • Published 10 days ago • 1
Interpreting Language Models Through Concept Descriptions: A Survey Paper • 2510.01048 • Published 25 days ago • 2
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? Paper • 2507.08802 • Published Jul 11 • 1
Article There is no such thing as a tokenizer-free lunch By catherinearnett • Sep 25 • 84
RelP: Faithful and Efficient Circuit Discovery via Relevance Patching Paper • 2508.21258 • Published Aug 28 • 3
Article Exploring Environments Hub: Your Language Model needs better (open) environments to learn By anakin87 • Sep 4 • 27
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated 25 days ago • 292
CRISP: Persistent Concept Unlearning via Sparse Autoencoders Paper • 2508.13650 • Published Aug 19 • 15
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors Paper • 2505.11770 • Published May 17 • 2
Persona Vectors: Monitoring and Controlling Character Traits in Language Models Paper • 2507.21509 • Published Jul 29 • 32
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning Paper • 2507.16795 • Published Jul 22 • 2
Monet: Mixture of Monosemantic Experts for Transformers Paper • 2412.04139 • Published Dec 5, 2024 • 14
🥨 Bavarian NLP Papers Collection Awesome papers about Bavarian NLP • 11 items • Updated 17 days ago • 2
Article Bringing Fusion Down to Earth: ML for Stellarator Optimization By cgeorgiaw • Jul 2 • 74