Kai Zuberbühler

kaizuberbuehler

k-zubi

AI & ML interests

language models, agents, image generation, music generation

Recent Activity

updated a collection about 1 month ago

Synthetic Data and Self-Improvement

updated a collection about 1 month ago

Reasoning, Thinking, RL and Test-Time Scaling

upvoted a paper about 1 month ago

Process Reward Models That Think

View all activity

Organizations

None yet

upvoted 20 papers about 1 month ago

Process Reward Models That Think

Paper • 2504.16828 • Published Apr 23 • 18

Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Paper • 2504.13816 • Published Apr 18 • 18

Generative AI Act II: Test Time Scaling Drives Cognition Engineering

Paper • 2504.13828 • Published Apr 18 • 18

LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs

Paper • 2504.14655 • Published Apr 20 • 20

Efficient Pretraining Length Scaling

Paper • 2504.14992 • Published Apr 21 • 20

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Paper • 2504.15785 • Published Apr 22 • 20

LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Paper • 2504.16078 • Published Apr 22 • 21

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Paper • 2504.15415 • Published Apr 21 • 22

QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining

Paper • 2504.16511 • Published Apr 23 • 22

EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

Paper • 2504.15133 • Published Apr 21 • 25

Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs

Paper • 2504.15280 • Published Apr 21 • 25

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

Paper • 2504.16891 • Published Apr 23 • 25

THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models

Paper • 2504.13367 • Published Apr 17 • 26

Could Thinking Multilingually Empower LLM Reasoning?

Paper • 2504.11833 • Published Apr 16 • 29

UFO2: The Desktop AgentOS

Paper • 2504.14603 • Published Apr 20 • 29

BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

Paper • 2504.14538 • Published Apr 20 • 30

Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

Paper • 2504.17207 • Published Apr 24 • 30

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15 • 34

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21 • 35

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Paper • 2504.16074 • Published Apr 22 • 36

Kai Zuberbühler

AI & ML interests

Recent Activity

Organizations

kaizuberbuehler's activity