The Alignment Waltz: Jointly Training Agents to Collaborate for Safety • Paper • 2510.08240 • Published 23 days ago
mem-agent: Equipping LLM Agents with Memory Using RL • Article • By driaforall and 1 other • Published 23 days ago
Large Reasoning Models Learn Better Alignment from Flawed Thinking • Paper • 2510.00938 • Published about 1 month ago
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training • Paper • 2411.15124 • Published Nov 22, 2024
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization • Paper • 2411.10442 • Published Nov 15, 2024
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization • Paper • 2411.06208 • Published Nov 9, 2024
LOGO -- Long cOntext aliGnment via efficient preference Optimization • Paper • 2410.18533 • Published Oct 24, 2024
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback • Paper • 2410.19133 • Published Oct 24, 2024
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published Sep 19, 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey • Paper • 2409.02795 • Published Sep 4, 2024
From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate • Article • Published Jun 13, 2024
Building and better understanding vision-language models: insights and future directions • Paper • 2408.12637 • Published Aug 22, 2024
Course-Correction: Safety Alignment Using Synthetic Preferences • Paper • 2407.16637 • Published Jul 23, 2024
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23, 2024
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion • Paper • 2407.08642 • Published Jul 11, 2024