Sean Ma

seanmamasde

Seanmamasde

AI & ML interests

None yet

Recent Activity

upvoted an article 4 days ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

liked a model 6 days ago

cisco-ai/cisco-time-series-model-1.0-preview

upvoted a collection 7 days ago

DeepSeek-V3.2


Organizations

None yet

upvoted an article 4 days ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 380

upvoted a collection 7 days ago

DeepSeek-V3.2

Collection • 4 items • Updated 15 days ago • 508

upvoted an article 11 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 12 days ago • 501

upvoted an article 14 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • 15 days ago • 240

upvoted a paper 18 days ago

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published 26 days ago • 106

upvoted a paper 19 days ago

GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published 29 days ago • 118

upvoted an article 20 days ago

Common AI Model Formats

Article • Feb 27

upvoted a paper 25 days ago

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published 27 days ago • 222

upvoted a paper about 1 month ago

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published Nov 4 • 101

upvoted an article about 1 month ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 737

upvoted a paper 2 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

upvoted an article 3 months ago

Smol2Operator: Post-Training GUI Agents for Computer Use

Article • Sep 23 • 133

upvoted 2 papers 3 months ago

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7 • 149

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 193

upvoted an article 3 months ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Article • Jul 29 • 203

upvoted 3 collections 4 months ago

upvoted 2 papers 4 months ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 97

DINOv3

Paper • 2508.10104 • Published Aug 13 • 285
