Dl

Dlbk

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

deepseek-ai/DeepSeek-V3.2-Exp

liked a model 2 months ago

xai-org/grok-2

liked a model 2 months ago

deepseek-ai/DeepSeek-V3.1-Base

upvoted 2 collections 3 months ago

NextStep-1

7 items • Updated Aug 18 • 27

AI Release Week Thread (21 July 2025)

9 items • Updated 5 days ago • 2

upvoted a collection 4 months ago

Seed-X

A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 8 items • Updated Aug 22 • 65

upvoted a collection 5 months ago

Gemma 3n Preview

4 items • Updated Jul 10 • 184

upvoted a collection 6 months ago

NextCoder

NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 6 items • Updated Jul 9 • 71

upvoted an article 6 months ago

Article

Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time

By … and 4 others • Feb 18 • 35

upvoted a collection 6 months ago

Qwen3

84 items • Updated Aug 6 • 1.38k

upvoted 2 collections 7 months ago

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated Jun 30 • 131

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated 1 day ago • 75

upvoted a collection 9 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 15 items • Updated Apr 18 • 240

upvoted 2 articles 9 months ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4 • 1.31k

Article

Open-R1: Update #1

By … and 7 others • Feb 2 • 305

upvoted a collection 9 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 544

upvoted a paper 9 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 66

upvoted 2 collections 10 months ago

OuteTTS

10 items • Updated Apr 7 • 17

OuteTTS 0.3

4 items • Updated Apr 7 • 17

upvoted a paper 10 months ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298

upvoted 2 collections 10 months ago

QwQ

Qwen with Questions • 6 items • Updated Jul 21 • 100

QVQ

QVQ: Qwen models for visual reasoning • 7 items • Updated Jul 21 • 52

upvoted a collection 11 months ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Jul 23 • 86