The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 2.43k
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation Paper • 2406.07529 • Published Jun 11, 2024
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published 27 days ago • 216
Context Clues Collection Models from the paper Context Clues • 16 items • Updated 18 days ago • 6
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Jul 21 • 373