Sdeerk

AI & ML interests

None yet

Recent Activity

liked a Space 13 days ago

HuggingFaceFW/blogpost-fineweb-v1

liked a model 14 days ago

PaddlePaddle/PaddleOCR-VL

liked a model about 1 month ago

baidu/ERNIE-4.5-21B-A3B-Thinking

Organizations

liked a Space 13 days ago

1.12k

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality text data for LLMs using FineWeb

liked a model 14 days ago

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated about 18 hours ago • 24.6k • 1.18k

liked a model about 1 month ago

baidu/ERNIE-4.5-21B-A3B-Thinking

Text Generation • 22B • Updated 12 days ago • 978 • 759

upvoted an article 2 months ago

Article

Vision Language Models (Better, Faster, Stronger)

May 12

• 557

liked 2 datasets 3 months ago

Jofthomas/hermes-function-calling-thinking-V1

Viewer • Updated Feb 16 • 3.57k • 659 • 69

NousResearch/hermes-function-calling-v1

Viewer • Updated Aug 30, 2024 • 11.6k • 1.62k • 347

upvoted a paper 3 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 306

liked a Space 4 months ago

Awesome O1 R1

💻

[Keep updating]Collect everything about o1 and r1!

upvoted 2 articles 4 months ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 947

Article

Vision Language Models Explained

Apr 11, 2024

• 479

updated a model 4 months ago

baidu/ERNIE-4.5-21B-A3B-Base-Paddle

Text Generation • 22B • Updated Aug 20 • 68 • 10

liked a Space 4 months ago

3.37k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a dataset 4 months ago

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 435k • 925

upvoted a collection 4 months ago

ERNIE 4.5

Collection

collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174

updated a model 4 months ago

baidu/ERNIE-4.5-21B-A3B-Paddle

Text Generation • 22B • Updated Sep 9 • 73 • 12

liked a dataset 5 months ago

K-and-K/knights-and-knaves

Viewer • Updated Oct 31, 2024 • 6.9k • 990 • 34

upvoted 3 articles 7 months ago

Article

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

Aug 22, 2022

• 9

Article

How to generate text: using different decoding methods for language generation with Transformers

Mar 1, 2020

• 253

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

May 24, 2023

• 168
