Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published 3 days ago • 12
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 14 days ago • 85
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 18 days ago • 168
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 76
Efficient LLM Pretraining: Packed Sequences and Masked Attention Article • By sirluk • Published Oct 7, 2024 • 55
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 34
Model Merging in Pre-training of Large Language Models Paper • 2505.12082 • Published May 17 • 40