web

dim

dmitrymailk

AI & ML interests

dimweb, LM/LLM pronouns

Recent Activity

updated a model about 1 month ago

dim/dls_speech_2025

dim's activity

upvoted a paper 16 days ago

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published 27 days ago • 110

upvoted a paper 26 days ago

Optimal Scaling Needs Optimal Norm

Paper • 2510.03871 • Published 29 days ago • 28

upvoted a paper 3 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108

upvoted an article 3 months ago

Article

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Aug 8

• 76

upvoted 2 articles 4 months ago

Article

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Jul 9

• 699

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8

• 707

upvoted a paper 5 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 43

upvoted 2 papers 7 months ago

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published Mar 19 • 49

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published Apr 7 • 136

upvoted an article 8 months ago

Article

FastRTC: The Real-Time Communication Library for Python

Feb 25

• 172

upvoted a paper 9 months ago

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18 • 72

upvoted an article 9 months ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 884

upvoted 6 papers over 1 year ago

Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models

Paper • 2407.12327 • Published Jul 17, 2024 • 79

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 36

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20, 2024 • 21

nabla^2DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials

Paper • 2406.14347 • Published Jun 20, 2024 • 102

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15, 2024 • 70

BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack

Paper • 2406.10149 • Published Jun 14, 2024 • 52

upvoted an article over 1 year ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024

• 707

upvoted a collection about 2 years ago

Instruct datasets in Russian

Collection

All datasets have been translated using Google Translate • 14 items • Updated Mar 10 • 8
