Vaibhav Singh

veb-101

AI & ML interests

None yet

Recent Activity

upvoted an article 6 days ago

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

upvoted a paper 23 days ago

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

liked a model about 1 month ago

jinaai/jina-embeddings-v4-vllm-retrieval

Organizations

None yet

upvoted an article 6 days ago

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Article • and 2 others • 7 days ago • 52

upvoted a paper 23 days ago

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published 29 days ago • 39

upvoted a collection about 1 month ago

jina-embeddings-v4

Collection

Universal Embeddings for Multimodal Multilingual Retrieval • 10 items • Updated Sep 2 • 2

upvoted 2 papers about 1 month ago

Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15 • 27

Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation

Paper • 2509.10058 • Published Sep 12 • 11

upvoted a paper 2 months ago

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14 • 31

upvoted 2 papers 3 months ago

MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11 • 43

Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Paper • 2508.04825 • Published Aug 6 • 57

upvoted an article 3 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8 • 58

upvoted a paper 4 months ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

Paper • 2506.15681 • Published Jun 18 • 39

upvoted an article 4 months ago

🪆 Introduction to Matryoshka Embedding Models

Article • Feb 23, 2024 • 178

upvoted a paper 7 months ago

Personalize Anything for Free with Diffusion Transformer

Paper • 2503.12590 • Published Mar 16 • 44

upvoted a paper 9 months ago

The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published Feb 9 • 40

upvoted a paper 12 months ago

Scaling Properties of Diffusion Models for Perceptual Tasks

Paper • 2411.08034 • Published Nov 12, 2024 • 13

upvoted a collection 12 months ago

Cosmos-Tokenizer

Collection

A suite of image and video tokenizers • 13 items • Updated 9 days ago • 41

upvoted a paper about 1 year ago

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 41

upvoted an article about 1 year ago

Welcome FalconMamba: The first strong attention-free 7B model

Article • Aug 12, 2024 • 113

upvoted a collection over 1 year ago

MobileNetV4 pretrained weights

Collection

Weights for MobileNet-V4 pretrained in timm • 17 items • Updated Sep 19 • 20

upvoted 2 papers over 1 year ago

DiTFastAttn: Attention Compression for Diffusion Transformer Models

Paper • 2406.08552 • Published Jun 12, 2024 • 25

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28, 2024 • 12
