dinhanhx

AI & ML interests

Vision Language

Recent Activity

liked a Space 15 days ago

nielsr/sam-3-lite-text-vs-sam-3

SAM-3 vs SAM-3-LiteText

🖼

Compare text‑guided image segmentation with two SAM‑3 models

upvoted a paper 15 days ago

SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation

Paper • 2602.12173 • Published Feb 12 • 3

liked a model 17 days ago

ibm-granite/granite-embedding-311m-multilingual-r2

Feature Extraction • 0.3B • Updated 2 days ago • 24.1k • 85

liked 2 models about 1 month ago

nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16

Text Generation • Updated Mar 20 • 135k • 85

nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8

Text Generation • 4B • Updated Mar 20 • 32.6k • 24

liked a Space about 1 month ago

Nemotron 3 Nano WebGPU

⚛

A compact reasoning-capable model running in your browser.

upvoted an article about 1 month ago

Running Native PyTorch on TPUs with Zero Code Changes

Article • rishiraj • Feb 21 • 6

upvoted a paper about 1 month ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 164

upvoted an article about 2 months ago

Welcome Gemma 4: Frontier multimodal intelligence on device

Article • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 895

upvoted a collection about 2 months ago

jina-embeddings-v5-text

Collection

Our 5th-gen embeddings: two lightweight multilingual models with SOTA performance in retrieval, matching, clustering, and classification. • 29 items • Updated Feb 27 • 39

upvoted a collection 2 months ago

NVIDIA Nemotron v3

Collection

Open, Production-ready Enterprise Models • 18 items • Updated 1 day ago • 294

upvoted an article 3 months ago

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

Article • QuentinJG • Nov 5, 2025 • 64

liked 2 models 4 months ago

jinaai/jina-reranker-v2-base-multilingual

Text Ranking • 0.3B • Updated Oct 21, 2025 • 1.59M • 351

hantian/layoutreader

Token Classification • 0.4B • Updated Jun 8, 2025 • 163k • 43

upvoted a collection 4 months ago

Contextual AI Reranker v2

Collection

Family of instruction-following multilingual rerankers on the cost/performance Pareto frontier across public and customer benchmarks • 9 items • Updated 28 days ago • 11

upvoted a paper 4 months ago

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

Paper • 2202.13669 • Published Feb 28, 2022 • 3

liked 2 models 4 months ago

nvidia/nemotron-ocr-v1

Image-to-Text • Updated Apr 1 • 93 • 119

pytorch/gemma-3-12b-it-FP8

Image-Text-to-Text • Updated Oct 16, 2025 • 45.6k • 1

liked a dataset 4 months ago

52100303-TranPhuocSang/hoidap-tvpl-500k

Viewer • Updated Feb 3, 2025 • 502k • 15 • 1

upvoted an article 4 months ago

How We Built a Semantic Highlight Model To Save Token Cost for RAG

Article • zilliz • Jan 15 • 67
