Tarka Embed V1 Collection Efficient DFKD embeddings for language understanding • 4 items • Updated 10 days ago • 5
Article LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR 30 days ago • 60
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv about 1 month ago • 129
Article Llama‑Embed‑Nemotron‑8B Text Embedding Model Ranks First on Multilingual MTEB Leaderboard Oct 21 • 13
Article Australian-made LLM beats OpenAI and Google at legal retrieval about 1 month ago • 25
Open Legal Data Collection A collection of our favorite open-source legal datasets on Hugging Face. • 2 items • Updated 22 days ago • 4
👁️ LFM2-VL Collection LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 9 items • Updated 23 days ago • 51
Article Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text Oct 20 • 33
Article Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face Oct 16 • 18
Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report Paper • 2510.14880 • Published Oct 16 • 17
Article Granite Embedding R2: Setting New Standards for Enterprise Retrieval Oct 14 • 15