view article Article LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR By lightonai and 2 others • 3 days ago • 45
view article Article Understanding Gemma 3n: How MatFormer Gives You Many Models in One By rishiraj • Jun 26 • 48
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding Paper • 2506.16035 • Published Jun 19 • 88
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 34
view article Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 119
view article Article From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning By NormalUhr • Feb 4 • 16