Ougrid Dumdang

Ougrid-D

ougrid

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 hours ago

ARGenSeg: Image Segmentation with Autoregressive Image Generation Model

Paper • 2510.20803 • Published 5 days ago • 8

upvoted a paper 1 day ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published 6 days ago • 24

upvoted a paper 11 days ago

RAG-Anything: All-in-One RAG Framework

Paper • 2510.12323 • Published 14 days ago • 42

upvoted a paper about 1 month ago

LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

Paper • 2509.12203 • Published Sep 15 • 19

liked a model 2 months ago

loolootech/no-name-ner-th

Token Classification • 0.3B • Updated Aug 20 • 31 • 5

upvoted 2 papers 2 months ago

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 236

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 254

liked a Space 2 months ago

Chat with Kimi-VL-A3B-Thinking-2506

Space • Chat with images, videos, or PDFs to generate text • 172

upvoted 2 papers 2 months ago

A Survey on Diffusion Language Models

Paper • 2508.10875 • Published Aug 14 • 34

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Paper • 2508.05954 • Published Aug 8 • 6

liked 2 models 3 months ago

kpsss34/Stable-Diffusion-3.5-Small-Preview1

Text-to-Image • Updated Aug 13 • 972 • 38

Qwen/Qwen3-4B-Thinking-2507

Text Generation • 4B • Updated Aug 6 • 299k • 437

upvoted an article 3 months ago

Welcome GPT OSS, the new open-source model family from OpenAI!

Article • Aug 5 • 503

liked a model 3 months ago

Qwen/Qwen-Image

Text-to-Image • Updated Aug 18 • 195k • 2.15k

upvoted 5 articles 3 months ago

TimeScope: How Long Can Your Video Large Multimodal Model Go?

Article • Jul 23 • 46

Five Big Improvements to Gradio MCP Servers

Article • Jul 17 • 24

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Article • Jul 9 • 697

Asynchronous Robot Inference: Decoupling Action Prediction and Execution

Article • Jul 10 • 43

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • Jun 3 • 268

upvoted a paper 3 months ago

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models

Paper • 2507.07104 • Published Jul 9 • 45
