Xing Yun

xing0047

AI & ML interests

Computer Vision

Recent Activity

updated a dataset 8 days ago

xing0047/lvr_data

published a dataset 8 days ago

xing0047/lvr_data

upvoted a paper 16 days ago

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

Paper • 2605.00814 • Published 22 days ago • 21

upvoted a paper 30 days ago

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published about 1 month ago • 240

upvoted a paper about 2 months ago

ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models

Paper • 2603.19466 • Published Mar 19 • 41

upvoted a paper 3 months ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 523

upvoted 10 papers 4 months ago

SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published Jan 22 • 43

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Paper • 2601.14724 • Published Jan 21 • 75

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 204

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Paper • 2601.02427 • Published Jan 4 • 46

Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model

Paper • 2510.12276 • Published Oct 14, 2025 • 149

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Paper • 2510.05034 • Published Oct 6, 2025 • 51

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 132

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 137

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 162

Adaptation of Agentic AI

Paper • 2512.16301 • Published Dec 18, 2025 • 108

upvoted a paper 5 months ago

Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation

Paper • 2512.13495 • Published Dec 15, 2025 • 11

upvoted 3 papers 6 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published Nov 12, 2025 • 71

DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published Nov 7, 2025 • 46

upvoted a paper 7 months ago

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 115

upvoted a paper 8 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 514