VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation • arXiv:2510.14902 • Published Oct 2025
VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators • arXiv:2510.00406 • Published Oct 2025
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model • arXiv:2509.09372 • Published Sep 11, 2025
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality • arXiv:2405.21060 • Published May 31, 2024
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models • arXiv:2404.09204 • Published Apr 14, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length • arXiv:2404.08801 • Published Apr 12, 2024
BRAVE: Broadening the visual encoding of vision-language models • arXiv:2404.07204 • Published Apr 10, 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention • arXiv:2404.07143 • Published Apr 10, 2024
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models • arXiv:2404.04478 • Published Apr 6, 2024
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • arXiv:2404.02258 • Published Apr 2, 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction • arXiv:2404.02905 • Published Apr 3, 2024
GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot • arXiv:2403.13358 • Published Mar 20, 2024
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference • arXiv:2403.14520 • Published Mar 21, 2024