Patrick (Tsung-Han) Wu

tsunghanwu

https://patrickthwu.com/

AI & ML interests

Vision and Language

Recent Activity

liked a dataset 10 days ago

dynamic-lm/update-interrupt-benchmark

authored a paper 13 days ago

Are Large Reasoning Models Interruptible?

upvoted 3 papers 5 months ago

Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5 • 17

REOrdering Patches Improves Vision Models

Paper • 2505.23751 • Published May 29 • 15

Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint

Paper • 2505.23759 • Published May 29 • 5

upvoted 3 papers 6 months ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21 • 43

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 63

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published Apr 17 • 39

upvoted a paper 7 months ago

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published Mar 19 • 48

upvoted a collection 9 months ago

Visual Haystacks

Official datasets and checkpoints of the paper -- Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark (ICLR 2025) • 4 items • Updated Apr 18 • 2

upvoted a paper 11 months ago

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Paper • 2412.08687 • Published Dec 11, 2024 • 13

upvoted an article about 1 year ago

Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!

Article • Jul 23, 2024 • 3

upvoted a paper about 1 year ago

CLAIR-A: Leveraging Large Language Models to Judge Audio Captions

Paper • 2409.12962 • Published Sep 19, 2024 • 2