Julius

acjulius

Recent Activity

commented on a paper 21 days ago

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published 26 days ago • 113

New activity in mvp-lab/LLaVA-OneVision-1.5-Instruct-Data 22 days ago

some image is None

#5 opened 22 days ago by acjulius

Subset shows no pictures but there is "<image>" in the dialogue.

#3 opened 30 days ago by kira66arik

upvoted a paper 6 months ago

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24 • 39

upvoted 5 papers 8 months ago

MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning

Paper • 2502.19634 • Published Feb 26 • 63

upvoted 5 papers 11 months ago

Towards Universal Soccer Video Understanding

Paper • 2412.01820 • Published Dec 2, 2024 • 13

MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities

Paper • 2412.04106 • Published Dec 4, 2024 • 6

Efficient Track Anything

Paper • 2411.18933 • Published Nov 28, 2024 • 17

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Paper • 2411.15296 • Published Nov 22, 2024 • 21

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20, 2024 • 21

commented on a paper 12 months ago

RetrieveGPT: Merging Prompts and Mathematical Models for Enhanced Code-Mixed Information Retrieval

Paper • 2411.04752 • Published Nov 7, 2024 • 17

upvoted a paper about 1 year ago

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Paper • 2408.11813 • Published Aug 21, 2024 • 12

upvoted 2 papers over 1 year ago

EVLM: An Efficient Vision-Language Model for Visual Understanding

Paper • 2407.14177 • Published Jul 19, 2024 • 44

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 72

upvoted a collection over 1 year ago

InternVL2.0

Collection

Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated 30 days ago • 89

upvoted a paper over 1 year ago

What Matters in Detecting AI-Generated Videos like Sora?

Paper • 2406.19568 • Published Jun 27, 2024 • 16