Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning Paper • 2310.00647 • Published Oct 1, 2023
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 147
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards Paper • 2306.04488 • Published Jun 7, 2023 • 2
Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs Paper • 2405.16700 • Published May 26, 2024
Scaling Laws for Native Multimodal Models Paper • 2504.07951 • Published Apr 10 • 30
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published Nov 21, 2024 • 47
Look where you look! Saliency-guided Q-networks for generalization in visual Reinforcement Learning Paper • 2209.09203 • Published Sep 16, 2022
Solving robust MDPs as a sequence of static RL problems Paper • 2410.06212 • Published Oct 8, 2024
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 133
What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 103
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset Paper • 2403.09029 • Published Mar 14, 2024 • 56
eP-ALM: Efficient Perceptual Augmentation of Language Models Paper • 2303.11403 • Published Mar 20, 2023 • 3