Improve model card: Add pipeline tag and library name (#1)

- Improve model card: Add pipeline tag and library name (5a1ca879fcdc3b9123421f457a7f74a6253201e6)

Co-authored-by: Niels Rogge <[email protected]>
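
The metadata added by this commit (`pipeline_tag` and `library_name`) lives in the `---` delimited front matter at the top of `README.md`. As a minimal sketch of how one might check that both keys are present, the snippet below parses flat `key: value` pairs using only the standard library. The helper name `read_front_matter` is hypothetical and not part of any library, and the parser assumes a flat front matter block like the one in this diff; a real YAML parser would be needed for nested or quoted values.

```python
def read_front_matter(readme_text: str) -> dict:
    """Parse flat `key: value` pairs from a leading '---' delimited block.

    Hypothetical helper for illustration only; it does not handle nested YAML.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the front matter is complete.
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    # No closing delimiter found: treat the text as having no front matter.
    return {}


# Front matter as it appears after this commit.
card = """---
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---
### UI-Venus
"""

metadata = read_front_matter(card)
missing = [key for key in ("pipeline_tag", "library_name") if key not in metadata]
print(metadata)
print("missing keys:", missing)
```

Running the same check against the pre-commit front matter (only `license: apache-2.0`) would report both keys as missing.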

README.md +6 -11

@@ -1,6 +1,9 @@
 ---
 license: apache-2.0
 ---
 ### UI-Venus
 This repository contains the UI-Venus model from the report [UI-Venus: Building High-performance UI Agents with RFT](https://arxiv.org/abs/2508.10833). UI-Venus is a native UI agent based on the Qwen2.5-VL multimodal large language model, designed to perform precise GUI element grounding and effective navigation using only screenshots as input. It achieves state-of-the-art performance through Reinforcement Fine-Tuning (RFT) with high-quality training data. More inference details and usage guides are available in the GitHub repository. We will continue to update results on standard benchmarks including Screenspot-v2/Pro and AndroidWorld.
@@ -34,12 +37,8 @@ Key innovations include:
 - **Efficient Data Cleaning**: Trained on several hundred thousand high-quality samples to ensure robustness.
 - **Self-Evolving Trajectory History Alignment & Sparse Action Enhancement**: Improves reasoning coherence and action distribution for better long-horizon planning.
 ---
-##  Installation
 First, install the required dependencies:
@@ -48,9 +47,7 @@ pip install transformers==4.49.0 qwen-vl-utils
 ```
 ---
-##  Quick Start
 ```python
 from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor
 from typing import Dict, Tuple, Any
@@ -227,7 +224,6 @@ This is the compressed package of validation trajectories for **AndroidWorld**,
 > **Table:** Performance comparison on **AndroidWorld** for end-to-end models. Our UI-Venus-Navi-72B achieves state-of-the-art performance, outperforming all baseline methods across different settings.
 ### Results on AndroidControl and GUI-Odyssey
 | Models | AndroidControl-Low<br>Type Acc. | AndroidControl-Low<br>Step SR | AndroidControl-High<br>Type Acc. | AndroidControl-High<br>Step SR | GUI-Odyssey<br>Type Acc. | GUI-Odyssey<br>Step SR |
@@ -253,7 +249,6 @@ This is the compressed package of validation trajectories for **AndroidWorld**,
 > **Table:** Performance comparison on offline UI navigation datasets including AndroidControl and GUI-Odyssey. Note that models with * are reproduced.
 # Citation
 Please consider citing if you find our work useful:
 ```plain
@@ -266,4 +261,4 @@ Please consider citing if you find our work useful:
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2508.10833},
 }
-```

 ---
 license: apache-2.0
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 ### UI-Venus
 This repository contains the UI-Venus model from the report [UI-Venus: Building High-performance UI Agents with RFT](https://arxiv.org/abs/2508.10833). UI-Venus is a native UI agent based on the Qwen2.5-VL multimodal large language model, designed to perform precise GUI element grounding and effective navigation using only screenshots as input. It achieves state-of-the-art performance through Reinforcement Fine-Tuning (RFT) with high-quality training data. More inference details and usage guides are available in the GitHub repository. We will continue to update results on standard benchmarks including Screenspot-v2/Pro and AndroidWorld.
 - **Efficient Data Cleaning**: Trained on several hundred thousand high-quality samples to ensure robustness.
 - **Self-Evolving Trajectory History Alignment & Sparse Action Enhancement**: Improves reasoning coherence and action distribution for better long-horizon planning.
 ---
+## Installation
 First, install the required dependencies:
 ```
 ---
+## Quick Start
 ```python
 from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor
 from typing import Dict, Tuple, Any
 > **Table:** Performance comparison on **AndroidWorld** for end-to-end models. Our UI-Venus-Navi-72B achieves state-of-the-art performance, outperforming all baseline methods across different settings.
 ### Results on AndroidControl and GUI-Odyssey
 | Models | AndroidControl-Low<br>Type Acc. | AndroidControl-Low<br>Step SR | AndroidControl-High<br>Type Acc. | AndroidControl-High<br>Step SR | GUI-Odyssey<br>Type Acc. | GUI-Odyssey<br>Step SR |
 > **Table:** Performance comparison on offline UI navigation datasets including AndroidControl and GUI-Odyssey. Note that models with * are reproduced.
 # Citation
 Please consider citing if you find our work useful:
 ```plain
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2508.10833},
 }
+```