nielsr (HF Staff) committed
Commit 2e37e71 · verified · 1 Parent(s): b33355b

Improve model card: update pipeline tag and add paper link


This PR updates the model card by:

- changing the `pipeline_tag` to `robotics`, ensuring people can find your model at https://huggingface.co/models?pipeline_tag=robotics (see the sketch below).
- adding a link to the paper's Hugging Face Papers page.
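For context, a minimal sketch of how the updated metadata would surface in the Hub listing. It assumes a recent `huggingface_hub` release in which `list_models` accepts `pipeline_tag` and `search` filters; it is not part of this change.

```python
# Hedged sketch: list models carrying the robotics pipeline tag and look for Magma-8B.
# Assumes a recent huggingface_hub release where list_models accepts `pipeline_tag`
# and `search`; the printed repo ids are whatever the Hub returns.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(pipeline_tag="robotics", search="Magma-8B", limit=5):
    print(model.id)
```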

Files changed (1)

README.md  +11 -11
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
-pipeline_tag: image-text-to-text
 license: mit
+pipeline_tag: robotics
 ---

 # Model Card for Magma-8B
@@ -180,7 +180,8 @@ image = image.convert("RGB")

 convs = [
     {"role": "system", "content": "You are agent that can see, talk and act."},
-    {"role": "user", "content": "<image_start><image><image_end>\nWhat is in this image?"},
+    {"role": "user", "content": "<image_start><image><image_end>
+What is in this image?"},
 ]
 prompt = processor.tokenizer.apply_chat_template(convs, tokenize=False, add_generation_prompt=True)
 inputs = processor(images=[image], texts=prompt, return_tensors="pt")
@@ -222,7 +223,7 @@ Our training data consists of:

 * Robotics Manipulation Data: [Open-X-Embodiment](https://robotics-transformer-x.github.io/).

-* UI Grounding Data: [SeeClick](https://github.com/njucckevin/SeeClick).
+* UI Grounding Data: [SeeClick](https://github.com/njucckevin/SeeClick).\

 * UI Navigation Data: [Mind2web](https://osu-nlp-group.github.io/Mind2Web/) and [AITW](https://github.com/google-research/google-research/tree/master/android_in_the_wild).

@@ -473,14 +474,13 @@ For the robotic manipulation task, some mitigation strategies to use for human s
 <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

 ```bibtex
-@misc{yang2025magmafoundationmodelmultimodal,
-      title={Magma: A Foundation Model for Multimodal AI Agents},
-      author={Jianwei Yang and Reuben Tan and Qianhui Wu and Ruijie Zheng and Baolin Peng and Yongyuan Liang and Yu Gu and Mu Cai and Seonghyeon Ye and Joel Jang and Yuquan Deng and Lars Liden and Jianfeng Gao},
-      year={2025},
-      eprint={2502.13130},
-      archivePrefix={arXiv},
-      primaryClass={cs.CV},
-      url={https://arxiv.org/abs/2502.13130},
+@misc{yang2025magmafoundationmodelmultimodal,\
+      title={Magma: A Foundation Model for Multimodal AI Agents}, \
+      author={Jianwei Yang and Reuben Tan and Qianhui Wu and Ruijie Zheng and Baolin Peng and Yongyuan Liang and Yu Gu and Mu Cai and Seonghyeon Ye and Joel Jang and Yuquan Deng and Lars Liden and Jianfeng Gao},\
+      year={2025},\
+      eprint={2502.13130},\
+      archivePrefix={arXiv},\
+      url={https://arxiv.org/abs/2502.13130}, \
 }
 ```
 <!-- {{ citation_bibtex | default("[More Information Needed]", true)}} -->
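For reference, a minimal end-to-end sketch of the snippet this diff edits. The repo id `microsoft/Magma-8B`, the `trust_remote_code=True` flag, the CUDA/bfloat16 setup, the extra batch dimensions, and the generation settings are assumptions taken on faith from the surrounding model card rather than from this diff, and the user turn is written with an explicit `\n` escape so the string stays a valid Python literal.

```python
# A minimal sketch, not the authoritative usage: everything outside the convs /
# apply_chat_template / processor(...) lines shown in the diff is an assumption.
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Magma-8B"  # assumed repo id
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.bfloat16
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Any RGB image works; this URL is a placeholder.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# The conversation from the edited snippet: image placeholder tokens, then the question.
convs = [
    {"role": "system", "content": "You are agent that can see, talk and act."},
    {"role": "user", "content": "<image_start><image><image_end>\nWhat is in this image?"},
]
prompt = processor.tokenizer.apply_chat_template(convs, tokenize=False, add_generation_prompt=True)
inputs = processor(images=[image], texts=prompt, return_tensors="pt")

# Assumed from the surrounding model card (not part of this diff): the processor
# output needs explicit batch dimensions before generation.
inputs["pixel_values"] = inputs["pixel_values"].unsqueeze(0)
inputs["image_sizes"] = inputs["image_sizes"].unsqueeze(0)
inputs = inputs.to("cuda").to(torch.bfloat16)

with torch.inference_mode():
    generate_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
generate_ids = generate_ids[:, inputs["input_ids"].shape[-1]:]
response = processor.decode(generate_ids[0], skip_special_tokens=True).strip()
print(response)
```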