nielsr (HF Staff) committed
Commit d6db471 · verified · 1 Parent(s): 86d718e

Improve model card: Add GitHub link and more tags

This PR adds an explicit link to the GitHub repository in the model card content for easier navigation. It also enhances the model's metadata by adding relevant tags (`llava`, `reasoning`, `vqa`) to better describe its functionality and improve discoverability on the Hugging Face Hub.
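
For context, tag-based filtering on the Hub keys off exactly this card metadata, so the added `tags` make the model easier to find programmatically as well. A minimal sketch, assuming the `huggingface_hub` Python client (the tag values come from this PR; everything else is illustrative):

```python
from huggingface_hub import HfApi

api = HfApi()

# Models whose card metadata carries these tags (as added in this PR)
# appear in tag-filtered listings such as this one.
for model in api.list_models(filter=["llava", "vqa"], limit=10):
    print(model.id, model.pipeline_tag)
```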

Files changed (1): README.md (+12 -5)
README.md CHANGED

@@ -1,14 +1,19 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - meta-llama/Llama-3.2-11B-Vision-Instruct
 datasets:
 - Xkev/LLaVA-CoT-100k
-pipeline_tag: image-text-to-text
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: image-text-to-text
+tags:
+- llava
+- reasoning
+- vqa
 ---
+
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
@@ -24,6 +29,8 @@ The model was proposed in [LLaVA-CoT: Let Vision Language Models Reason Step-by-
 - **License:** apache-2.0
 - **Finetuned from model:** meta-llama/Llama-3.2-11B-Vision-Instruct
 
+**Code:** [https://github.com/PKU-YuanGroup/LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT)
+
 ## Benchmark Results
 
 | MMStar | MMBench | MMVet | MathVista | AI2D | Hallusion | Average |
@@ -95,5 +102,5 @@ Using the same setting should accurately reproduce our results.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
+The model may generate biased or offensive content, similar to other VLMs, due to limitations in the training data.
 Technically, the model's performance in aspects like instruction following still falls short of leading industry models.
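
For reference, the card's metadata (`library_name: transformers`, `pipeline_tag: image-text-to-text`, base model meta-llama/Llama-3.2-11B-Vision-Instruct) suggests the checkpoint loads through the usual Llama 3.2 Vision (Mllama) classes in transformers. A minimal sketch under that assumption; `<this-model-repo-id>` is a placeholder for this repository's Hub id and the image path is illustrative:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "<this-model-repo-id>"  # placeholder: use this repository's Hub id

# Assumes the finetune keeps the Mllama architecture of its base model,
# meta-llama/Llama-3.2-11B-Vision-Instruct.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local test image
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Answer step by step: what is shown in this image?"},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0], skip_special_tokens=True))
```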