Improve model card: Add pipeline tag, library name, abstract, and detailed overview

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +40 -2
README.md CHANGED
@@ -1,15 +1,49 @@
  ---
- license: mit
  base_model:
  - Qwen/Qwen2.5-VL-7B-Instruct
  language:
  - en
+ license: mit
+ library_name: transformers
+ pipeline_tag: image-text-to-text
  ---

  # QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

  This repository contains the model weights for QoQ-Med-VL-7B (Qwen Omni-Reasoning on Medical Questions), a multimodal clinical foundation model with reasoning capabilities.

+ 📚 [Paper](https://huggingface.co/papers/2506.00711) | 💻 [Code](https://github.com/DDVD233/QoQ_Med)
+
+ ## Abstract
+ Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating performance imbalance caused by skewed clinical data distributions. Trained on 2.61 million instruction tuning pairs spanning 9 clinical domains, we show that DRPO training boosts diagnostic performance by 43% in macro-F1 on average across all visual domains as compared to other critic-free training methods like GRPO. Furthermore, with QoQ-Med trained on intensive segmentation data, it is able to highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while reaching the performance of OpenAI o4-mini. To foster reproducibility and downstream research, we release (i) the full model weights, (ii) the modular training pipeline, and (iii) all intermediate reasoning traces.
+
+ ## Model Overview
+ ![QoQ-Med Model Overview](https://github.com/DDVD233/QoQ_Med/blob/main/images/model_training.jpeg?raw=true)
+
+ QoQ-Med is the first open generalist clinical foundation model that jointly reasons across:
+ - Medical images (2D/3D)
+ - Time-series signals (ECG)
+ - Text reports
+
+ The model is trained with our novel Domain-aware Relative Policy Optimization (DRPO), a reinforcement learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, addressing performance imbalance in heterogeneous clinical data.
+
+ ## Key Features
+
+ - **Multimodal Integration**: Processes and reasons across 1D, 2D, and 3D clinical data
+ - **Domain-Aware Training**: DRPO balances learning across 9 clinical domains
+ - **Enhanced Interpretability**: Generates reasoning traces and highlights salient regions
+ - **State-of-the-Art Performance**: Outperforms existing open-source clinical MLLMs
+
+ ## Clinical Domains
+
+ QoQ-Med spans multiple clinical specialties:
+ - Cardiology (ECG, Chest X-ray)
+ - Radiology (CT, MRI, Ultrasound)
+ - Dermatology
+ - Ophthalmology (Fundus)
+ - Pathology
+ - Mammography
+
  ## Model Weights

  | Model | Weights | Avg. Val Accuracy |
@@ -164,4 +198,8 @@ If you find the project useful, please cite the following papers:
  journal={arXiv preprint arXiv:2506.00711},
  year={2025}
  }
- ```
+ ```
+
+ ## Important Note
+
+ This model is intended for research purposes only. It is not suitable for clinical deployment until it has undergone extensive real-world testing, such as human trials. This is a research preview, not a product approved by federal agencies.
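
The Model Overview text added in this PR describes DRPO as scaling normalized rewards according to domain rarity and modality difficulty. As a rough illustration of that idea only, the sketch below normalizes rewards within each sampled group (the GRPO-style step) and then multiplies the result by a per-domain weight that grows for rarer and harder domains. It is a single-level simplification and does not model the hierarchical scaling the card mentions. The weighting formula, the function names (`group_normalize`, `domain_weights`, `scaled_advantages`), and the toy rewards are all assumptions made for this example; they are not taken from the paper or from the QoQ_Med code.

```python
import math
from collections import Counter, defaultdict


def group_normalize(rewards):
    """GRPO-style step: zero-mean, unit-variance rewards within one group."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    if std == 0.0:
        # All samples scored the same, so there is no learning signal.
        return [0.0] * len(rewards)
    return [(r - mean) / std for r in rewards]


def domain_weights(groups):
    """Hypothetical per-domain scale: larger for rare and for hard domains.

    Assumes rewards lie in [0, 1], so (1 - mean reward) acts as difficulty.
    """
    counts = Counter(domain for domain, _ in groups)
    pooled = defaultdict(list)
    for domain, rewards in groups:
        pooled[domain].extend(rewards)

    total = sum(counts.values())
    weights = {}
    for domain, n in counts.items():
        rarity = math.sqrt(total / (len(counts) * n))  # 1.0 when balanced
        difficulty = 1.0 - sum(pooled[domain]) / len(pooled[domain])
        weights[domain] = rarity * (1.0 + difficulty)
    return weights


def scaled_advantages(groups):
    """Normalize each group, then scale it by its domain's weight."""
    weights = domain_weights(groups)
    return [
        (domain, [weights[domain] * a for a in group_normalize(rewards)])
        for domain, rewards in groups
    ]


# Toy batch (made-up rewards): three chest X-ray prompts, one ECG prompt.
groups = [
    ("chest_xray", [1.0, 0.0, 1.0, 1.0]),
    ("chest_xray", [1.0, 1.0, 0.0, 1.0]),
    ("chest_xray", [0.0, 1.0, 1.0, 1.0]),
    ("ecg", [0.0, 0.0, 1.0, 0.0]),
]

for domain, advantages in scaled_advantages(groups):
    print(domain, [round(a, 2) for a in advantages])
```

In this toy batch the ECG group is both rarer and lower-scoring than the chest X-ray groups, so its advantages are scaled up relative to theirs. That is the balancing effect the card attributes to DRPO, although the actual objective may weight domains quite differently.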