Update README.md (#2)
- Update README.md (6d1c08facc89a3700f747e7feed14574ca9fe2eb)
Co-authored-by: Jiwoong Sohn <[email protected]>
README.md
CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: text-generation
 ---
 
 # Med-PRM-Reward (Version 1.0)
-🚀 Med-PRM-Reward is among the first Process Reward Models (PRMs) specifically designed for the medical domain. Unlike conventional PRMs, it enhances its verification capabilities by integrating clinical knowledge through retrieval-augmented generation (RAG). Med-PRM-Reward demonstrates exceptional performance in scaling-test-time computation, particularly outperforming majority‐voting ensembles on complex medical reasoning tasks. Moreover, its scalability is not limited to Llama-3.1-8B-Instruct: it delivers similarly outstanding results in scaling-test-time computation across multiple other medical‐specialized models. Notably, when combined with llama-3-meerkat-8b-v1.0, it became the first
+🚀 Med-PRM-Reward is among the first Process Reward Models (PRMs) specifically designed for the medical domain. Unlike conventional PRMs, it enhances its verification capabilities by integrating clinical knowledge through retrieval-augmented generation (RAG). Med-PRM-Reward demonstrates exceptional performance in scaling-test-time computation, particularly outperforming majority‐voting ensembles on complex medical reasoning tasks. Moreover, its scalability is not limited to Llama-3.1-8B-Instruct: it delivers similarly outstanding results in scaling-test-time computation across multiple other medical‐specialized models. Notably, when combined with llama-3-meerkat-8b-v1.0, it became the first 8B model framework to surpass a score of 80 on the MedQA (4-option) benchmark.
 
 📄 Paper: [Med-PRM-Reward: Medical Reasoning Models with Stepwise, Guideline‑verified Process Rewards](https://huggingface.co/papers/2506.11474)
 
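The README text in this diff contrasts PRM-guided test-time scaling with majority voting. As a rough illustration of that contrast only, the sketch below selects a final answer from several sampled solutions in both ways. Everything in it is an assumption for illustration: the helper names `majority_vote` and `best_of_n`, the hard-coded step scores, and the choice of the minimum step score as the aggregate are hypothetical and are not taken from the Med-PRM-Reward code or paper.

```python
from collections import Counter

# Hypothetical sampled solutions: (final answer, per-step reward scores).
# In practice the scores would come from a process reward model; here they
# are made-up numbers so the example runs on its own.
candidates = [
    ("A", [0.9, 0.4, 0.3]),
    ("A", [0.8, 0.5, 0.2]),
    ("B", [0.9, 0.85, 0.9]),
]


def majority_vote(samples):
    """Return the most frequent final answer, ignoring step scores."""
    answers = [answer for answer, _ in samples]
    return Counter(answers).most_common(1)[0][0]


def best_of_n(samples):
    """Return the answer of the solution whose weakest step scores highest.

    Using min() as the aggregate is an assumption of this sketch; other
    aggregates (product, mean, last-step score) are equally plausible.
    """
    best_answer, _ = max(samples, key=lambda sample: min(sample[1]))
    return best_answer


vote_choice = majority_vote(candidates)
prm_choice = best_of_n(candidates)
print("majority vote:", vote_choice)
print("PRM best-of-N:", prm_choice)
```

With these made-up scores the two strategies disagree: the majority answer comes from two solutions that each contain a low-scoring step, while the reward-guided choice picks the single solution whose steps all score well.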