MedGemma-4B-IT — RKLLM build for RK3588 boards
Author: @jamescallander
Source model: google/medgemma-4b-it · Hugging Face
This repository hosts a conversion of MedGemma-4B-IT for use on Rockchip RK3588 single-board computers (Orange Pi 5 Plus, Radxa Rock 5B+, Banana Pi M7, etc.). Conversion was performed using the RKNN-LLM toolkit.
Conversion details
RKLLM-Toolkit version: v1.2.1
Python: 3.10
Quantization: w8a8_g128
Output: single-file .rkllm artifact
Tokenizer: not required at runtime (UI handles prompt I/O)
⚠️ Safety disclaimer
🛑 This model is not a substitute for professional medical advice, diagnosis, or treatment.
It is intended for research, educational, and experimental purposes only.
Do not rely on its outputs for clinical or health-related decisions.
Always consult a qualified medical professional with any questions about a medical condition or treatment.
Use responsibly and in compliance with the HAI-DEF Terms of Use.
Intended use
On-device deployment of MedGemma-4B-IT, tuned for biomedical/clinical instruction tasks, on RK3588 SBCs.
Useful for research, summarization of biomedical texts, and controlled experimentation in edge AI scenarios.
Limitations
- Requires 5 GB of free memory.
- Outputs are not clinically validated and may contain inaccuracies.
- Quantized build (w8a8_g128) may show small quality differences vs. full-precision upstream.
- Tested on Radxa Rock 5B+; other devices may require different drivers/toolkit versions.
- Follow Google’s Gemma usage policies and AUP (see Google AI for Developers); additional domain restrictions may apply.
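The memory requirement above can be checked on the board before loading the model. The following is a minimal sketch, assuming a Linux system that exposes `MemAvailable` in `/proc/meminfo` and treating "5 GB" as 5 GiB; the function names are hypothetical and not part of the RKNN-LLM toolkit.

```python
def parse_mem_available_kib(meminfo_text):
    """Return the MemAvailable value in KiB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Lines look like: "MemAvailable:    6000000 kB"
            return int(line.split()[1])
    return None


def has_enough_memory(meminfo_text, required_gib=5.0):
    """True if available memory is at least required_gib (assumed threshold: 5 GiB)."""
    available_kib = parse_mem_available_kib(meminfo_text)
    if available_kib is None:
        return False
    return available_kib >= required_gib * 1024 * 1024


def check_this_device(required_gib=5.0):
    """Read /proc/meminfo on the current machine and apply the check."""
    with open("/proc/meminfo") as f:
        return has_enough_memory(f.read(), required_gib)
```

Calling `check_this_device()` on the board returns `True` or `False`. Available memory changes over time, so a passing check is only an indication, not a guarantee that the model will load.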
Quick start (RK3588)
1) Install runtime
The RKNN-LLM toolkit and its instructions are available from your development board manufacturer's website or from airockchip's GitHub page.
Download and install the required packages as per the toolkit's instructions.
2) Simple Flask server deployment
The simplest way to deploy the converted .rkllm model is to use the example script provided in the toolkit under rknn-llm/examples/rkllm_server_demo:
python3 <TOOLKIT_PATH>/rknn-llm/examples/rkllm_server_demo/flask_server.py \
--rkllm_model_path <MODEL_PATH>/MedGemma-4B-IT_w8a8_g128_rk3588.rkllm \
--target_platform rk3588
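Before sending requests, it can help to confirm that the server is accepting connections. The sketch below assumes the server listens on port 8080 (the port used in the curl example in this README); the helper name `is_server_reachable` is hypothetical. It only checks that a TCP connection can be opened, not that the model has finished loading.

```python
import socket


def is_server_reachable(host, port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False
```

For example, `is_server_reachable("<SERVER_IP_ADDRESS>")` can be polled in a loop until it returns `True`.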
3) Sending a request
A basic request message has the following format:
{
"model":"MedGemma-4B-IT",
"messages":[{
"role":"user",
"content":"<YOUR_PROMPT_HERE>"}],
"stream":false
}
Example request using curl:
curl -s -X POST <SERVER_IP_ADDRESS>:8080/rkllm_chat \
-H 'Content-Type: application/json' \
-d '{"model":"MedGemma-4B-IT","messages":[{"role":"user","content":"What is a retrovirus?"}],"stream":false}'
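The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the function names and the 300-second timeout are assumptions for illustration, not values from the toolkit.

```python
import json
import urllib.request


def build_chat_request(prompt, model="MedGemma-4B-IT", stream=False):
    """Build the request body in the format shown above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def send_chat_request(server_ip, prompt, timeout=300):
    """POST the prompt to the /rkllm_chat endpoint and return the decoded JSON reply.

    The timeout is an assumed value; generation on the NPU can take a while.
    """
    url = f"http://{server_ip}:8080/rkllm_chat"
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `send_chat_request("<SERVER_IP_ADDRESS>", "What is a retrovirus?")` returns the parsed response as a Python dictionary.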
The response is formatted in the following way:
{
"choices":[{
"finish_reason":"stop",
"index":0,
"logprobs":null,
"message":{
"content":"<MODEL_REPLY_HERE>",
"role":"assistant"}}],
"created":null,
"id":"rkllm_chat",
"object":"rkllm_chat",
"usage":{
"completion_tokens":null,
"prompt_tokens":null,
"total_tokens":null}
}
Example response:
{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"A retrovirus is a type of virus that uses an enzyme called reverse transcriptase to convert its RNA genome into DNA, which then integrates into the host cell's DNA. This integration allows the virus to replicate within the host cell for extended periods, often leading to chronic infections or even cancer. Here's a breakdown: * **RNA Genome:** Retroviruses have an RNA (ribonucleic acid) genome instead of DNA. * **Reverse Transcriptase:** This enzyme is unique to retroviruses and allows them to convert their RNA into DNA. It essentially acts as a molecular \"copy machine\" for the virus's genetic material. * **Integration:** The newly synthesized DNA integrates into the host cell's genome, becoming part of its DNA. This integration can be permanent or semi-permanent. * **Replication:** Once integrated, the viral DNA can be transcribed back into RNA, which is then used to produce new virus particles. These particles are released from the infected cell, infecting other cells. **Key Characteristics and Implications:** * **Chronic Infections:** Because retroviruses integrate their genetic material into the host's genome, they can cause chronic infections that persist for years or even a lifetime. * **Cancer:** Retroviral integration can disrupt normal cellular processes, leading to uncontrolled cell growth and cancer development. * **Examples:** Some well-known examples of retroviruses include HIV (Human Immunodeficiency Virus), which causes AIDS, and HTLV (Human T-lymphotropic virus), which can cause adult T-cell leukemia. In summary, a retrovirus is a unique type of virus that uses reverse transcriptase to integrate its RNA genome into the host cell's DNA, leading to chronic infections and potential cancer development. ","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
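To pull the model's reply out of a response with this shape, a small helper is enough. The sketch below assumes the response has already been decoded from JSON into a Python dictionary; `extract_reply` is a hypothetical name. It returns `None` rather than raising if the expected keys are missing.

```python
def extract_reply(response):
    """Return the assistant's text from a decoded /rkllm_chat response, or None."""
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")
```

Note that the `usage` fields are `null` in the responses shown here, so token counts should not be read from them.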
4) UI compatibility
This server exposes an OpenAI-compatible Chat Completions API.
You can connect it to any OpenAI-compatible client or UI (for example, Open WebUI).
- Configure your client with the API base: http://<SERVER_IP_ADDRESS>:8080 and use the endpoint: /rkllm_chat
- Make sure the model field matches the converted model’s name, for example:
{
"model": "MedGemma-4B-IT",
"messages": [{"role":"user","content":"Hello!"}],
"stream": false
}
License
This conversion follows the license of the source model:
Health AI Developer Foundations Terms of Use | Google for Developers.
Attribution: Built with MedGemma (Google Health AI Developer Foundations).
Required notice: see NOTICE
For more information on the deployment and use of .rkllm models on RK3588 platforms, please refer to the RKNN-LLM toolkit documentation.