---
library_name: rkllm
pipeline_tag: text-generation
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
language:
- en
base_model:
- google/medgemma-4b-it
tags:
- rkllm
- rk3588
- rockchip
- edge-ai
- medical
- pathology
- physiology
---

# MedGemma-4B-IT — RKLLM build for RK3588 boards

**Author:** @jamescallander

**Source model:** [google/medgemma-4b-it · Hugging Face](https://huggingface.co/google/medgemma-4b-it)

> This repository hosts a **conversion** of `MedGemma-4B-IT` for use on Rockchip RK3588 single-board computers (Orange Pi 5 Plus, Radxa Rock 5B+, Banana Pi M7, etc.). Conversion was performed using the [RKNN-LLM toolkit](https://github.com/airockchip/rknn-llm?utm_source=chatgpt.com).

#### Conversion details

- RKLLM-Toolkit version: v1.2.1
- Python: 3.10
- Quantization: `w8a8_g128`
- Output: single-file `.rkllm` artifact
- Tokenizer: not required at runtime (UI handles prompt I/O)

## ⚠️ Safety disclaimer

🛑 **This model is not a substitute for professional medical advice, diagnosis, or treatment.**

- It is intended for **research, educational, and experimental purposes only**.
- Do **not** rely on its outputs for clinical or health-related decisions.
- Always consult a qualified medical professional with any questions about a medical condition or treatment.
- Use responsibly and in compliance with the [HAI-DEF Terms of Use](https://developers.google.com/health-ai-developer-foundations/terms?utm_source=chatgpt.com).

## Intended use

- On-device deployment of **MedGemma-4B-IT**, tuned for biomedical/clinical instruction tasks, on RK3588 SBCs.
- Useful for research, summarization of biomedical texts, and controlled experimentation in edge AI scenarios.

## Limitations

- Requires 5 GB of free memory.
- Outputs are **not clinically validated** and may contain inaccuracies.
- Quantized build (`w8a8_g128`) may show small quality differences vs. full-precision upstream.
- Tested on Radxa Rock 5B+; other devices may require different drivers/toolkit versions.
- Follow Google’s Gemma usage policies and AUP; additional domain restrictions may apply. [Google AI for Developers](https://ai.google.dev/gemma/terms?utm_source=chatgpt.com)

## Quick start (RK3588)

### 1) Install runtime

The RKNN-LLM toolkit and its instructions are available from your board manufacturer's website or from [airockchip's GitHub page](https://github.com/airockchip).

Download and install the required packages as per the toolkit's instructions.

### 2) Simple Flask server deployment

The simplest way to deploy the converted `.rkllm` model is to use the example script provided in the toolkit in this directory: `rknn-llm/examples/rkllm_server_demo`

```bash
python3 /rknn-llm/examples/rkllm_server_demo/flask_server.py \
  --rkllm_model_path /MedGemma-4B-IT_w8a8_g128_rk3588.rkllm \
  --target_platform rk3588
```

### 3) Sending a request

A basic format for a message request is:

```json
{
  "model":"MedGemma-4B-IT",
  "messages":[{
    "role":"user",
    "content":""}],
  "stream":false
}
```

Example request using `curl` (put your board's address before `:8080`):

```bash
curl -s -X POST :8080/rkllm_chat \
  -H 'Content-Type: application/json' \
  -d '{"model":"MedGemma-4B-IT","messages":[{"role":"user","content":"What is a retrovirus?"}],"stream":false}'
```

The response is formatted in the following way:

```json
{
  "choices":[{
    "finish_reason":"stop",
    "index":0,
    "logprobs":null,
    "message":{
      "content":"",
      "role":"assistant"}}],
  "created":null,
  "id":"rkllm_chat",
  "object":"rkllm_chat",
  "usage":{
    "completion_tokens":null,
    "prompt_tokens":null,
    "total_tokens":null}
}
```

Example response:

```json
{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"A retrovirus is a type of virus that uses an enzyme called reverse transcriptase to convert its RNA genome into DNA, which then integrates into the host cell's DNA.
This integration allows the virus to replicate within the host cell for extended periods, often leading to chronic infections or even cancer. Here's a breakdown: * **RNA Genome:** Retroviruses have an RNA (ribonucleic acid) genome instead of DNA. * **Reverse Transcriptase:** This enzyme is unique to retroviruses and allows them to convert their RNA into DNA. It essentially acts as a molecular "copy machine" for the virus's genetic material. * **Integration:** The newly synthesized DNA integrates into the host cell's genome, becoming part of its DNA. This integration can be permanent or semi-permanent. * **Replication:** Once integrated, the viral DNA can be transcribed back into RNA, which is then used to produce new virus particles. These particles are released from the infected cell, infecting other cells. **Key Characteristics and Implications:** * **Chronic Infections:** Because retroviruses integrate their genetic material into the host's genome, they can cause chronic infections that persist for years or even a lifetime. * **Cancer:** Retroviral integration can disrupt normal cellular processes, leading to uncontrolled cell growth and cancer development. * **Examples:** Some well-known examples of retroviruses include HIV (Human Immunodeficiency Virus), which causes AIDS, and HTLV (Human T-lymphotropic virus), which can cause adult T-cell leukemia. In summary, a retrovirus is a unique type of virus that uses reverse transcriptase to integrate its RNA genome into the host cell's DNA, leading to chronic infections and potential cancer development. ","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
```

### 4) UI compatibility

This server exposes an **OpenAI-compatible Chat Completions API**.
You can connect it to any OpenAI-compatible client or UI (for example: [Open WebUI](https://github.com/open-webui/open-webui?utm_source=chatgpt.com)).

- Configure your client with the API base `http://:8080` (with your board's address before the port) and use the endpoint `/rkllm_chat`.
- Make sure the `model` field matches the converted model’s name, for example:

```json
{
  "model": "MedGemma-4B-IT",
  "messages": [{"role":"user","content":"Hello!"}],
  "stream": false
}
```

# License

This conversion follows the license of the source model: [Health AI Developer Foundations Terms of Use | Google for Developers](https://developers.google.com/health-ai-developer-foundations/terms).

- **Attribution:** Built with MedGemma (Google Health AI Developer Foundations).
- **Required notice:** see [`NOTICE`](NOTICE)

For more information on the deployment and use of `.rkllm` models on RK3588 platforms, please refer to the RKNN-LLM toolkit documentation.
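
As a complement to the `curl` example in the Quick start, the same request and response formats can be exercised from Python. This is a minimal sketch using only the standard library, assuming the Flask server from step 2 is running and reachable. The names `BASE_URL`, `build_payload`, `extract_reply`, and `chat` are hypothetical helpers introduced here for illustration, and the address in `BASE_URL` is a placeholder, not a value from the toolkit.

```python
import json
import urllib.request

# Placeholder address (an assumption): replace with your board's address.
# The port and endpoint path are the ones used in the curl example.
BASE_URL = "http://127.0.0.1:8080"
ENDPOINT = "/rkllm_chat"


def build_payload(prompt, model="MedGemma-4B-IT", stream=False):
    """Build a request body in the format shown under 'Sending a request'."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def extract_reply(response):
    """Return the assistant text from a parsed (non-streaming) response."""
    return response["choices"][0]["message"]["content"]


def chat(prompt, base_url=BASE_URL, timeout=300):
    """POST a single prompt to the server and return the assistant's reply."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        base_url + ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Generation on an SBC can be slow, hence the generous default timeout.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_reply(json.loads(resp.read().decode("utf-8")))
```

With the server running, `print(chat("What is a retrovirus?"))` should print only the `content` field of the response rather than the full JSON envelope. Streaming responses (`"stream": true`) are not handled by this sketch.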