# BioLLama LLM Adapters

## Model Description
BioLLama LLM Adapters are lightweight, parameter-efficient fine-tuning (PEFT) weights designed to enhance the clinical reasoning capabilities of the Llama 3.2 architecture.

These adapters were trained with QLoRA (Quantized Low-Rank Adaptation) on the ContactDoctor Bio-Medical Llama-3.2-1B base model. The primary objective of the fine-tuning is to improve Chain-of-Thought (CoT) generation for medical diagnostics and question answering, prioritizing step-by-step logical derivation over direct answer prediction.
## Technical Specifications
| Configuration | Details |
|---|---|
| Base Model | ContactDoctor/Bio-Medical-Llama-3-2-1B-CoT-012025 |
| Architecture | Llama 3.2 (1B parameters) |
| Adaptation Method | LoRA (Low-Rank Adaptation) |
| Quantization | 4-bit (NF4) via bitsandbytes |
| Target Modules | Attention Projections (q_proj, v_proj) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Training Epochs | 3 |
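
For reference, the specifications above correspond to roughly the following `peft` and `bitsandbytes` configuration. This is a minimal sketch reconstructed from the table, not the exact training script; the dropout value is an assumption, as it is not listed above.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the base model, per the specs (QLoRA setup)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters from the table above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,  # assumed value; not specified in the table
    bias="none",
    task_type="CAUSAL_LM",
)
```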
## Performance and Evaluation
The model was evaluated on the MedMCQA validation set and a curated subset of NEET PG 2024 (National Eligibility cum Entrance Test for Post-Graduation) clinical scenario questions.
| Metric | Result | Notes |
|---|---|---|
| NEET PG Clinical Subset | 72.7% | Zero-shot accuracy on text-based clinical reasoning questions. |
| Validation Accuracy | 40.0% | MedMCQA validation split. |
| Inference Mode | Greedy Decoding | Evaluated without sampling to ensure deterministic outputs. |
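
As an illustration only, a single zero-shot evaluation step for a MedMCQA-style item might look like the sketch below. The prompt template and answer parsing here are assumptions, not the exact evaluation harness used for the scores above.

```python
import re

def answer_mcq(model, tokenizer, question, options):
    # Hypothetical prompt format: question, lettered options, then "Answer:"
    prompt = question + "\n" + "\n".join(
        f"{letter}. {text}" for letter, text in zip("ABCD", options)
    ) + "\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding (do_sample=False) for deterministic evaluation
    outputs = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    completion = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    match = re.search(r"[ABCD]", completion)  # first option letter emitted
    return match.group(0) if match else None
```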
## Usage

### Prerequisites
To use these adapters, ensure `peft`, `transformers`, and `bitsandbytes` are installed:

```bash
pip install transformers peft torch bitsandbytes accelerate
```
### Inference Pipeline

The following script demonstrates how to load the base model and apply the BioLLama adapters.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL_ID = "ContactDoctor/Bio-Medical-Llama-3-2-1B-CoT-012025"
ADAPTER_ID = "calender/BioLLama-LLM-Adapters"

def load_inference_model():
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        device_map="auto",
        torch_dtype=torch.float16,
    )
    # Attach the LoRA adapter weights on top of the base model
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    return model, tokenizer

model, tokenizer = load_inference_model()

query = "A 45-year-old presents with fatigue and low hemoglobin. Suggest initial line of management."
inputs = tokenizer(query, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding: deterministic outputs for medical queries
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
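
For deployment, the LoRA weights can optionally be folded into the base model with `peft`'s `merge_and_unload`, so runtime inference no longer needs the adapter wrapper. The save path below is illustrative.

```python
# Fold the LoRA weights into the base model (returns a plain transformers model)
merged_model = model.merge_and_unload()

# Save as a standalone checkpoint; "./biollama-merged" is an illustrative path
merged_model.save_pretrained("./biollama-merged")
tokenizer.save_pretrained("./biollama-merged")
```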
## Limitations and Disclaimer

- **Research Use Only:** This model is intended for academic research and development purposes. It is not a certified medical device.
- **Clinical Decision Making:** The outputs of this model should not be used for direct patient care, diagnosis, or treatment planning without verification by a qualified healthcare professional.
- **Hallucinations:** As with all large language models, this model may generate plausible-sounding but factually incorrect medical information.
## Citation

If you use this work, please cite it as follows:

```bibtex
@misc{calendar2025biollama,
  title     = {BioLLama LLM Adapters: Fine-tuned Medical Reasoning System},
  author    = {Calendar, S.},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/calender/BioLLama-LLM-Adapters}
}
```