|
|
--- |
|
|
library_name: transformers |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- meta-llama/Meta-Llama-3-8B-Instruct |
|
|
--- |
|
|
|
|
|
|
|
|
|
|
# Llama-3-8B-Hernia-Analyst-263-Patients |
|
|
|
|
|
This is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`, specialized for analyzing patient narratives related to Abdominal Wall Hernia (AWH). |
|
|
|
|
|
The model is designed to act as an "AI Research Assistant." It takes unstructured, free-text patient stories as input and transforms them into a structured JSON output. This output is based on a specific Quality of Life (QoL) framework derived from clinical research, notably the work published in *Hernia (2022) 26:795–808*. |
|
|
|
|
|
This model was trained as a proof-of-concept project on a dataset of 263 synthetic patients. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
The primary goal of this model is to automate the time-consuming process of qualitative analysis. It identifies key patient-reported outcomes across five core domains: |
|
|
- Body Image |
|
|
- Mental Health |
|
|
- Symptoms and Function |
|
|
- Interpersonal Relationships |
|
|
- Employment |
|
|
|
|
|
For each domain, the model performs a multi-level analysis, identifying the presence of the theme, its sentiment, key illustrative quotes, and the specific subthemes and clinical concepts mentioned by the patient. |
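
The exact output schema is defined by the fine-tuning data; the snippet below is only an illustrative sketch of its general shape for the narrative used later in this card, and the field names are assumptions rather than the authoritative format:

```json
{
  "Body Image": {
    "present": true,
    "sentiment": "negative",
    "quotes": ["I avoid looking at myself without a shirt on."],
    "subthemes": ["Altered body image"],
    "concepts": ["Feeling deformed"]
  },
  "Mental Health": { "present": false },
  "Symptoms and Function": {
    "present": true,
    "sentiment": "negative",
    "quotes": ["I can't even lift my grocery bags without feeling a sharp pull."],
    "subthemes": ["Pain", "Physical limitations"],
    "concepts": ["Chronic pain", "Restricted lifting"]
  },
  "Interpersonal Relationships": { "present": false },
  "Employment": { "present": false }
}
```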
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is intended for **research and prototyping purposes only**. Its primary use is to process free-text patient narratives (e.g., from interview transcripts or questionnaires) and generate a structured, machine-readable analysis that can support cohort-level studies, data visualization, or help clinicians quickly grasp a patient's key QoL issues.
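
As a rough illustration of cohort-level use, the per-patient JSON analyses can be flattened into a table for aggregate statistics. The file paths and field names below are hypothetical and follow the illustrative schema sketched earlier; `pandas` is not part of the pinned requirements:

```python
import json
import pandas as pd

# Hypothetical files, one model-generated analysis per patient
paths = ["analyses/patient_001.json", "analyses/patient_002.json"]

rows = []
for path in paths:
    with open(path) as f:
        analysis = json.load(f)
    # One row per (patient, domain), assuming each top-level key is a QoL domain
    for domain, result in analysis.items():
        rows.append({
            "patient": path,
            "domain": domain,
            "present": result.get("present", False),
            "sentiment": result.get("sentiment"),
        })

df = pd.DataFrame(rows)
# Example cohort-level summary: share of patients reporting each domain
print(df.groupby("domain")["present"].mean())
```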
|
|
|
|
|
**This is not a medical device.** The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification by a qualified healthcare professional. |
|
|
|
|
|
## How to Use |
|
|
|
|
|
The model requires the prompt to be formatted in the Llama 3 Instruct template. The following Python code shows how to load the model and run inference on a new patient narrative. |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
import json |
|
|
|
|
|
# Your model's unique ID on the Hugging Face Hub |
|
|
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-600-Patients" |
|
|
|
|
|
print(f"Loading fine-tuned model: {model_name}") |
|
|
|
|
|
# For running on a smaller GPU, it's recommended to load in 4-bit |
|
|
# from transformers import BitsAndBytesConfig |
|
|
# bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16) |
|
|
# model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config, device_map="auto") |
|
|
|
|
|
# For running on a larger GPU or CPU |
|
|
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto") |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
model.eval() |
|
|
|
|
|
print("✅ Model loaded successfully!") |
|
|
|
|
|
# Prepare your test narrative |
|
|
test_narrative = """ |
|
|
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again. |
|
|
""" |
|
|
|
|
|
# Format the prompt using the exact Llama 3 Instruct template |
|
|
instruction = "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH) and generate a structured JSON output that summarizes your findings, adhering to the specified format and terminology." |
|
|
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" |
|
|
|
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
|
|
|
# Generate the analysis |
|
|
print("\n🚀 Generating analysis...") |
|
|
with torch.no_grad(): |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=4096, |
|
|
do_sample=False |
|
|
) |
|
|
|
|
|
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
|
|
|
|
|
# Extract the JSON portion of the model's response and pretty-print it
|
|
try: |
|
|
assistant_response_start = decoded_output.find('assistant\n\n') |
|
|
response_part = decoded_output[assistant_response_start + len('assistant\n\n'):].strip() |
|
|
json_start = response_part.find('{') |
|
|
json_end = response_part.rfind('}') + 1 |
|
|
json_string = response_part[json_start:json_end] |
|
|
|
|
|
print("\n--- ✅ MODEL-GENERATED ANALYSIS ---") |
|
|
parsed_json = json.loads(json_string) |
|
|
print(json.dumps(parsed_json, indent=2)) |
|
|
except Exception as e: |
|
|
print(f"\n--- 🚨 ERROR: Could not parse the model's response. ---") |
|
|
print(f"Error: {e}") |
|
|
print("\nFull output for debugging:") |
|
|
print(decoded_output) |
|
|
``` |
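
If you prefer not to hand-write the special tokens, the same Llama 3 Instruct prompt can be built from the tokenizer's chat template via `tokenizer.apply_chat_template` (available in the pinned `transformers` versions). This sketch reuses the `instruction`, `test_narrative`, `tokenizer`, and `model` variables from the example above and assumes the fine-tuned tokenizer keeps the base model's chat template:

```python
messages = [
    {"role": "user", "content": f"{instruction}\n\n**Patient Narrative:**\n{test_narrative}"}
]

# add_generation_prompt=True appends the assistant header so the model starts its answer
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=4096, do_sample=False)
```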
|
|
## Training Data |
|
|
|
|
|
The model was fine-tuned on a dataset of **263 synthetic patient profiles**. |
|
|
|
|
|
The training data was generated via a two-step process: |
|
|
1. **Narrative Generation:** Patient profiles with specific metadata were created, and narratives were written to reflect the 5 key QoL domains. |
|
|
2. **Analysis Generation:** The target `output` JSON for each patient was generated by a powerful "teacher" model (`gemini-1.5-pro-latest`). This teacher model was guided by a highly detailed prompt that included the full QoL framework (domains, subthemes, and concepts) derived from the source research papers, which ensured the training data was high-quality, structured, and clinically relevant. A rough sketch of this pairing step is shown below.
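
The original generation scripts are not part of this repository; the following is only a rough sketch of the pairing step using the `google-generativeai` SDK. The prompt wording, record fields, and helper function are assumptions for illustration:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder
teacher = genai.GenerativeModel("gemini-1.5-pro-latest")

# The full QoL framework (domains, subthemes, concepts) would be embedded here
FRAMEWORK_PROMPT = "..."

def build_training_example(narrative: str) -> dict:
    """Pair one synthetic narrative with the teacher model's JSON analysis."""
    response = teacher.generate_content(
        f"{FRAMEWORK_PROMPT}\n\n**Patient Narrative:**\n{narrative}\n\nReturn only the JSON analysis."
    )
    return {
        "instruction": "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH)...",
        "input": narrative,
        "output": response.text,  # target JSON used as the fine-tuning label
    }
```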
|
|
|
|
|
## Training Procedure |
|
|
|
|
|
The model was fine-tuned using the QLoRA (Quantized Low-Rank Adaptation) technique for memory-efficient training; a configuration sketch matching the settings below is shown after the hyperparameter list.
|
|
|
|
|
- **Frameworks:** `transformers`, `peft`, `bitsandbytes`, `trl` |
|
|
- **Base Model:** `meta-llama/Meta-Llama-3-8B-Instruct` |
|
|
- **Hardware:** Single NVIDIA T4 GPU via Google Colab. |
|
|
|
|
|
### Key Hyperparameters: |
|
|
- `learning_rate`: 2e-4 |
|
|
- `lora_r` (rank): 8 |
|
|
- `lora_alpha`: 16 |
|
|
- `num_train_epochs`: 1 |
|
|
- `per_device_train_batch_size`: 1 |
|
|
- `gradient_accumulation_steps`: 8 (Effective batch size: 8) |
|
|
- `optimizer`: paged_adamw_8bit |
|
|
- `lr_scheduler_type`: cosine |
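
A minimal sketch of a matching QLoRA setup with `trl`'s `SFTTrainer` is shown below, assuming the hyperparameters above. Values not reported in this card (LoRA dropout, sequence length, the dataset itself) are placeholders:

```python
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"

# 4-bit NF4 quantization (QLoRA); float16 compute, since the T4 does not support bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter matching the reported rank and alpha; dropout and bias settings are assumptions
peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")

training_args = TrainingArguments(
    output_dir="llama3-hernia-analyst-qlora",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    fp16=True,
    logging_steps=10,
)

# Placeholder dataset: each row holds one fully formatted Llama 3 prompt plus target JSON
train_dataset = Dataset.from_list([{"text": "<|begin_of_text|>...<|eot_id|>"}])

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=2048,  # assumption
)
trainer.train()
```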
|
|
|
|
|
## Limitations and Bias |
|
|
|
|
|
- **Synthetic Data Bias:** The model has only ever been trained on synthetic data. Its understanding of patient language is limited by the style and content of these narratives. It has **not** been validated on real-world patient text and may not perform as well on transcripts with different dialects, slang, or clinical complexity. |
|
|
- **Not a Medical Device:** This is a research prototype. It is not intended for and should not be used for making any form of clinical diagnosis or treatment decisions. |
|
|
- **Potential for Hallucination:** While trained to follow a strict format, the model may occasionally generate concepts or subthemes that are not in the official terminology list. All outputs should be reviewed by a human. |
|
|
- **Language:** The model was trained exclusively on English-language narratives. |
|
|
|
|
|
## Dependencies |
|
|
|
|
|
To run this model, you will need the following Python libraries. You can install them with `pip`. |
|
|
|
|
|
```text |
|
|
# requirements.txt |
|
|
|
|
|
transformers==4.40.1 |
|
|
datasets==2.18.0 |
|
|
accelerate==0.29.3 |
|
|
peft==0.10.0 |
|
|
bitsandbytes==0.43.0 |
|
|
trl==0.8.6 |
|
|
torch |
|
|
``` |
|
|
If installing in Google Colab, the following cell pins compatible versions and then restarts the runtime so the newly installed packages are picked up:

```python
# Remove Colab's preinstalled sentence-transformers to avoid dependency conflicts
!pip uninstall -y sentence-transformers
# Install a CUDA 12.1 build of PyTorch
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
# Pinned fine-tuning and inference dependencies
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

# Kill the current process so Colab restarts the runtime with the new packages
import os
os.kill(os.getpid(), 9)
```