---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
---
# Llama-3-8B-Hernia-Analyst-263-Patients
This is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`, specialized for analyzing patient narratives related to Abdominal Wall Hernia (AWH).
The model is designed to act as an "AI Research Assistant." It takes unstructured, free-text patient stories as input and transforms them into a structured JSON output. This output is based on a specific Quality of Life (QoL) framework derived from clinical research, notably the work published in *Hernia (2022) 26:795–808*.
This model was trained as a proof-of-concept project on a dataset of 263 synthetic patients.
## Model Description
The primary goal of this model is to automate the time-consuming process of qualitative analysis. It identifies key patient-reported outcomes across five core domains:
- Body Image
- Mental Health
- Symptoms and Function
- Interpersonal Relationships
- Employment
For each domain, the model performs a multi-level analysis, identifying the presence of the theme, its sentiment, key illustrative quotes, and the specific subthemes and clinical concepts mentioned by the patient.
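To make the target structure concrete, here is a hypothetical sketch of what one domain entry in the generated JSON might look like (the key names are illustrative only; the actual schema is fixed by the training data):

```python
# Hypothetical sketch of one domain entry in the model's JSON output.
# Key names are illustrative; the real schema is defined by the training data.
example_domain_entry = {
    "Body Image": {
        "theme_present": True,
        "sentiment": "negative",
        "illustrative_quotes": ["I avoid looking at myself without a shirt on."],
        "subthemes": ["feeling deformed"],
    }
}
```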
## Intended Use
This model is intended for **research and prototyping purposes only**. Its primary use is to process free-text patient narratives (e.g., from interview transcripts or questionnaires) and generate a structured, machine-readable analysis that can be used for cohort-level studies, data visualization, or to assist clinicians in quickly grasping the key QoL issues for a patient.
**This is not a medical device.** The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification by a qualified healthcare professional.
## How to Use
The model requires the prompt to be formatted in the Llama 3 Instruct template. The following Python code shows how to load the model and run inference on a new patient narrative.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import json
# The model's ID on the Hugging Face Hub
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-263-Patients"
print(f"Loading fine-tuned model: {model_name}")
# For running on a smaller GPU, it's recommended to load in 4-bit
# from transformers import BitsAndBytesConfig
# bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16)
# model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config, device_map="auto")
# For running on a larger GPU or CPU
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()
print("✅ Model loaded successfully!")
# Prepare your test narrative
test_narrative = """
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again.
"""
# Format the prompt using the exact Llama 3 Instruct template
instruction = "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH) and generate a structured JSON output that summarizes your findings, adhering to the specified format and terminology."
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate the analysis
print("\n🚀 Generating analysis...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,  # Llama 3 has no dedicated pad token
    )
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Robustly extract and print the JSON from the model's response
try:
    assistant_response_start = decoded_output.find('assistant\n\n')
    response_part = decoded_output[assistant_response_start + len('assistant\n\n'):].strip()
    json_start = response_part.find('{')
    json_end = response_part.rfind('}') + 1
    json_string = response_part[json_start:json_end]
    print("\n--- ✅ MODEL-GENERATED ANALYSIS ---")
    parsed_json = json.loads(json_string)
    print(json.dumps(parsed_json, indent=2))
except Exception as e:
    print("\n--- 🚨 ERROR: Could not parse the model's response. ---")
    print(f"Error: {e}")
    print("\nFull output for debugging:")
    print(decoded_output)
```
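As an alternative to hand-writing the template string, recent `transformers` versions can build the same prompt with the tokenizer's built-in chat template. This sketch reuses the `instruction` and `test_narrative` variables from above and assumes the fine-tune kept the stock Llama 3 Instruct template:

```python
# Build the prompt via the tokenizer's chat template instead of a manual f-string.
messages = [
    {"role": "user", "content": f"{instruction}\n\n**Patient Narrative:**\n{test_narrative}"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model starts its reply
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=4096, do_sample=False)
```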
## Training Data
The model was fine-tuned on a dataset of **263 synthetic patient profiles**.
The training data was generated via a two-step process:
1. **Narrative Generation:** Patient profiles with specific metadata were created, and narratives were written to reflect the 5 key QoL domains.
2. **Analysis Generation:** The target `output` JSON for each patient was generated by a powerful "teacher" model (`gemini-1.5-pro-latest`). This teacher model was guided by a highly detailed prompt that included the full QoL framework (domains, subthemes, and concepts) derived from the source research papers. This ensured the training data was high-quality, structured, and clinically relevant.
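The resulting fine-tuning pairs can be pictured as simple instruction/input/output records. A hypothetical sketch (the field names are illustrative, not the dataset's actual schema):

```python
# Hypothetical shape of one training record; field names are illustrative only.
training_record = {
    "instruction": "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH)...",
    "input": "<synthetic patient narrative>",
    "output": "<teacher-generated JSON analysis following the QoL framework>",
}
```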
## Training Procedure
The model was fine-tuned using the QLoRA (Quantized Low-Rank Adaptation) technique for memory-efficient training.
- **Frameworks:** `transformers`, `peft`, `bitsandbytes`, `trl`
- **Base Model:** `meta-llama/Meta-Llama-3-8B-Instruct`
- **Hardware:** Single NVIDIA T4 GPU via Google Colab.
### Key Hyperparameters
- `learning_rate`: 2e-4
- `lora_r` (rank): 8
- `lora_alpha`: 16
- `num_train_epochs`: 1
- `per_device_train_batch_size`: 1
- `gradient_accumulation_steps`: 8 (Effective batch size: 8)
- `optimizer`: paged_adamw_8bit
- `lr_scheduler_type`: cosine
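For reference, here is a minimal sketch of how these hyperparameters map onto `peft` and `transformers` configuration objects; `output_dir` is an assumption, not a value recorded from the original run:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter configuration matching the listed hyperparameters.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    task_type="CAUSAL_LM",
)

# Trainer settings matching the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="llama3-hernia-analyst",  # assumption: any local path works
    learning_rate=2e-4,
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
)
```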
## Limitations and Bias
- **Synthetic Data Bias:** The model has only ever been trained on synthetic data. Its understanding of patient language is limited by the style and content of these narratives. It has **not** been validated on real-world patient text and may not perform as well on transcripts with different dialects, slang, or clinical complexity.
- **Not a Medical Device:** This is a research prototype. It is not intended for and should not be used for making any form of clinical diagnosis or treatment decisions.
- **Potential for Hallucination:** While trained to follow a strict format, the model may occasionally generate concepts or subthemes that are not in the official terminology list. All outputs should be reviewed by a human.
- **Language:** The model was trained exclusively on English-language narratives.
## Dependencies
To run this model, you will need the following Python libraries. You can install them with `pip`.
```text
# requirements.txt
transformers==4.40.1
datasets==2.18.0
accelerate==0.29.3
peft==0.10.0
bitsandbytes==0.43.0
trl==0.8.6
torch
```
Alternatively, in Google Colab you can install pinned versions directly and then restart the runtime so the new packages take effect. Note that this cell pins slightly newer `transformers` and `bitsandbytes` versions than the `requirements.txt` above.
```python
# Remove Colab's preinstalled sentence-transformers to avoid dependency conflicts
!pip uninstall -y sentence-transformers

# Install CUDA 12.1 builds of PyTorch
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121

# Pinned fine-tuning/inference stack
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

# Kill the current process to force a Colab runtime restart,
# so the newly installed packages are picked up on the next import
import os
os.kill(os.getpid(), 9)
```