Llama-3-8B-Hernia-Analyst-263-Patients

This is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct, specialized for analyzing patient narratives related to Abdominal Wall Hernia (AWH).

The model is designed to act as an "AI Research Assistant." It takes unstructured, free-text patient stories as input and transforms them into a structured JSON output. This output is based on a specific Quality of Life (QoL) framework derived from clinical research, notably the work published in Hernia (2022) 26:795–808.

This model was trained as a proof-of-concept project on a dataset of 263 synthetic patients.

Model Description

The primary goal of this model is to automate the time-consuming process of qualitative analysis. It identifies key patient-reported outcomes across five core domains:

  • Body Image
  • Mental Health
  • Symptoms and Function
  • Interpersonal Relationships
  • Employment

For each domain, the model performs a multi-level analysis, identifying the presence of the theme, its sentiment, key illustrative quotes, and the specific subthemes and clinical concepts mentioned by the patient.
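
For illustration, an analysis of a short narrative might look like the sketch below. The field names and subtheme labels here are simplified assumptions for readability, not the model's exact schema:

{
  "Symptoms and Function": {
    "theme_present": true,
    "sentiment": "Negative",
    "key_quotes": ["It's a constant, burning sensation that gets worse when I stand for more than ten minutes."],
    "subthemes": ["Pain", "Restricted activity"]
  },
  "Body Image": {
    "theme_present": true,
    "sentiment": "Negative",
    "key_quotes": ["I avoid looking at myself without a shirt on."],
    "subthemes": ["Altered body image"]
  }
}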

Intended Use

This model is intended for research and prototyping purposes only. Its primary use is to process free-text patient narratives (e.g., from interview transcripts or questionnaires) and generate a structured, machine-readable analysis for cohort-level studies and data visualization, or to help clinicians quickly grasp a patient's key QoL issues.

This is not a medical device. The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification by a qualified healthcare professional.

How to Use

The model requires the prompt to be formatted in the Llama 3 Instruct template. The following Python code shows how to load the model and run inference on a new patient narrative.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import json

# The fine-tuned model's ID on the Hugging Face Hub
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-600-Patients"

print(f"Loading fine-tuned model: {model_name}")

# For running on a smaller GPU, it's recommended to load in 4-bit
# from transformers import BitsAndBytesConfig
# bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16)
# model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config, device_map="auto")

# For running on a larger GPU or CPU (bfloat16 halves memory versus the fp32 default)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

print("βœ… Model loaded successfully!")

# Prepare your test narrative
test_narrative = """
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again.
"""

# Format the prompt using the exact Llama 3 Instruct template
instruction = "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH) and generate a structured JSON output that summarizes your findings, adhering to the specified format and terminology."
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)  # the prompt already contains <|begin_of_text|>, so don't add a second BOS

# Generate the analysis
print("\nπŸš€ Generating analysis...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id  # Llama 3 has no pad token; this avoids a generate() warning
    )

decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Robustly extract and print the JSON from the model's response
try:
    # Everything after the assistant header is the model's response
    marker = 'assistant\n\n'
    marker_pos = decoded_output.find(marker)
    response_part = decoded_output[marker_pos + len(marker):].strip() if marker_pos != -1 else decoded_output
    json_start = response_part.find('{')
    json_end = response_part.rfind('}') + 1
    json_string = response_part[json_start:json_end]

    print("\n--- βœ… MODEL-GENERATED ANALYSIS ---")
    parsed_json = json.loads(json_string)
    print(json.dumps(parsed_json, indent=2))
except Exception as e:
    print(f"\n--- 🚨 ERROR: Could not parse the model's response. ---")
    print(f"Error: {e}")
    print("\nFull output for debugging:")
    print(decoded_output)
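
If you prefer not to hand-write the special tokens, recent versions of transformers can build the same Llama 3 Instruct prompt from the tokenizer's bundled chat template. A minimal sketch, equivalent to the manual formatting above:

# Equivalent prompt construction via the tokenizer's chat template
messages = [{"role": "user", "content": f"{instruction}\n\n**Patient Narrative:**\n{test_narrative}"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids=input_ids, max_new_tokens=4096, do_sample=False, pad_token_id=tokenizer.eos_token_id)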

Training Data

The model was fine-tuned on a dataset of 263 synthetic patient profiles.

The training data was generated via a two-step process:

  1. Narrative Generation: Patient profiles with specific metadata were created, and narratives were written to reflect the five key QoL domains.
  2. Analysis Generation: The target output JSON for each patient was generated by a powerful "teacher" model (gemini-1.5-pro-latest), guided by a detailed prompt containing the full QoL framework (domains, subthemes, and concepts) derived from the source research papers. This ensured the training data was high quality, structured, and clinically relevant.
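
For illustration, the analysis-generation step might look like the following sketch using the google-generativeai client. The framework prompt and helper function are hypothetical stand-ins; the original generation script is not part of this repository:

# Hypothetical sketch of the "teacher" analysis-generation step
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder
teacher = genai.GenerativeModel("gemini-1.5-pro-latest")

# The full QoL framework prompt (domains, subthemes, concepts) is not reproduced here
QOL_FRAMEWORK_PROMPT = "..."

def generate_target_json(narrative: str) -> str:
    """Ask the teacher model for the structured QoL analysis of one narrative."""
    response = teacher.generate_content(f"{QOL_FRAMEWORK_PROMPT}\n\n**Patient Narrative:**\n{narrative}")
    return response.text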

Training Procedure

The model was fine-tuned using the QLoRA (Quantized Low-Rank Adaptation) technique for memory-efficient training.

  • Frameworks: transformers, peft, bitsandbytes, trl
  • Base Model: meta-llama/Meta-Llama-3-8B-Instruct
  • Hardware: Single NVIDIA T4 GPU via Google Colab.

Key Hyperparameters:

  • learning_rate: 2e-4
  • lora_r (rank): 8
  • lora_alpha: 16
  • num_train_epochs: 1
  • per_device_train_batch_size: 1
  • gradient_accumulation_steps: 8 (Effective batch size: 8)
  • optimizer: paged_adamw_8bit
  • lr_scheduler_type: cosine
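
A minimal sketch of how these pieces fit together with trl's SFTTrainer, assuming the pinned library versions listed under Dependencies. The dataset construction, LoRA dropout, and sequence length below are assumptions, not the exact training script:

import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"

# 4-bit NF4 quantization (the "Q" in QLoRA); fp16 compute because the T4 has no bfloat16 support
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token

peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,  # assumption; not listed above
    bias="none",
    task_type="CAUSAL_LM",
)

args = TrainingArguments(
    output_dir="./hernia-analyst-qlora",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    fp16=True,
    logging_steps=10,
)

# Placeholder dataset; the real one holds 263 formatted narrative -> JSON examples
train_dataset = Dataset.from_list([{"text": "<llama-3-formatted training example>"}])

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=4096,  # assumption
)
trainer.train()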

Limitations and Bias

  • Synthetic Data Bias: The model has only ever been trained on synthetic data. Its understanding of patient language is limited by the style and content of these narratives. It has not been validated on real-world patient text and may not perform as well on transcripts with different dialects, slang, or clinical complexity.
  • Not a Medical Device: This is a research prototype. It is not intended for and should not be used for making any form of clinical diagnosis or treatment decisions.
  • Potential for Hallucination: While trained to follow a strict format, the model may occasionally generate concepts or subthemes that are not in the official terminology list. All outputs should be reviewed by a human; a simple automated pre-check is sketched after this list.
  • Language: The model was trained exclusively on English-language narratives.
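
Because of the hallucination risk above, a lightweight post-hoc check can flag out-of-vocabulary subthemes before human review. A hypothetical sketch; the terminology set and JSON field names are assumptions (the full official list is not reproduced here):

# Hypothetical guard: flag subthemes missing from the official terminology list
OFFICIAL_SUBTHEMES = {"Pain", "Restricted activity", "Altered body image"}  # placeholder subset

def flag_unknown_subthemes(analysis: dict) -> list:
    unknown = []
    for domain, result in analysis.items():
        for subtheme in result.get("subthemes", []):
            if subtheme not in OFFICIAL_SUBTHEMES:
                unknown.append(f"{domain}: {subtheme}")
    return unknown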

Dependencies

To run this model, you will need the following Python libraries (versions pinned to those used during development):

# requirements.txt
torch==2.3.1
transformers==4.43.2
datasets==2.18.0
accelerate==0.29.3
peft==0.10.0
bitsandbytes==0.43.1
trl==0.8.6
protobuf==3.20.3
einops
scipy
sentencepiece
tensorboard

In a Google Colab notebook, the same environment can be set up with the commands below. The final os.kill call force-restarts the runtime so the newly installed versions take effect:

!pip uninstall -y sentence-transformers
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

import os
os.kill(os.getpid(), 9)  # restart the Colab runtime to pick up the new packages