|
|
--- |
|
|
library_name: transformers |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- meta-llama/Meta-Llama-3-8B-Instruct |
|
|
--- |
|
|
|
|
|
|
|
|
|
|
# Llama-3-8B-Hernia-Analyst-263-Patients |
|
|
|
|
|
This is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`, specialized for analyzing patient narratives related to Abdominal Wall Hernia (AWH). |
|
|
|
|
|
The model is designed to act as an "AI Research Assistant." It takes unstructured, free-text patient stories as input and transforms them into a structured JSON output. This output is based on a specific Quality of Life (QoL) framework derived from clinical research, notably the work published in *Hernia (2022) 26:795–808*. |
|
|
|
|
|
This model was trained as a proof-of-concept project on a dataset of 263 synthetic patients. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
The primary goal of this model is to automate the time-consuming process of qualitative analysis. It identifies key patient-reported outcomes across five core domains: |
|
|
- Body Image |
|
|
- Mental Health |
|
|
- Symptoms and Function |
|
|
- Interpersonal Relationships |
|
|
- Employment |
|
|
|
|
|
For each domain, the model performs a multi-level analysis, identifying the presence of the theme, its sentiment, key illustrative quotes, and the specific subthemes and clinical concepts mentioned by the patient. |
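
The exact output schema is defined by the fine-tuning data; the snippet below is only an illustrative sketch of its general shape for the narrative used later in this card, and the field names are assumptions rather than the authoritative format:

```json
{
  "Body Image": {
    "present": true,
    "sentiment": "negative",
    "quotes": ["I avoid looking at myself without a shirt on."],
    "subthemes": ["Altered body image"],
    "concepts": ["Feeling deformed"]
  },
  "Mental Health": { "present": false },
  "Symptoms and Function": {
    "present": true,
    "sentiment": "negative",
    "quotes": ["I can't even lift my grocery bags without feeling a sharp pull."],
    "subthemes": ["Pain", "Physical limitations"],
    "concepts": ["Chronic pain", "Restricted lifting"]
  },
  "Interpersonal Relationships": { "present": false },
  "Employment": { "present": false }
}
```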
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is intended for **research and prototyping purposes only**. Its primary use is to process free-text patient narratives (e.g., from interview transcripts or questionnaires) and generate a structured, machine-readable analysis that can support cohort-level studies, data visualization, or help clinicians quickly grasp a patient's key QoL issues.
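
As a rough illustration of cohort-level use, the per-patient JSON analyses can be flattened into a table for aggregate statistics. The file paths and field names below are hypothetical and follow the illustrative schema sketched earlier; `pandas` is not part of the pinned requirements:

```python
import json
import pandas as pd

# Hypothetical files, one model-generated analysis per patient
paths = ["analyses/patient_001.json", "analyses/patient_002.json"]

rows = []
for path in paths:
    with open(path) as f:
        analysis = json.load(f)
    # One row per (patient, domain), assuming each top-level key is a QoL domain
    for domain, result in analysis.items():
        rows.append({
            "patient": path,
            "domain": domain,
            "present": result.get("present", False),
            "sentiment": result.get("sentiment"),
        })

df = pd.DataFrame(rows)
# Example cohort-level summary: share of patients reporting each domain
print(df.groupby("domain")["present"].mean())
```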
|
|
|
|
|
**This is not a medical device.** The output should not be used for clinical diagnosis, treatment decisions, or any direct patient care without verification by a qualified healthcare professional. |
|
|
|
|
|
## How to Use |
|
|
|
|
|
The model requires the prompt to be formatted in the Llama 3 Instruct template. The following Python code shows how to load the model and run inference on a new patient narrative. |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
import json |
|
|
|
|
|
# Your model's unique ID on the Hugging Face Hub |
|
|
model_name = "Laxmikant17/Llama-3-8B-Hernia-Analyst-600-Patients" |
|
|
|
|
|
print(f"Loading fine-tuned model: {model_name}") |
|
|
|
|
|
# For running on a smaller GPU, it's recommended to load in 4-bit |
|
|
# from transformers import BitsAndBytesConfig |
|
|
# bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16) |
|
|
# model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config, device_map="auto") |
|
|
|
|
|
# For running on a larger GPU or CPU |
|
|
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto") |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
model.eval() |
|
|
|
|
|
print("✅ Model loaded successfully!") |
|
|
|
|
|
# Prepare your test narrative |
|
|
test_narrative = """ |
|
|
The pain is the worst part. It's a constant, burning sensation that gets worse when I stand for more than ten minutes. I can't even lift my grocery bags without feeling a sharp pull. I also feel deformed. I avoid looking at myself without a shirt on. I just want to feel normal again. |
|
|
""" |
|
|
|
|
|
# Format the prompt using the exact Llama 3 Instruct template |
|
|
instruction = "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH) and generate a structured JSON output that summarizes your findings, adhering to the specified format and terminology." |
|
|
prompt = f"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{instruction}\n\n**Patient Narrative:**\n{test_narrative}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" |
|
|
|
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
|
|
|
# Generate the analysis |
|
|
print("\n🚀 Generating analysis...") |
|
|
with torch.no_grad(): |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=4096, |
|
|
do_sample=False |
|
|
) |
|
|
|
|
|
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
|
|
|
|
|
# Extract the JSON portion of the model's response and pretty-print it
|
|
try: |
|
|
assistant_response_start = decoded_output.find('assistant\n\n') |
|
|
response_part = decoded_output[assistant_response_start + len('assistant\n\n'):].strip() |
|
|
json_start = response_part.find('{') |
|
|
json_end = response_part.rfind('}') + 1 |
|
|
json_string = response_part[json_start:json_end] |
|
|
|
|
|
print("\n--- ✅ MODEL-GENERATED ANALYSIS ---") |
|
|
parsed_json = json.loads(json_string) |
|
|
print(json.dumps(parsed_json, indent=2)) |
|
|
except Exception as e: |
|
|
print(f"\n--- 🚨 ERROR: Could not parse the model's response. ---") |
|
|
print(f"Error: {e}") |
|
|
print("\nFull output for debugging:") |
|
|
print(decoded_output) |
|
|
``` |
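
If you prefer not to hand-write the special tokens, the same Llama 3 Instruct prompt can be built from the tokenizer's chat template via `tokenizer.apply_chat_template` (available in the pinned `transformers` versions). This sketch reuses the `instruction`, `test_narrative`, `tokenizer`, and `model` variables from the example above and assumes the fine-tuned tokenizer keeps the base model's chat template:

```python
messages = [
    {"role": "user", "content": f"{instruction}\n\n**Patient Narrative:**\n{test_narrative}"}
]

# add_generation_prompt=True appends the assistant header so the model starts its answer
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=4096, do_sample=False)
```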
|
|
## Training Data |
|
|
|
|
|
The model was fine-tuned on a dataset of **263 synthetic patient profiles**. |
|
|
|
|
|
The training data was generated via a two-step process: |
|
|
1. **Narrative Generation:** Patient profiles with specific metadata were created, and narratives were written to reflect the 5 key QoL domains. |
|
|
2. **Analysis Generation:** The target `output` JSON for each patient was generated by a powerful "teacher" model (`gemini-1.5-pro-latest`). This teacher model was guided by a highly detailed prompt that included the full QoL framework (domains, subthemes, and concepts) derived from the source research papers, which ensured the training data was high-quality, structured, and clinically relevant. A rough sketch of this pairing step is shown below.
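
The original generation scripts are not part of this repository; the following is only a rough sketch of the pairing step using the `google-generativeai` SDK. The prompt wording, record fields, and helper function are assumptions for illustration:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder
teacher = genai.GenerativeModel("gemini-1.5-pro-latest")

# The full QoL framework (domains, subthemes, concepts) would be embedded here
FRAMEWORK_PROMPT = "..."

def build_training_example(narrative: str) -> dict:
    """Pair one synthetic narrative with the teacher model's JSON analysis."""
    response = teacher.generate_content(
        f"{FRAMEWORK_PROMPT}\n\n**Patient Narrative:**\n{narrative}\n\nReturn only the JSON analysis."
    )
    return {
        "instruction": "Analyze the provided patient narrative about their experience with an Abdominal Wall Hernia (AWH)...",
        "input": narrative,
        "output": response.text,  # target JSON used as the fine-tuning label
    }
```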
|
|
|
|
|
## Training Procedure |
|
|
|
|
|
The model was fine-tuned using the QLoRA (Quantized Low-Rank Adaptation) technique for memory-efficient training; a configuration sketch matching the settings below is shown after the hyperparameter list.
|
|
|
|
|
- **Frameworks:** `transformers`, `peft`, `bitsandbytes`, `trl` |
|
|
- **Base Model:** `meta-llama/Meta-Llama-3-8B-Instruct` |
|
|
- **Hardware:** Single NVIDIA T4 GPU via Google Colab. |
|
|
|
|
|
### Key Hyperparameters: |
|
|
- `learning_rate`: 2e-4 |
|
|
- `lora_r` (rank): 8 |
|
|
- `lora_alpha`: 16 |
|
|
- `num_train_epochs`: 1 |
|
|
- `per_device_train_batch_size`: 1 |
|
|
- `gradient_accumulation_steps`: 8 (Effective batch size: 8) |
|
|
- `optimizer`: paged_adamw_8bit |
|
|
- `lr_scheduler_type`: cosine |
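
A minimal sketch of a matching QLoRA setup with `trl`'s `SFTTrainer` is shown below, assuming the hyperparameters above. Values not reported in this card (LoRA dropout, sequence length, the dataset itself) are placeholders:

```python
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"

# 4-bit NF4 quantization (QLoRA); float16 compute, since the T4 does not support bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter matching the reported rank and alpha; dropout and bias settings are assumptions
peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")

training_args = TrainingArguments(
    output_dir="llama3-hernia-analyst-qlora",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    fp16=True,
    logging_steps=10,
)

# Placeholder dataset: each row holds one fully formatted Llama 3 prompt plus target JSON
train_dataset = Dataset.from_list([{"text": "<|begin_of_text|>...<|eot_id|>"}])

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=2048,  # assumption
)
trainer.train()
```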
|
|
|
|
|
## Limitations and Bias |
|
|
|
|
|
- **Synthetic Data Bias:** The model has only ever been trained on synthetic data. Its understanding of patient language is limited by the style and content of these narratives. It has **not** been validated on real-world patient text and may not perform as well on transcripts with different dialects, slang, or clinical complexity. |
|
|
- **Not a Medical Device:** This is a research prototype. It is not intended for and should not be used for making any form of clinical diagnosis or treatment decisions. |
|
|
- **Potential for Hallucination:** While trained to follow a strict format, the model may occasionally generate concepts or subthemes that are not in the official terminology list. All outputs should be reviewed by a human. |
|
|
- **Language:** The model was trained exclusively on English-language narratives. |
|
|
|
|
|
## Dependencies |
|
|
|
|
|
To run this model, you will need the following Python libraries. You can install them with `pip`. |
|
|
|
|
|
```text |
|
|
# requirements.txt |
|
|
|
|
|
transformers==4.40.1 |
|
|
datasets==2.18.0 |
|
|
accelerate==0.29.3 |
|
|
peft==0.10.0 |
|
|
bitsandbytes==0.43.0 |
|
|
trl==0.8.6 |
|
|
torch |
|
|
``` |
|
|
If installing in Google Colab, the following cell pins compatible versions and then restarts the runtime so the newly installed packages are picked up:

```python
# Remove Colab's preinstalled sentence-transformers to avoid dependency conflicts
!pip uninstall -y sentence-transformers
# Install a CUDA 12.1 build of PyTorch
!pip install torch==2.3.1+cu121 torchvision==0.18.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
# Pinned fine-tuning and inference dependencies
!pip install -q "transformers==4.43.2" "datasets==2.18.0" "accelerate==0.29.3" "peft==0.10.0" "bitsandbytes==0.43.1" "trl==0.8.6" "protobuf==3.20.3"
!pip install -q einops scipy sentencepiece tensorboard

# Kill the current process so Colab restarts the runtime with the new packages
import os
os.kill(os.getpid(), 9)
```