EpiLLaMA-3.3-70B: Fine-tuned LLaMA for Epidemiological Information Extraction
Model Description
EpiLLaMA-3.3-70B is a fine-tuned version of meta-llama/Llama-3.3-70B-Instruct specialized for extracting structured epidemiological information from unstructured disease outbreak reports. The model was trained on the WHO Disease Outbreak News (DONs) curated database (Carlson et al., 2023) to automatically extract key epidemiological features including disease classification, geographical locations, case counts, temporal information, and outbreak characteristics.
Model Details
- Base Model: meta-llama/Llama-3.3-70B-Instruct
- Base Model License: LLaMA 3.3 Community License Agreement
- Model Type: Causal Language Model (Decoder-only Transformer)
- Fine-tuning Method: Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation)
- Adapter Weights License: CC0-1.0 (Public Domain Dedication) - Note: Only the LoRA adapter weights are released under CC0. The base model weights remain under the LLaMA 3.3 Community License.
- Training Data: WHO Disease Outbreak News curated database (3,338 records through 2019)
- Language: English
- Application Domain: Public health surveillance, epidemic intelligence, epidemiological information extraction
License
Important Licensing Information
This repository contains LoRA adapter weights only, not the full model weights.
Base Model (LLaMA 3.3 70B): Licensed under the LLaMA 3.3 Community License Agreement
- Copyright © Meta Platforms, Inc. All Rights Reserved.
- Users must comply with the LLaMA 3.3 Community License to use the base model
- Acceptable Use Policy and other restrictions apply
LoRA Adapter Weights: Released under CC0 1.0 Universal (Public Domain Dedication)
- The adapter weights can be used without restriction
- However, to use these adapters, you must have access to and comply with the license of the base LLaMA 3.3 70B model
Attribution Required: When using this model, please include the following notice:
LLaMA 3.3 is licensed under the LLaMA 3.3 Community License,
Copyright © Meta Platforms, Inc. All Rights Reserved.
EpiLLaMA-3.3-70B LoRA adapter weights are released under CC0 1.0 Universal (Public Domain).
Distribution Notes
- This repository distributes only the fine-tuned LoRA adapter parameters
- Base model weights are unchanged and must be obtained separately from Meta/Hugging Face
- Users must agree to Meta's LLaMA 3.3 Community License to use the complete model
- The LoRA adapters are applied on top of the base model weights at inference time
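As a minimal sketch of this last point (using the adapter repository identifier from the Usage section below, and the peft library's standard API), the adapters can be attached to the base model at load time and, optionally, merged into the base weights for standalone deployment; note that merged weights are then governed by the base model license:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model (requires acceptance of the LLaMA 3.3 Community License)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapters released in this repository
model = PeftModel.from_pretrained(base, "jrc-ai/EpiLLaMA-3.3-70B")

# Optionally fold the adapters into the base weights for standalone inference
merged = model.merge_and_unload()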
Performance
The model achieved the following results on the evaluation set:
| Metric | Score |
|---|---|
| Rouge-1 | 0.937 ± 0.046 |
| Rouge-2 | 0.896 ± 0.058 |
| Rouge-L | 0.928 ± 0.047 |
| Rouge-Lsum | 0.929 ± 0.049 |
These scores summarise overall performance across 5-fold stratified cross-validation and indicate close agreement between the extracted structured epidemiological information and the reference annotations.
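As an illustration only (not the authors' evaluation code), ROUGE scores of this kind can be computed with the Hugging Face evaluate package by comparing generated JSON strings against the curated reference annotations:

import evaluate  # pip install evaluate rouge_score

# Toy example: predictions and references are serialized JSON records
predictions = ['[{"DiseaseLevel1": "Yellow fever", "Country": "Liberia"}]']
references = ['[{"DiseaseLevel1": "Yellow fever", "Country": "Liberia"}]']

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}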
Training Summary
- Best Training Step: 12,810
- Best Training Loss: 0.0095
- Total Training Steps: 13,325
- Final Training Loss: 0.0105
- Total Improvement: 1.6752 (from initial loss of 1.6847)
Intended Uses & Limitations
Intended Uses
This model is designed for:
- Automated extraction of epidemiological information from disease outbreak reports
- Public health surveillance systems requiring structured data from unstructured sources
- Epidemic intelligence pipelines for rapid outbreak detection and monitoring
- Research purposes in computational epidemiology and public health informatics
Limitations
- The model is trained specifically on WHO DONs format and may require adaptation for other report formats
- Performance on diseases not well-represented in the training data may vary
- The model extracts information present in the text and does not generate or infer missing data
- Designed for English-language outbreak reports only
- Should be used as a decision-support tool, with human verification for critical public health decisions
Extracted Features
The model extracts the following structured epidemiological information:
Disease Information:
- DiseaseLevel1 (primary disease classification)
- DiseaseLevel2 (disease subtype/variant)
Geographical Information:
- Country
- ISO country code
- OutbreakEpicenter (specific location within country)
Case Counts:
- CasesTotal
- CasesSuspected
- CasesProbable
- CasesConfirmed
- Deaths
Temporal Information:
- Outbreak start date (year, month, day)
- Outbreak detection date (year, month, day)
- Outbreak verification date (year, month, day)
- Outbreak end date and status
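For reference, these fields correspond to one JSON object per outbreak. A field template with placeholder values (field names match the output example shown later in this card) can be written as:

# Field template for a single extracted outbreak record (illustrative)
OUTBREAK_FIELDS = {
    "DiseaseLevel1": None, "DiseaseLevel2": None,
    "Country": None, "ISO": None, "OutbreakEpicenter": None,
    "CasesTotal": None, "CasesSuspected": None, "CasesProbable": None,
    "CasesConfirmed": None, "Deaths": None,
    "OutbreakStartYear": None, "OutbreakStartMonth": None, "OutbreakStartDay": None,
    "OutbreakDetectionYear": None, "OutbreakDetectionMonth": None, "OutbreakDetectionDay": None,
    "OutbreakVerificationYear": None, "OutbreakVerificationMonth": None, "OutbreakVerificationDay": None,
    "OutbreakEnd": None, "OutbreakEndYear": None, "OutbreakEndMonth": None, "OutbreakEndDay": None,
}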
Training Procedure
Training Data
The model was trained on the WHO Disease Outbreak News curated database (Carlson et al., 2023), which contains:
- 3,338 structured records of disease outbreaks (data through 2019)
- Curated epidemiological information manually extracted from WHO DONs reports
- Standardized format for disease classifications, geographical locations, case counts, and temporal data
Training Approach
The training followed an instruction-tuning paradigm where unstructured outbreak report text is paired with structured JSON output containing extracted epidemiological features. The prompt format used was:
Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
### Instruction:
Extract disease outbreak information from the given text and format it as JSON.
Return a list containing one JSON object per outbreak mentioned.
Use "None" for missing information. Never invent or guess data.
### Input:
[Outbreak report text]
### Response:
[Extracted JSON with epidemiological features]
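A minimal sketch of how a training example can be assembled from a report text and its curated JSON annotation (an illustrative helper, not the authors' exact preprocessing code):

import json

def build_training_example(report_text: str, extracted_records: list) -> str:
    """Pair an outbreak report with its structured annotation in the prompt format above."""
    return (
        "Below is an instruction that describes a task, paired with an input that provides "
        "further context. Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n"
        "Extract disease outbreak information from the given text and format it as JSON.\n"
        "Return a list containing one JSON object per outbreak mentioned.\n"
        'Use "None" for missing information. Never invent or guess data.\n\n'
        "### Input:\n"
        f"{report_text}\n\n"
        "### Response:\n"
        f"{json.dumps(extracted_records)}"
    )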
Fine-tuning Configuration
LoRA (Low-Rank Adaptation) Parameters:
- Rank (r): 16
- Alpha (α): 16
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Dropout: 0.05
- Task type: CAUSAL_LM
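These parameters translate directly into a peft LoraConfig; a minimal sketch:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,           # rank of the low-rank update matrices
    lora_alpha=16,  # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)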
Training Hyperparameters:
- Learning rate: 1e-5
- Optimizer: AdamW (8-bit paged)
- Training batch size: 4 per device (8 GPUs)
- Gradient accumulation steps: 4
- Number of epochs: 5
- Warmup steps: Adaptive (10% of training steps, max 10)
- FP16 mixed precision training
- Weight decay: 0.01
- LR scheduler: Linear
- Seed: 41
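As a sketch, these hyperparameters map onto a transformers TrainingArguments object roughly as follows; the output directory and the fixed warmup value are illustrative placeholders:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./epillama-checkpoints",  # hypothetical path
    learning_rate=1e-5,
    optim="paged_adamw_8bit",             # 8-bit paged AdamW
    per_device_train_batch_size=4,        # 8 GPUs, so effective batch size 4 * 4 * 8 = 128
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    warmup_steps=10,                      # adaptive in the actual runs (10% of steps, capped at 10)
    fp16=True,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=41,
    logging_steps=10,
)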
Evaluation Strategy:
- 5-fold stratified cross-validation
- Evaluation metric: Training loss (model selection based on lowest training loss)
- Early stopping: After 6 consecutive evaluations without improvement
- Logging steps: 10
- Save steps: Adaptive (10% of training steps)
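A minimal sketch of the 5-fold stratified split using scikit-learn; the stratification variable shown here (the primary disease classification) is an assumption, and the toy data only illustrates the mechanics:

from sklearn.model_selection import StratifiedKFold

records = [f"report_{i}" for i in range(20)]                   # toy stand-ins for outbreak reports
labels = ["Yellow fever", "Cholera", "Ebola", "Measles"] * 5   # toy disease labels used for stratification

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=41)
for fold, (train_idx, eval_idx) in enumerate(skf.split(records, labels)):
    print(f"Fold {fold}: {len(train_idx)} training / {len(eval_idx)} evaluation records")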
Hardware:
- Infrastructure: JRC Big Data Analytics Platform
- System: Linux cluster, Ubuntu 22.04.5 LTS
- CPU: Intel Xeon Platinum 8470 (208 CPUs)
- RAM: 1TB
- GPUs: 8x NVIDIA H100
- Training time: ~30 hours per fold
Quantization
The model uses 8-bit quantization with LoRA during training:
- Load in 8-bit: True
- Quantization type: Standard 8-bit
- Compute dtype: bfloat16
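A minimal sketch of loading the base model in 8-bit for adapter training, using bitsandbytes via transformers; the configuration values reflect the list above, while the exact loading code is an assumption:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)  # standard 8-bit quantization

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,  # compute dtype for non-quantized modules
    device_map="auto",
)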
Usage
Installation
pip install transformers==4.52.4
pip install torch==2.3.1
pip install peft==0.12.0
pip install accelerate==1.7.0
pip install bitsandbytes==0.43.3
Basic Usage
Important: You must have access to the base LLaMA 3.3 70B model and accept Meta's license terms before using these adapter weights.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model (requires LLaMA 3.3 license acceptance)
base_model_id = "meta-llama/Llama-3.3-70B-Instruct"
adapter_model_id = "jrc-ai/EpiLLaMA-3.3-70B" # LoRA adapters
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load tokenizer from base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
# Load and apply LoRA adapters
model = PeftModel.from_pretrained(base_model, adapter_model_id)
# Example outbreak report
outbreak_text = """
WHO has reported 3 suspected cases of yellow fever in Maryland county,
in the south-eastern part of the country. One case with disease onset on
1 August has been confirmed (IgM positive) by the Institut Pasteur in
Abidjan, Côte d'Ivoire. All three cases have died.
"""
# Format prompt
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Extract disease outbreak information from the given text and format it as JSON.
Return a list containing one JSON object per outbreak mentioned.
Always return a list of JSON objects, even for single outbreaks.
Use "None" for missing information. If no outbreak information is found, return an empty list [].
Never invent or guess data.
### Input:
{outbreak_text}
### Response:
"""
# Tokenize and generate
inputs = tokenizer(prompt, return_tensors="pt", truncation=True).to(device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=False,  # greedy decoding for deterministic extraction
        pad_token_id=tokenizer.eos_token_id,
    )
# Decode output
extracted_info = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(extracted_info)
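Continuing the example above, the decoded output still contains the prompt, so a small helper can isolate the generated JSON after the "### Response:" marker and parse it (illustrative post-processing, not part of the released code):

import json

def parse_response(decoded_text: str):
    """Extract and parse the JSON list that follows the '### Response:' marker."""
    response = decoded_text.split("### Response:")[-1].strip()
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        # Fall back to the raw text if the model output is not valid JSON
        return response

records = parse_response(extracted_info)
print(records)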
Expected Output Format
[{
    "DiseaseLevel1": "Yellow fever",
    "DiseaseLevel2": "",
    "Country": "Liberia",
    "ISO": "LBR",
    "OutbreakEpicenter": "Maryland county",
    "CasesTotal": 3,
    "CasesSuspected": 2,
    "CasesProbable": null,
    "CasesConfirmed": 1,
    "Deaths": 3,
    "OutbreakStartYear": 2001,
    "OutbreakStartMonth": 8,
    "OutbreakStartDay": 1,
    "OutbreakDetectionYear": null,
    "OutbreakDetectionMonth": null,
    "OutbreakDetectionDay": null,
    "OutbreakVerificationYear": null,
    "OutbreakVerificationMonth": null,
    "OutbreakVerificationDay": null,
    "OutbreakEnd": null,
    "OutbreakEndYear": null,
    "OutbreakEndMonth": null,
    "OutbreakEndDay": null
}]
Comparison with Other Approaches
In-Context Learning vs Fine-Tuning
This fine-tuned model significantly outperforms in-context learning (iCL) approaches:
| Approach | Rouge-1 | Rouge-2 | Rouge-L | Rouge-Lsum |
|---|---|---|---|---|
| EpiLLaMA 3.3-70B (fine-tuned) | 0.937 | 0.896 | 0.928 | 0.929 |
| LLaMA 3.3-70B (16-shot iCL) | 0.840 | 0.698 | 0.824 | 0.841 |
| Qwen 2.5-7B (16-shot iCL) | 0.819 | 0.682 | 0.801 | 0.819 |
Performance gain from fine-tuning: roughly 10 percentage points on ROUGE-1, ROUGE-L, and ROUGE-Lsum, and roughly 20 percentage points on ROUGE-2.
Comparison with Smaller Models
| Model | Parameters | Rouge-1 | Rouge-2 | Rouge-L |
|---|---|---|---|---|
| EpiLLaMA 3.3-70B | 70B | 0.937 | 0.896 | 0.928 |
| EpiQwen 2.5-7B | 7B | 0.918 | 0.864 | 0.908 |
| EpiMistral-7B | 7B | 0.899 | 0.853 | 0.889 |
All pairwise comparisons are statistically significant (p < 0.001, Nemenyi post-hoc test with Bonferroni correction).
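As an illustrative (assumed) way to run such a test over per-fold scores, the scikit-posthocs package provides a Nemenyi post-hoc test following a Friedman test; the values below are toy data, not the reported results:

import numpy as np
import scikit_posthocs as sp  # pip install scikit-posthocs

# Toy per-fold ROUGE-1 scores for three models (rows = folds, columns = models)
scores = np.array([
    [0.94, 0.92, 0.90],
    [0.93, 0.91, 0.89],
    [0.94, 0.92, 0.90],
    [0.93, 0.92, 0.90],
    [0.94, 0.92, 0.90],
])

# Pairwise p-values from the Nemenyi post-hoc test on fold-wise rankings
p_values = sp.posthoc_nemenyi_friedman(scores)
print(p_values)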
Citation
If you use this model in your research, please cite:
@article{consoli2025generative,
title={Generative AI for Structured Epidemiological Information Extraction: Comparing In-Context Learning and Fine-Tuning Approaches},
author={Consoli, Sergio and Bertolini, Lorenzo and Stefanovitch, Nicolas and Spagnolo, Luigi and Espinosa, Laura and Stilianakis, Nikolaos I.},
journal={Epidemiology and Infection},
note={Submitted, currently under revision},
year={2025},
publisher={Cambridge University Press}
}
Please also acknowledge the base model:
@misc{llama3.3,
title={The Llama 3 Herd of Models},
author={Meta AI},
year={2024},
url={https://ai.meta.com/research/publications/the-llama-3-herd-of-models/}
}
Ethical Considerations & Dual-Use Implications
Upon evaluation, we identified no dual-use implications for this model. The model is designed specifically for public health surveillance and epidemic intelligence applications to support global health initiatives.
Important Notes:
- The model should be used as a decision-support tool with appropriate human oversight
- Extracted information should be verified by public health professionals before making critical decisions
- The model does not replace human expertise in epidemiological analysis
- Privacy and data protection regulations should be followed when processing outbreak reports
- Users must review and comply with Meta's Acceptable Use Policy included in the LLaMA 3.3 Community License
Acknowledgments
We acknowledge:
- Meta Platforms, Inc. for developing and releasing LLaMA 3.3 70B under the LLaMA 3.3 Community License
- The GPT@JRC initiative for providing access to LLMs
- The JRC Big Data Analytics Platform for computational infrastructure
- The WHO Epidemic Intelligence from Open Sources (EIOS) initiative for support
- Colleagues at the European Commission Joint Research Centre (JRC) and the European Centre for Disease Prevention and Control (ECDC)
Framework Versions
- Transformers: 4.52.4
- PyTorch: 2.3.1
- PEFT: 0.12.0
- Accelerate: 1.7.0
- BitsAndBytes: 0.43.3
- Datasets: 2.20.0
Disclaimer: The views expressed are purely those of the authors and may not in any circumstance be regarded as stating an official position of the European Commission.