EpiLLaMA-3.3-70B: Fine-tuned LLaMA for Epidemiological Information Extraction

Model Description

EpiLLaMA-3.3-70B is a fine-tuned version of meta-llama/Llama-3.3-70B-Instruct specialized for extracting structured epidemiological information from unstructured disease outbreak reports. The model was trained on the WHO Disease Outbreak News (DONs) curated database (Carlson et al., 2023) to automatically extract key epidemiological features including disease classification, geographical locations, case counts, temporal information, and outbreak characteristics.

Model Details

  • Base Model: meta-llama/Llama-3.3-70B-Instruct
  • Base Model License: LLaMA 3.3 Community License Agreement
  • Model Type: Causal Language Model (Decoder-only Transformer)
  • Fine-tuning Method: Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation)
  • Adapter Weights License: CC0-1.0 (Public Domain Dedication) - Note: Only the LoRA adapter weights are released under CC0. The base model weights remain under the LLaMA 3.3 Community License.
  • Training Data: WHO Disease Outbreak News curated database (3,338 records through 2019)
  • Language: English
  • Application Domain: Public health surveillance, epidemic intelligence, epidemiological information extraction

License

Important Licensing Information

This repository contains LoRA adapter weights only, not the full model weights.

  • Base Model (LLaMA 3.3 70B): Licensed under the LLaMA 3.3 Community License Agreement

    • Copyright © Meta Platforms, Inc. All Rights Reserved.
    • Users must comply with the LLaMA 3.3 Community License to use the base model
    • Acceptable Use Policy and other restrictions apply
  • LoRA Adapter Weights: Released under CC0 1.0 Universal (Public Domain Dedication)

    • The adapter weights can be used without restriction
    • However, to use these adapters, you must have access to and comply with the license of the base LLaMA 3.3 70B model

Attribution Required: When using this model, please include the following notice:

LLaMA 3.3 is licensed under the LLaMA 3.3 Community License,
Copyright © Meta Platforms, Inc. All Rights Reserved.

EpiLLaMA-3.3-70B LoRA adapter weights are released under CC0 1.0 Universal (Public Domain).

Distribution Notes

  • This repository distributes only the fine-tuned LoRA adapter parameters
  • Base model weights are unchanged and must be obtained separately from Meta/Hugging Face
  • Users must agree to Meta's LLaMA 3.3 Community License to use the complete model
  • The LoRA adapters are applied on top of the base model weights at inference time
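As a minimal sketch of what applying the adapters on top of the base weights looks like in practice (using peft; the adapter repository id matches the one used in the Usage section below, and the optional merge_and_unload step folds the adapters into the base weights for adapter-free inference):

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the unchanged base model (gated behind Meta's LLaMA 3.3 Community License)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the CC0-licensed LoRA adapters on top of the base weights
model = PeftModel.from_pretrained(base, "jrc-ai/EpiLLaMA-3.3-70B")

# Optional: merge the adapters into the base weights for adapter-free inference
model = model.merge_and_unload()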

Performance

The model achieved the following results on the evaluation set:

Metric       Score
Rouge-1      0.937 ± 0.046
Rouge-2      0.896 ± 0.058
Rouge-L      0.928 ± 0.047
Rouge-Lsum   0.929 ± 0.049

These scores represent overall performance across 5-fold stratified cross-validation, demonstrating very high accuracy in extracting structured epidemiological information from unstructured outbreak reports.
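As an illustration of how such ROUGE scores can be computed, here is a minimal sketch using the Hugging Face evaluate library (an assumption; the paper's exact evaluation harness is not described in this card):

import evaluate

# Compare generated JSON strings against the curated reference records
rouge = evaluate.load("rouge")

predictions = ['[{"DiseaseLevel1": "Cholera", "Country": "Somalia", "CasesTotal": 140}]']
references = ['[{"DiseaseLevel1": "Cholera", "Country": "Somalia", "CasesTotal": 140}]']

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum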

Training Summary

  • Best Training Step: 12,810
  • Best Training Loss: 0.0095
  • Total Training Steps: 13,325
  • Final Training Loss: 0.0105
  • Total Improvement: 1.6752 (from initial loss of 1.6847)

Intended Uses & Limitations

Intended Uses

This model is designed for:

  • Automated extraction of epidemiological information from disease outbreak reports
  • Public health surveillance systems requiring structured data from unstructured sources
  • Epidemic intelligence pipelines for rapid outbreak detection and monitoring
  • Research purposes in computational epidemiology and public health informatics

Limitations

  • The model is trained specifically on WHO DONs format and may require adaptation for other report formats
  • Performance on diseases not well-represented in the training data may vary
  • The model extracts information present in the text and does not generate or infer missing data
  • Designed for English-language outbreak reports only
  • Should be used as a decision-support tool, with human verification for critical public health decisions

Extracted Features

The model extracts the following structured epidemiological information:

Disease Information:

  • DiseaseLevel1 (primary disease classification)
  • DiseaseLevel2 (disease subtype/variant)

Geographical Information:

  • Country
  • ISO country code
  • OutbreakEpicenter (specific location within country)

Case Counts:

  • CasesTotal
  • CasesSuspected
  • CasesProbable
  • CasesConfirmed
  • Deaths

Temporal Information:

  • Outbreak start date (year, month, day)
  • Outbreak detection date (year, month, day)
  • Outbreak verification date (year, month, day)
  • Outbreak end date and status
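For downstream validation, the field names above (as they appear in the Expected Output Format section) can be checked against each extracted record; a small illustrative sketch:

import json

# Field names as they appear in the model's JSON output
EXPECTED_KEYS = {
    "DiseaseLevel1", "DiseaseLevel2",
    "Country", "ISO", "OutbreakEpicenter",
    "CasesTotal", "CasesSuspected", "CasesProbable", "CasesConfirmed", "Deaths",
    "OutbreakStartYear", "OutbreakStartMonth", "OutbreakStartDay",
    "OutbreakDetectionYear", "OutbreakDetectionMonth", "OutbreakDetectionDay",
    "OutbreakVerificationYear", "OutbreakVerificationMonth", "OutbreakVerificationDay",
    "OutbreakEnd", "OutbreakEndYear", "OutbreakEndMonth", "OutbreakEndDay",
}

def validate_outbreaks(json_text: str) -> list[dict]:
    """Parse the model response and flag any missing fields per outbreak record."""
    outbreaks = json.loads(json_text)
    for record in outbreaks:
        missing = EXPECTED_KEYS - record.keys()
        if missing:
            print(f"Missing fields: {sorted(missing)}")
    return outbreaks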

Training Procedure

Training Data

The model was trained on the WHO Disease Outbreak News curated database (Carlson et al., 2023), which contains:

  • 3,338 structured records of disease outbreaks (data through 2019)
  • Curated epidemiological information manually extracted from WHO DONs reports
  • Standardized format for disease classifications, geographical locations, case counts, and temporal data

Training Approach

The training followed an instruction-tuning paradigm where unstructured outbreak report text is paired with structured JSON output containing extracted epidemiological features. The prompt format used was:

Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request.

### Instruction:
Extract disease outbreak information from the given text and format it as JSON.
Return a list containing one JSON object per outbreak mentioned.
Use "None" for missing information. Never invent or guess data.

### Input:
[Outbreak report text]

### Response:
[Extracted JSON with epidemiological features]
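A minimal sketch of how a training example might be assembled in this format (the helper function and the JSON serialization details are illustrative, not the authors' exact preprocessing code):

import json

PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.

### Instruction:
Extract disease outbreak information from the given text and format it as JSON.
Return a list containing one JSON object per outbreak mentioned.
Use "None" for missing information. Never invent or guess data.

### Input:
{report_text}

### Response:
{target_json}"""

def build_training_example(report_text: str, curated_record: dict) -> str:
    # Pair the unstructured DON text with its curated JSON record as the target
    return PROMPT_TEMPLATE.format(
        report_text=report_text.strip(),
        target_json=json.dumps([curated_record], ensure_ascii=False),
    )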

Fine-tuning Configuration

LoRA (Low-Rank Adaptation) Parameters:

  • Rank (r): 16
  • Alpha (α): 16
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Dropout: 0.05
  • Task type: CAUSAL_LM
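Expressed as a peft configuration, these parameters correspond to roughly the following (a sketch; only the values documented above are filled in):

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                      # rank
    lora_alpha=16,             # alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)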

Training Hyperparameters:

  • Learning rate: 1e-5
  • Optimizer: AdamW (8-bit paged)
  • Training batch size: 4 per device (8 GPUs)
  • Gradient accumulation steps: 4
  • Number of epochs: 5
  • Warmup steps: Adaptive (10% of training steps, max 10)
  • FP16 mixed precision training
  • Weight decay: 0.01
  • LR scheduler: Linear
  • Seed: 41
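A rough transformers TrainingArguments equivalent of the values above (a sketch; warmup and save steps were computed adaptively during training, so fixed placeholders are shown here):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="epillama-3.3-70b-lora",
    learning_rate=1e-5,
    optim="paged_adamw_8bit",          # AdamW, 8-bit paged
    per_device_train_batch_size=4,     # on 8 GPUs
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    fp16=True,                         # mixed precision training
    warmup_steps=10,                   # adaptive in practice (10% of steps, max 10)
    logging_steps=10,
    seed=41,
)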

Evaluation Strategy:

  • 5-fold stratified cross-validation
  • Evaluation metric: Training loss (model selection based on lowest training loss)
  • Early stopping: After 6 consecutive evaluations without improvement
  • Logging steps: 10
  • Save steps: Adaptive (10% of training steps)
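The 5-fold stratified split can be illustrated with scikit-learn (a sketch with toy data; the stratification variable is assumed to be the primary disease label, which is not stated explicitly in this card):

from sklearn.model_selection import StratifiedKFold

# Toy stand-ins: curated records and their stratification labels (e.g. DiseaseLevel1)
records = [{"id": i} for i in range(10)]
labels = ["Cholera", "Yellow fever"] * 5

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=41)
for fold, (train_idx, eval_idx) in enumerate(skf.split(records, labels)):
    print(f"Fold {fold}: {len(train_idx)} train / {len(eval_idx)} eval records")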

Hardware:

  • Infrastructure: JRC Big Data Analytics Platform
  • System: Linux cluster, Ubuntu 22.04.5 LTS
  • CPU: Intel Xeon Platinum 8470 (208 CPUs)
  • RAM: 1TB
  • GPUs: 8x NVIDIA H100
  • Training time: ~30 hours per fold

Quantization

The model uses 8-bit quantization with LoRA during training:

  • Load in 8-bit: True
  • Quantization type: Standard 8-bit
  • Compute dtype: bfloat16
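In transformers, this loading configuration corresponds roughly to the following sketch (the exact arguments used in training are not reproduced in this card):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Standard 8-bit quantization; non-quantized modules kept in bfloat16
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)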

Usage

Installation

pip install transformers==4.52.4
pip install torch==2.3.1
pip install peft==0.12.0
pip install accelerate==1.7.0
pip install bitsandbytes==0.43.3

Basic Usage

Important: You must have access to the base LLaMA 3.3 70B model and accept Meta's license terms before using these adapter weights.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model (requires LLaMA 3.3 license acceptance)
base_model_id = "meta-llama/Llama-3.3-70B-Instruct"
adapter_model_id = "jrc-ai/EpiLLaMA-3.3-70B"  # LoRA adapters
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer from base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load and apply LoRA adapters
model = PeftModel.from_pretrained(base_model, adapter_model_id)

# Example outbreak report
outbreak_text = """
WHO has reported 3 suspected cases of yellow fever in Maryland county, 
in the south-eastern part of the country. One case with disease onset on 
1 August has been confirmed (IgM positive) by the Institut Pasteur in 
Abidjan, Côte d'Ivoire. All three cases have died.
"""

# Format prompt
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Extract disease outbreak information from the given text and format it as JSON.
Return a list containing one JSON object per outbreak mentioned.
Always return a list of JSON objects, even for single outbreaks.
Use "None" for missing information. If no outbreak information is found, return an empty list [].
Never invent or guess data.

### Input:
{outbreak_text}

### Response:
"""

# Tokenize and generate
inputs = tokenizer(prompt, return_tensors="pt", truncation=True).to(device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=False,  # greedy decoding; extraction should be deterministic
        pad_token_id=tokenizer.eos_token_id
    )

# Decode output
extracted_info = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(extracted_info)

Expected Output Format

[{
  "DiseaseLevel1": "Yellow fever",
  "DiseaseLevel2": "",
  "Country": "Liberia",
  "ISO": "LBR",
  "OutbreakEpicenter": "Maryland county",
  "CasesTotal": 3,
  "CasesSuspected": 2,
  "CasesProbable": null,
  "CasesConfirmed": 1,
  "Deaths": 3,
  "OutbreakStartYear": 2001,
  "OutbreakStartMonth": 8,
  "OutbreakStartDay": 1,
  "OutbreakDetectionYear": null,
  "OutbreakDetectionMonth": null,
  "OutbreakDetectionDay": null,
  "OutbreakVerificationYear": null,
  "OutbreakVerificationMonth": null,
  "OutbreakVerificationDay": null,
  "OutbreakEnd": null,
  "OutbreakEndYear": null,
  "OutbreakEndMonth": null,
  "OutbreakEndDay": null
}]
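Because the decoded text contains the prompt followed by the generated JSON, a small post-processing step is usually needed; a minimal sketch assuming the "### Response:" delimiter from the prompt above:

import json

def parse_response(decoded_text: str) -> list[dict]:
    """Extract and parse the JSON list that follows the '### Response:' marker."""
    response = decoded_text.split("### Response:")[-1].strip()
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        # Fall back to an empty list if the generation is not valid JSON
        return []

# Example: `extracted_info` as returned by the Basic Usage snippet above
extracted_info = '... ### Response:\n[{"DiseaseLevel1": "Yellow fever", "Country": "Liberia", "CasesTotal": 3}]'
for outbreak in parse_response(extracted_info):
    print(outbreak["DiseaseLevel1"], outbreak["Country"], outbreak["CasesTotal"])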

Comparison with Other Approaches

In-Context Learning vs Fine-Tuning

This fine-tuned model significantly outperforms in-context learning (iCL) approaches:

Approach                         Rouge-1   Rouge-2   Rouge-L   Rouge-Lsum
EpiLLaMA 3.3-70B (fine-tuned)    0.937     0.896     0.928     0.929
LLaMA 3.3-70B (16-shot iCL)      0.840     0.698     0.824     0.841
Qwen 2.5-7B (16-shot iCL)        0.819     0.682     0.801     0.819

Performance gain from fine-tuning: roughly 9-10 percentage points on Rouge-1, Rouge-L, and Rouge-Lsum, and nearly 20 points on Rouge-2.

Comparison with Smaller Models

Model              Parameters   Rouge-1   Rouge-2   Rouge-L
EpiLLaMA 3.3-70B   70B          0.937     0.896     0.928
EpiQwen 2.5-7B     7B           0.918     0.864     0.908
EpiMistral-7B      7B           0.899     0.853     0.889

All pairwise comparisons are statistically significant (p < 0.001, Nemenyi post-hoc test with Bonferroni correction).

Citation

If you use this model in your research, please cite:

@article{consoli2025generative,
  title={Generative AI for Structured Epidemiological Information Extraction: Comparing In-Context Learning and Fine-Tuning Approaches},
  author={Consoli, Sergio and Bertolini, Lorenzo and Stefanovitch, Nicolas and Spagnolo, Luigi and Espinosa, Laura and Stilianakis, Nikolaos I.},
  journal={Epidemiology and Infection},
  volume={submitted, currently under revision},
  year={2025},
  publisher={Cambridge University Press}
}

Please also acknowledge the base model:

@article{llama3.3,
  title={The Llama 3 Herd of Models},
  author={Meta AI},
  year={2024},
  url={https://ai.meta.com/research/publications/the-llama-3-herd-of-models/}
}

Ethical Considerations & Dual-Use Implications

Upon evaluation, we identified no dual-use implications for this model. The model is designed specifically for public health surveillance and epidemic intelligence applications to support global health initiatives.

Important Notes:

  • The model should be used as a decision-support tool with appropriate human oversight
  • Extracted information should be verified by public health professionals before making critical decisions
  • The model does not replace human expertise in epidemiological analysis
  • Privacy and data protection regulations should be followed when processing outbreak reports
  • Users must review and comply with Meta's Acceptable Use Policy included in the LLaMA 3.3 Community License

Acknowledgments

We acknowledge:

  • Meta Platforms, Inc. for developing and releasing LLaMA 3.3 70B under the LLaMA 3.3 Community License
  • The GPT@JRC initiative for providing access to LLMs
  • The JRC Big Data Analytics Platform for computational infrastructure
  • The WHO Epidemic Intelligence from Open Sources (EIOS) initiative for support
  • Colleagues at the European Commission Joint Research Centre (JRC) and the European Centre for Disease Prevention and Control (ECDC)

Framework Versions

  • Transformers: 4.52.4
  • PyTorch: 2.3.1
  • PEFT: 0.12.0
  • Accelerate: 1.7.0
  • BitsAndBytes: 0.43.3
  • Datasets: 2.20.0

Disclaimer: The views expressed are purely those of the authors and may not in any circumstance be regarded as stating an official position of the European Commission.
