

Model Card
This model is fine-tuned from Llama-3.1-8B-Instruct using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation) on the intellectual subset of the EuroHPC-Legal dataset. Training ran on the EuroHPC Karolina system, with the Axolotl framework providing orchestration and the Unsloth library supplying optimized kernels for high-throughput training.
Hyperparameters
- LoRA Rank: 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Learning Rate: 3×10⁻⁵ with cosine scheduling
- Training Epochs: 3 per domain
- Batch Size: Optimized for A100 memory capacity
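To make the hyperparameters above more concrete, the following is a minimal sketch of two quantities they imply: the LoRA scaling factor (alpha divided by rank) and a cosine learning-rate curve starting at 3×10⁻⁵. The function names, the absence of warmup, the decay floor of zero, and the step count are assumptions for illustration; the actual Axolotl configuration is not given in this card.

```python
import math

# Values taken from the hyperparameter list in this card.
LORA_RANK = 16
LORA_ALPHA = 32
PEAK_LR = 3e-5


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps.

    Assumption: no warmup phase and a floor of min_lr; the real schedule
    used for training may differ.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_RANK))
    # total_steps=1000 is a hypothetical value chosen only for the demo.
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_lr(step, total_steps=1000))
```

With rank 16 and alpha 32 the adapter update is scaled by 2, and the learning rate falls to half its peak at the midpoint of training under this schedule.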
Architecture
- Base Model: Llama-3.1-8B-Instruct (Meta)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Parameter Efficiency: Only the LoRA adapter parameters are trained; the base model stays frozen
- Model Size: 8B parameters (base) + LoRA adapters
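As a rough illustration of how small the adapters are next to the 8B base, the sketch below counts LoRA parameters at rank 16. The layer dimensions are those of Llama-3.1-8B (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). The set of target modules is an assumption: this card does not say which projections were adapted, so only the four attention projections are counted here.

```python
# Llama-3.1-8B dimensions.
HIDDEN = 4096
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128 (grouped-query attention)
RANK = 16         # LoRA rank from this card

# Hypothetical target modules as (in_features, out_features).
# Assumption: attention projections only; the real run may target more.
TARGET_MODULES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A is (rank x in_features) and B is (out_features x rank)."""
    return rank * (in_features + out_features)


def total_lora_params(modules: dict, rank: int, num_layers: int) -> int:
    """Sum adapter parameters over all target modules in every layer."""
    per_layer = sum(lora_params(i, o, rank) for i, o in modules.values())
    return per_layer * num_layers


if __name__ == "__main__":
    total = total_lora_params(TARGET_MODULES, RANK, NUM_LAYERS)
    print(f"trainable LoRA parameters: {total:,}")
    print(f"share of an 8B base model: {total / 8e9:.4%}")
```

Under these assumptions the adapters hold about 13.6 million parameters, well under 1% of the base model. Adding the MLP projections to the target set would raise the count but keep it in the same order of magnitude.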
Hardware and Software
- Orchestration: Axolotl framework
- Acceleration: Unsloth library
- Backend: PyTorch with CUDA support
- System: EuroHPC Karolina supercomputer
- GPUs: NVIDIA A100 (8 × 40 GB per node, 320 GB HBM2 total)
- Utilization: 85–90% GPU and memory efficiency
- Total Compute: ~600 GPU hours
Data
Input Format
Each record follows the chat template format: a "messages" list containing a system turn, a user turn, and an assistant turn:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant specialized in Turkish legal matters."
    },
    {
      "role": "user",
      "content": "Question or task description"
    },
    {
      "role": "assistant",
      "content": "Response or answer"
    }
  ]
}
Dataset: newmindai/EuroHPC-Legal (intellectual subset)
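When preparing additional records in the format shown above, a small structural check can catch malformed entries before training. The following is a minimal sketch using only the standard library. The function name validate_record and the specific rules (allowed roles, non-empty content, system turn first) are assumptions derived from the example record, not a published schema for the dataset.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_record(record: dict) -> list:
    """Return a list of problems found in one chat-format record.

    An empty list means the record matches the structure shown above.
    """
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    errors = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            errors.append(f"message {i}: not a JSON object")
            continue
        role = msg.get("role")
        content = msg.get("content")
        if role not in ALLOWED_ROLES:
            errors.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(content, str) or not content.strip():
            errors.append(f"message {i}: 'content' must be a non-empty string")
        # Assumption: a system prompt, when present, opens the conversation.
        if role == "system" and i != 0:
            errors.append(f"message {i}: system turn must come first")
    return errors


if __name__ == "__main__":
    sample = json.loads("""
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant specialized in Turkish legal matters."},
        {"role": "user", "content": "Question or task description"},
        {"role": "assistant", "content": "Response or answer"}
    ]}
    """)
    print(validate_record(sample))
```

Returning a list of messages instead of raising on the first problem makes it easier to report every issue in a record at once.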
How to Use
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
# Load the LoRA adapter on top of the frozen base model
model = PeftModel.from_pretrained(base_model, "newmindai/Llama-3.1-8B-Instruct-intellectual-chat-template")
# Build the conversation in the same chat format used for training
messages = [
    {"role": "system", "content": "You are a helpful assistant specialized in Turkish legal matters."},
    {"role": "user", "content": "Explain the benefits of regular exercise"}
]
# Apply the chat template that ships with the Llama-3.1 Instruct tokenizer
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Tokenize without adding special tokens again: the template already inserts the BOS token
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
Acknowledgments
This research was supported by the EuroHPC Joint Undertaking (EuroHPC JU) under the Benchmark Access grant agreement No EHPC-BEN-2024B11-003. The authors gratefully acknowledge the computational resources provided by the IT4Innovations National Supercomputing Center (Czech Republic) on the Karolina supercomputer, made available through the EuroHPC JU.
Citation
@article{newmind2025,
  title={Tailoring AI for Turkish Law: Domain-Specific Fine-Tuning of Small Language Models for Legal Expertise},
  author={New Mind AI Team},
  journal={Procedia Computer Science},
  year={2025},
  volume={239},
  doi={10.1016/j.procs.2025.08.239},
  note={Available online 23 September 2025, Version of Record 23 September 2025}
}