Llama 3.1 8B Lvs_Latn
A language-enhanced Llama-3.1-8B model for Latvian (lvs_Latn), adapted with sparse subnetwork fine-tuning.
Method
- Training approach: language-specific neuron identification followed by subnetwork fine-tuning (a minimal sketch of the idea follows this list)
- Parameters trained: <1% of total model parameters
- Framework: Language Subnetwork Enhancement
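The sketch below illustrates the general idea of the two-stage recipe under stated assumptions: neurons in the feed-forward (MLP) layers are scored by how often they activate on Latvian text, and only the weight rows/columns attached to the top-scoring neurons are left trainable. This is not the authors' released implementation; the activation-frequency scoring rule, the 1% budget, and the `latvian_batches` iterable are illustrative placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# --- Step 1: score FFN neurons by activation frequency on Latvian text ---
scores = {}  # layer index -> running activation-frequency estimate per neuron

def make_hook(idx):
    def hook(module, inputs, output):
        # output has shape (batch, seq_len, intermediate_size)
        scores[idx] = scores.get(idx, 0) + (output > 0).float().mean(dim=(0, 1))
    return hook

handles = [layer.mlp.act_fn.register_forward_hook(make_hook(i))
           for i, layer in enumerate(model.model.layers)]

with torch.no_grad():
    for text in latvian_batches:  # assumed iterable of Latvian strings
        model(**tokenizer(text, return_tensors="pt", truncation=True, max_length=512))

for h in handles:
    h.remove()

# --- Step 2: keep only the weights tied to the top ~1% of neurons trainable ---
for p in model.parameters():
    p.requires_grad = False

for i, layer in enumerate(model.model.layers):
    k = max(1, int(0.01 * scores[i].numel()))  # illustrative 1% budget
    mask = torch.zeros_like(scores[i], dtype=torch.bool)
    mask[torch.topk(scores[i], k).indices] = True
    for name in ("gate_proj", "up_proj", "down_proj"):
        w = getattr(layer.mlp, name).weight
        w.requires_grad = True
        # Zero the gradient everywhere except the selected neurons' rows/columns,
        # so optimizer updates touch only the chosen subnetwork.
        m = mask.unsqueeze(0) if name == "down_proj" else mask.unsqueeze(1)
        w.register_hook(lambda g, m=m: g * m)

# The masked subnetwork can now be trained with a standard causal-LM objective
# on Latvian data while every other parameter stays frozen.
```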
Performance
Enhances monolingual capabilities in Latvian while preserving the base model's multilingual performance.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Latvian-enhanced model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("DGurgurov/llama-3.1-8b-lvs_latn")
tokenizer = AutoTokenizer.from_pretrained("DGurgurov/llama-3.1-8b-lvs_latn")

# Generate a continuation for a Latvian prompt
prompt = "Your Latvian prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Citation
@misc{gurgurov2025sparsesubnetworkenhancement,
title={Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models},
author={Daniil Gurgurov and Josef van Genabith and Simon Ostermann},
year={2025},
eprint={2510.13580},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.13580}
}
@misc{gurgurov2025languagearithmeticssystematiclanguage,
title={Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation},
author={Daniil Gurgurov and Katharina Trinley and Yusser Al Ghussin and Tanja Baeumel and Josef van Genabith and Simon Ostermann},
year={2025},
eprint={2507.22608},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2507.22608},
}
Base model
meta-llama/Llama-3.1-8B