Llama 3.1 8B Lvs_Latn
A version of Llama 3.1 8B enhanced for Latvian (lvs_Latn) through sparse subnetwork fine-tuning.
Method
- Training approach: Language-specific neuron identification + subnetwork fine-tuning
- Parameters trained: <1% of total model parameters
- Framework: Language Subnetwork Enhancement
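The idea behind subnetwork fine-tuning can be illustrated with a toy update rule: gradients are applied only to the rows of a weight matrix that belong to selected neurons, and everything else stays frozen. The sketch below is a minimal pure-Python illustration under that assumption, not the authors' implementation. The function names, the toy 4x2 matrix, and the choice of neuron index are all hypothetical, and the neuron identification step itself is not shown.

```python
def masked_sgd_step(weights, grads, trainable_rows, lr=0.1):
    """Apply one SGD step, but only to the rows (neurons) in trainable_rows.

    Illustrative only: rows outside trainable_rows are copied unchanged,
    which mimics freezing every parameter outside the chosen subnetwork.
    """
    trainable = set(trainable_rows)
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in trainable:
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            updated.append(list(w_row))
    return updated


def trainable_fraction(weights, trainable_rows):
    """Share of all parameters that sit in the trainable rows."""
    total = sum(len(row) for row in weights)
    trained = sum(len(weights[i]) for i in set(trainable_rows))
    return trained / total


# Toy layer with 4 neurons of 2 weights each (hypothetical values).
weights = [[1.0, 1.0] for _ in range(4)]
grads = [[1.0, 1.0] for _ in range(4)]
language_neurons = [2]  # hypothetical output of a neuron identification step

new_weights = masked_sgd_step(weights, grads, language_neurons)
fraction = trainable_fraction(weights, language_neurons)
print(new_weights)
print(fraction)
```

In this toy case a quarter of the parameters are trainable. In the model described here, the selected subnetwork is stated to cover less than 1% of all parameters.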
Performance
The model improves monolingual performance in Latvian while preserving the multilingual performance of the base model.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_id = "DGurgurov/llama-3.1-8b-lvs_latn"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Tokenize a Latvian prompt and generate a continuation
prompt = "Your Latvian prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)  # limit on newly generated tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Citation
@misc{gurgurov2025sparsesubnetworkenhancement,
    title={Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models}, 
    author={Daniil Gurgurov and Josef van Genabith and Simon Ostermann},
    year={2025},
    eprint={2510.13580},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2510.13580}
}
@misc{gurgurov2025languagearithmeticssystematiclanguage,
    title={Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation},
    author={Daniil Gurgurov and Katharina Trinley and Yusser Al Ghussin and Tanja Baeumel and Josef van Genabith and Simon Ostermann},
    year={2025},
    eprint={2507.22608},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2507.22608}
}
Base model
meta-llama/Llama-3.1-8B