Llama 3.1 8B Lvs_Latn

A language-enhanced LLaMA-3.1-8B model for Latvian (lvs_Latn), trained with sparse subnetwork fine-tuning.

Method

  • Training approach: language-specific neuron identification followed by subnetwork fine-tuning (see the sketch after this list)
  • Parameters trained: <1% of total model parameters
  • Framework: Language Subnetwork Enhancement
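
A minimal sketch of how this kind of sparse subnetwork fine-tuning can be implemented in PyTorch. The interface is assumed for illustration: `neuron_masks` and the helper name are hypothetical, and the actual neuron-identification procedure is described in the papers cited below. All parameters outside the identified subnetwork are frozen, and gradient hooks zero out updates everywhere except the masked neurons:

def restrict_training_to_subnetwork(model, neuron_masks):
    """Limit gradient updates to the identified language-specific neurons.

    neuron_masks: dict mapping parameter names to boolean tensors of the same
    shape, marking the neurons selected for Latvian (hypothetical input; the
    selection procedure itself comes from the cited papers).
    """
    for name, param in model.named_parameters():
        mask = neuron_masks.get(name)
        if mask is None:
            # Everything outside the subnetwork stays frozen.
            param.requires_grad_(False)
        else:
            # Zero the gradient on all entries except the masked neurons, so
            # the optimizer updates well under 1% of the model's weights.
            param.register_hook(
                lambda grad, m=mask: grad * m.to(device=grad.device, dtype=grad.dtype)
            )

After calling this once on the model, an ordinary fine-tuning loop with any optimizer only touches the sparse subnetwork.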

Performance

Fine-tuning the Latvian-specific subnetwork enhances monolingual capabilities in Latvian while preserving the base model's multilingual performance.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in the checkpoint's native precision and place it on GPU if available.
model = AutoModelForCausalLM.from_pretrained(
    "DGurgurov/llama-3.1-8b-lvs_latn", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("DGurgurov/llama-3.1-8b-lvs_latn")

prompt = "Your Latvian prompt here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
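
Alternatively, the same checkpoint works with the transformers text-generation pipeline. The Latvian prompt and sampling parameters below are illustrative only, not values recommended by the model authors:

from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="DGurgurov/llama-3.1-8b-lvs_latn",
    torch_dtype="auto",
    device_map="auto",
)

# Illustrative prompt ("Riga is the capital of Latvia and...") and sampling
# settings; tune these for your own use case.
result = generator(
    "Rīga ir Latvijas galvaspilsēta un",
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(result[0]["generated_text"])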

Citation

@misc{gurgurov2025sparsesubnetworkenhancement,
    title={Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models},
    author={Daniil Gurgurov and Josef van Genabith and Simon Ostermann},
    year={2025},
    eprint={2510.13580},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2510.13580}
}

@misc{gurgurov2025languagearithmeticssystematiclanguage,
    title={Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation},
    author={Daniil Gurgurov and Katharina Trinley and Yusser Al Ghussin and Tanja Baeumel and Josef van Genabith and Simon Ostermann},
    year={2025},
    eprint={2507.22608},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2507.22608}
}