---
language:
- ar
base_model:
- LiquidAI/LFM2-1.2B
license: cc-by-nc-4.0
---

# Hala-1.2B-EN-AR-Translator

**Paper**: *Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale*

**Authors**: Hasan Abed Al Kader Hammoud\*, Mohammad Zbeeb\*, Bernard Ghanem

**Affiliation**: King Abdullah University of Science and Technology (KAUST)

\*Equal contribution

---

## 📖 Overview

The **Hala-1.2B-EN-AR-Translator** is a lightweight translation model fine-tuned for **English → Arabic** translation, particularly in **instruction-style and conversational contexts**. It powers the creation of the **Hala dataset** and can also be used as a standalone translator for research, dataset generation, or preprocessing tasks.

---

## 🔧 Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "hammh0a/Hala-1.2B-EN-AR-Translator"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

pipe = pipeline("text-generation", model=model, tokenizer=tok)

# Example English text
text = "Physics is the study of matter, energy, and the interactions between them."

# Wrap the text in a single user turn with the translation instruction.
messages = [
    {
        "role": "user",
        "content": "Translate everything that follows into Arabic:\n\n" + text,
    }
]

# Render the chat template to a plain string and generate greedily.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = pipe(prompt, max_new_tokens=256, do_sample=False)

# By default, generated_text contains the prompt followed by the translation.
print(out[0]["generated_text"])
```

---

## EN→AR Translation Quality on 500 Sampled MMLU Questions

| **System** | **BLEU ↑** | **ROUGE-L ↑** | **chrF++ ↑** |
|------------|------------|---------------|--------------|
| *Teacher translator* | | | |
| CohereLabs/command-a-translate-08-2025 (FP16) | 53.1 | 26.0 | 68.6 |
| **hammh0a/command-a-translate-FP8-Dynamic** | 53.5 (+0.3) | 26.0 (+0.0) | 68.9 (+0.3) |
| *Lightweight translator (LFM2-1.2B family)* | | | |
| LiquidAI/LFM2-1.2B (base) | 16.0 | 19.3 | 43.2 |
| **LFM2-1.2B Translator (ours)** | 48.2 (+32.1) | 25.1 (+5.9) | 64.2 (+21.0) |

---

## 📚 Citation

If you use **Hala-1.2B-EN-AR-Translator**, please cite:

Link: https://arxiv.org/abs/2509.14008

```bibtex
@misc{hammoud2025halatechnicalreportbuilding,
      title={Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale},
      author={Hasan Abed Al Kader Hammoud and Mohammad Zbeeb and Bernard Ghanem},
      year={2025},
      url={https://arxiv.org/abs/2509.14008},
}
```
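
When the translator is used for dataset generation, as the Overview suggests, it can help to factor the prompt construction from the usage example into small helpers. The sketch below is illustrative only: `build_messages` and `extract_translation` are hypothetical names, not part of this repository or of `transformers`. It assumes the same single-turn instruction string as the usage example, and relies on the fact that the `text-generation` pipeline echoes the prompt before the continuation by default.

```python
from typing import Dict, List

# Instruction string copied from the usage example in this README.
INSTRUCTION = "Translate everything that follows into Arabic:\n\n"


def build_messages(text: str) -> List[Dict[str, str]]:
    """Wrap one English passage in the single-turn chat format (hypothetical helper)."""
    return [{"role": "user", "content": INSTRUCTION + text}]


def extract_translation(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt from a generation result, if it is present.

    Hypothetical helper: it only strips an exact prefix match and
    surrounding whitespace, and leaves the text unchanged otherwise.
    """
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


# Model-free demonstration of the message structure for a small batch.
batch = ["Good morning.", "Where is the library?"]
for sentence in batch:
    print(build_messages(sentence))
```

With a loaded tokenizer and pipeline, each result of `build_messages` would be passed to `tok.apply_chat_template(...)` as in the usage example, and `extract_translation(out[0]["generated_text"], prompt)` would return only the Arabic text. Passing `return_full_text=False` to the pipeline call is an alternative way to suppress the echoed prompt.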