Modified SmolLM2 with Bangla Tokenizer Support

This is a modified version of SmolLM2-135M whose tokenizer has been extended with Bangla (Bengali) tokens merged from the TituLM tokenizer.

Model Details

  • Base Model: HuggingFaceTB/SmolLM2-135M
  • Tokenizer Enhancement: Merged with TituLM Bangla tokenizer
  • Original Vocabulary Size: 49,152
  • Enhanced Vocabulary Size: 180,177
  • Added Tokens: 131,025 Bangla-specific tokens (180,177 − 49,152)
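
The vocabulary figures above can be checked with a few lines of arithmetic. The sketch below also estimates how many embedding parameters the extra tokens would add. The hidden size of 576 is an assumption about SmolLM2-135M that is not stated in this card, so confirm it against the model's config.json before relying on the estimate.

```python
# Vocabulary figures taken from the Model Details list.
ORIGINAL_VOCAB = 49_152
ENHANCED_VOCAB = 180_177

# ASSUMPTION: embedding width of SmolLM2-135M; verify against config.json.
HIDDEN_SIZE = 576


def added_tokens(original: int, enhanced: int) -> int:
    """Number of token IDs added by the tokenizer merge."""
    return enhanced - original


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Parameters in one (vocab_size x hidden_size) embedding matrix."""
    return vocab_size * hidden_size


new_tokens = added_tokens(ORIGINAL_VOCAB, ENHANCED_VOCAB)
extra_params = embedding_params(new_tokens, HIDDEN_SIZE)

print(f"added tokens: {new_tokens:,}")
print(f"extra embedding parameters: {extra_params:,}")
```

If the input and output embeddings share one matrix (also an assumption here), the extra rows are counted once, which would put the resized model at roughly 0.2B parameters in total.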

Key Features

  • ✅ Full SmolLM2-135M model architecture
  • ✅ Enhanced Bangla tokenization support
  • ✅ Backward compatible with original SmolLM2
  • ✅ Improved performance on Bangla text

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the modified model
model = AutoModelForCausalLM.from_pretrained("rnnandi/modified_smollm")
tokenizer = AutoTokenizer.from_pretrained("rnnandi/modified_smollm")

# Test with Bangla text
text = "আমি বাংলায় গান গাই"
inputs = tokenizer(text, return_tensors="pt")
# max_new_tokens caps only the generated continuation (max_length would also count the prompt)
outputs = model.generate(**inputs, max_new_tokens=50)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
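
One way to check whether the merged tokenizer actually shortens Bangla sequences is to compare the average number of tokens per word across tokenizers. The helper below is a minimal sketch. `tokens_per_word` and the two stand-in encoders are hypothetical names introduced for illustration, and the stand-ins are used only so the example runs without downloading a model.

```python
from typing import Callable, Sequence


def tokens_per_word(encode: Callable[[str], Sequence], texts: Sequence[str]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    total_tokens = sum(len(encode(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    if total_words == 0:
        raise ValueError("texts contain no words")
    return total_tokens / total_words


# Stand-in encoders (hypothetical, not real tokenizers):
# one emits a token per non-space character, the other a token per word.
def char_level(text: str) -> list:
    return [ch for ch in text if not ch.isspace()]


def word_level(text: str) -> list:
    return text.split()


samples = ["আমি বাংলায় গান গাই"]
print("char-level:", tokens_per_word(char_level, samples))
print("word-level:", tokens_per_word(word_level, samples))
```

With real tokenizers, the `encode` argument could be something like `lambda s: tokenizer.encode(s, add_special_tokens=False)`, called once with the base SmolLM2 tokenizer and once with the merged one. A lower value for the merged tokenizer would indicate less fragmentation of Bangla text.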

Training

This model was created by:

  1. Merging the TituLM Bangla tokenizer with the SmolLM2 tokenizer
  2. Resizing the model embeddings to accommodate the new vocabulary
  3. Preserving the original model weights and architecture
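
The steps above can be illustrated with a toy example in plain Python. This is a sketch of the general idea, not the script used to build this model, which the card does not include. The function names, the toy vocabulary, and the random initialisation of new rows are all assumptions made for illustration. In the transformers library, the corresponding operations would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


def merge_vocab(base: dict, extra_tokens: list) -> dict:
    """Append tokens missing from `base`, keeping every existing ID unchanged."""
    merged = dict(base)
    next_id = max(merged.values(), default=-1) + 1
    for token in extra_tokens:
        if token not in merged:
            merged[token] = next_id
            next_id += 1
    return merged


def resize_embeddings(matrix: list, new_size: int, seed: int = 0) -> list:
    """Grow an embedding matrix to `new_size` rows.

    Existing rows are copied unchanged; new rows get small random values
    (one possible initialisation, chosen here only for illustration).
    """
    if not matrix or new_size < len(matrix):
        raise ValueError("this sketch only grows a non-empty matrix")
    rng = random.Random(seed)
    width = len(matrix[0])
    resized = [row[:] for row in matrix]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(width)])
    return resized


# Toy data: a three-token base vocabulary with 2-dimensional embeddings.
base_vocab = {"<s>": 0, "the": 1, "song": 2}
base_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# "song" already exists, so only the two Bangla tokens receive new IDs.
merged_vocab = merge_vocab(base_vocab, ["song", "গান", "বাংলা"])
new_embeddings = resize_embeddings(base_embeddings, len(merged_vocab))

print(len(merged_vocab), len(new_embeddings))
```

Because existing IDs and rows are left untouched, text made only of original tokens maps to the same embeddings as before. The rows for the new tokens carry no learned information until they are trained.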

Citation

If you use this model, please cite both the original SmolLM2 and TituLM:

@misc{smollm2,
  title={SmolLM2: A Family of Small Language Models},
  author={HuggingFace Team},
  year={2024},
  url={https://huggingface.co/HuggingFaceTB/SmolLM2-135M}
}

@misc{titulm,
  title={TituLM: A Bangla Language Model},
  author={Hishab Team},
  year={2024},
  url={https://huggingface.co/hishab/titulm-llama-3.2-1b-v2.0}
}

License

This model is released under the Apache 2.0 License, the same license as the base SmolLM2 model.

Model size: 0.2B params (Safetensors), tensor type: BF16