---
version: main
family: smollm2-360m
model_name: smollm2-360m-score0_only-300B-mbs16-gbs1024-16feb-lr2e-05-gbs16
license: mit
tags:
- model
- transformer
- smollm2
---

# SmolLM2 smollm2-360m-score0_only-300B-mbs16-gbs1024-16feb-lr2e-05-gbs16 (Version: main)

## Model Details

- **Architecture:** SmolLM2
- **Parameters:** 360M

## Training Configuration

```yaml
eval:
  final_validation: false
  initial_validation: false
  interval: 10000
  max_iters: 100
optimizer:
  class_path: torch.optim.AdamW
  init_args:
    betas:
    - 0.9
    - 0.95
    lr: 0.003
    weight_decay: 0.033
precision: bf16-mixed
seed: 42
train:
  global_batch_size: 1024
  log_interval: 10
  lr_warmup_steps: 2000
  max_norm: 1.0
  max_seq_length: 2048
  max_tokens: 300000000000
  micro_batch_size: 16
  min_lr: 0
  save_interval: 30000
  tie_embeddings: false
```

## Model Loading and Revision System

This repository hosts multiple revisions of the model. To load a specific revision, use the `revision` parameter. For example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("locuslab/mix_ift_v3-smollm2-360m-smollm2-360m-score0_only", revision="final")
tokenizer = AutoTokenizer.from_pretrained("locuslab/mix_ift_v3-smollm2-360m-smollm2-360m-score0_only", revision="final")
```

Replace `"final"` with the desired revision.
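
As a rough cross-check of the training configuration above, the batch and token settings can be turned into derived quantities such as tokens per optimizer step and the approximate number of optimizer steps. The sketch below is illustrative only: the helper names are hypothetical, it assumes every sequence is packed to the full `max_seq_length`, and `num_devices` is an assumed value because the card does not state the hardware setup.

```python
# Values copied from the training configuration in this card.
GLOBAL_BATCH_SIZE = 1024          # sequences per optimizer step
MICRO_BATCH_SIZE = 16             # sequences per device per forward/backward pass
MAX_SEQ_LENGTH = 2048             # tokens per sequence
MAX_TOKENS = 300_000_000_000      # total training token budget


def tokens_per_step(global_batch_size: int, max_seq_length: int) -> int:
    """Tokens consumed by one optimizer step, assuming fully packed sequences."""
    return global_batch_size * max_seq_length


def optimizer_steps(max_tokens: int, global_batch_size: int, max_seq_length: int) -> int:
    """Number of whole optimizer steps that fit in the token budget."""
    return max_tokens // tokens_per_step(global_batch_size, max_seq_length)


def grad_accum_steps(global_batch_size: int, micro_batch_size: int, num_devices: int) -> int:
    """Micro-batches each device accumulates before one optimizer step.

    num_devices is an assumption; the card does not say how many were used.
    """
    per_pass = micro_batch_size * num_devices
    if global_batch_size % per_pass != 0:
        raise ValueError("global batch size must be divisible by micro_batch_size * num_devices")
    return global_batch_size // per_pass


if __name__ == "__main__":
    print(tokens_per_step(GLOBAL_BATCH_SIZE, MAX_SEQ_LENGTH))              # 2097152
    print(optimizer_steps(MAX_TOKENS, GLOBAL_BATCH_SIZE, MAX_SEQ_LENGTH))  # 143051
    # Hypothetical 8-device setup, used purely for illustration.
    print(grad_accum_steps(GLOBAL_BATCH_SIZE, MICRO_BATCH_SIZE, num_devices=8))  # 8
```

Under these assumptions, each optimizer step covers about 2.1M tokens, so the 300B-token budget corresponds to roughly 143k optimizer steps.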