---
version: main
family: smollm2-360m
model_name: smollm2-360m-score0_only-300B-mbs16-gbs1024-16feb-lr2e-05-gbs16
license: mit
tags:
- model
- transformer
- smollm2
---
# SmolLM2 smollm2-360m-score0_only-300B-mbs16-gbs1024-16feb-lr2e-05-gbs16 (Version: main)
## Model Details
- **Architecture:** SmolLM2
- **Parameters:** 360M
## Training Configuration
```yaml
eval:
  final_validation: false
  initial_validation: false
  interval: 10000
  max_iters: 100
optimizer:
  class_path: torch.optim.AdamW
  init_args:
    betas:
    - 0.9
    - 0.95
    lr: 0.003
    weight_decay: 0.033
precision: bf16-mixed
seed: 42
train:
  global_batch_size: 1024
  log_interval: 10
  lr_warmup_steps: 2000
  max_norm: 1.0
  max_seq_length: 2048
  max_tokens: 300000000000
  micro_batch_size: 16
  min_lr: 0
  save_interval: 30000
  tie_embeddings: false
```
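The batch and token settings above imply a step count that the card does not state directly. The following is a rough sketch of that arithmetic, assuming `global_batch_size` counts sequences and every sequence is filled to `max_seq_length`; the device count passed to `grad_accum_steps` is hypothetical, since the hardware setup is not given here.

```python
# Values copied from the training configuration above.
global_batch_size = 1024          # sequences per optimizer step (assumed unit)
micro_batch_size = 16             # sequences per device per forward/backward pass
max_seq_length = 2048             # tokens per sequence
max_tokens = 300_000_000_000      # total training token budget
lr_warmup_steps = 2000

# Tokens consumed by one optimizer step, assuming fully packed sequences.
tokens_per_step = global_batch_size * max_seq_length

# Number of full optimizer steps that fit in the token budget.
optimizer_steps = max_tokens // tokens_per_step

# Tokens seen during learning-rate warmup.
warmup_tokens = lr_warmup_steps * tokens_per_step


def grad_accum_steps(num_devices: int) -> int:
    """Micro-batches accumulated per device for one optimizer step.

    num_devices is a hypothetical value; the card does not state it.
    """
    per_device_batch = global_batch_size // num_devices
    return per_device_batch // micro_batch_size


print(tokens_per_step)      # 2097152
print(optimizer_steps)      # 143051
print(warmup_tokens)        # 4194304000
print(grad_accum_steps(8))  # 8, for an assumed 8-device run
```

Under these assumptions the run takes roughly 143k optimizer steps, of which the first 2,000 (about 4.2B tokens) are warmup.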
## Model Loading and Revision System
This repository hosts multiple revisions of the model.
To load a specific revision, use the `revision` parameter. For example:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "locuslab/mix_ift_v2-smollm2-360m-smollm2-360m-score0_only-300B"
revision = "final"  # branch, tag, or commit hash to load

model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
```
Replace `"final"` with the name of the revision you want to load; a branch name, tag, or commit hash is accepted.