twscrape-prepared-regression-NeoBERT-3epochs

This model is a fine-tuned version of chandar-lab/NeoBERT on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5382
  • MSE: 0.0003
  • Target 0 MSE: 0.0008
  • Target 1 MSE: 0.0003
  • Target 2 MSE: 0.0000
  • Target 3 MSE: 0.0000

Per-target prediction and error distribution plots were logged as Weights & Biases image artifacts and are not rendered in this card.
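
For reference, a minimal inference sketch is shown below. It assumes the checkpoint exposes a standard 4-output regression head through AutoModelForSequenceClassification (one logit per target above); NeoBERT is a custom architecture, so trust_remote_code=True is required. The example input string is a placeholder, not taken from the training data.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the fine-tuned head was saved in the standard
# sequence-classification format (num_labels=4, problem_type="regression").
repo = "AlekseyKorshuk/twscrape-prepared-regression-NeoBERT-3epochs"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    repo,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # the published weights are BF16
)
model.eval()

inputs = tokenizer("example tweet text", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 4): one prediction per target

print(logits)
```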

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 256
  • total_eval_batch_size: 128
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
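
The per-device and aggregate batch sizes are consistent: 16 per device × 8 GPUs × 2 gradient-accumulation steps gives the effective train batch size of 256, and 16 × 8 gives the eval batch size of 128. Below is a minimal sketch of the corresponding TrainingArguments, assuming the standard Hugging Face Trainer was used; the training script and dataset are not published, so this only mirrors the values listed above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="twscrape-prepared-regression-NeoBERT-3epochs",
    learning_rate=2e-5,
    per_device_train_batch_size=16,  # x 8 GPUs x 2 accumulation = 256 effective
    per_device_eval_batch_size=16,   # x 8 GPUs = 128 effective
    gradient_accumulation_steps=2,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,  # assumption, inferred from the BF16 tensor type of the weights
)
```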

Training results

| Training Loss | Epoch | Step | Validation Loss | MSE | Target 0 MSE | Target 1 MSE | Target 2 MSE | Target 3 MSE |
|--------------:|------:|-----:|----------------:|----:|-------------:|-------------:|-------------:|-------------:|
| 1.3816 | 1.0 | 1282 | 1.4598 | 0.0003 | 0.0008 | 0.0003 | 0.0000 | 0.0000 |
| 1.5242 | 2.0 | 2564 | 1.4238 | 0.0003 | 0.0008 | 0.0003 | 0.0000 | 0.0000 |
| 0.7842 | 3.0 | 3846 | 1.5382 | 0.0003 | 0.0008 | 0.0003 | 0.0000 | 0.0000 |

Per-epoch prediction and error distribution plots for each target were logged as Weights & Biases image artifacts and are omitted here.

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.21.0

Model size: 0.2B params (Safetensors, BF16)