Yermalovich committed · verified
Commit d6c2253 · Parent: 61b73ee

TrainingLLM_QwenChat
README.md CHANGED
@@ -15,8 +15,6 @@ should probably proofread and complete it, then remove this comment. -->
 # results
 
 This model is a fine-tuned version of [Qwen/Qwen1.5-0.5B-Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.9931
 
 ## Model description
 
@@ -37,7 +35,7 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 4
-- eval_batch_size: 1
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 8
@@ -45,18 +43,17 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
 - num_epochs: 1
-- mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.5387        | 1.0   | 100  | 0.9931          |
+| No log        | 1.0   | 20   | 1.2083          |
 
 
 ### Framework versions
 
-- Transformers 4.54.0
-- Pytorch 2.6.0+cu124
+- Transformers 4.53.0
+- Pytorch 2.7.1
 - Datasets 4.0.0
-- Tokenizers 0.21.2
+- Tokenizers 0.21.1
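For readers who want to reproduce this run, the hyperparameters above map onto a standard `Trainer` setup roughly as follows. This is a minimal sketch, not the author's actual script: the training data is not recorded in the card (the "None dataset"), so the two-example stand-in dataset, the tokenization step, and the `output_dir` are all placeholder assumptions.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "Qwen/Qwen1.5-0.5B-Chat"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Stand-in data: the actual training set is not recorded in the card.
def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)
    out["labels"] = [ids.copy() for ids in out["input_ids"]]  # causal-LM labels
    return out

texts = ["Hello, how are you?", "Fine-tuning a small chat model."]
ds = Dataset.from_dict({"text": texts}).map(tokenize, batched=True, remove_columns=["text"])

# Mirrors the hyperparameters listed in the README hunk above.
args = TrainingArguments(
    output_dir="results",           # matches the "# results" card title
    learning_rate=5e-5,
    per_device_train_batch_size=4,  # train_batch_size: 4
    per_device_eval_batch_size=2,   # eval_batch_size: 2 (the new value)
    gradient_accumulation_steps=2,  # gives total_train_batch_size: 8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
    seed=42,
    eval_strategy="epoch",          # one eval row per epoch, as in the table
)

trainer = Trainer(model=model, args=args, train_dataset=ds, eval_dataset=ds)
trainer.train()
```

The "No log" entry in the new results table is consistent with a short run: at only 20 optimizer steps, training loss is never logged before the epoch ends under the default logging interval.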
config.json CHANGED
@@ -47,7 +47,7 @@
   "sliding_window": null,
   "tie_word_embeddings": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.54.0",
+  "transformers_version": "4.53.0",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936
generation_config.json CHANGED
@@ -8,5 +8,5 @@
   "pad_token_id": 151643,
   "repetition_penalty": 1.1,
   "top_p": 0.8,
-  "transformers_version": "4.54.0"
+  "transformers_version": "4.53.0"
 }
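Because `generate()` picks up `generation_config.json` automatically, the `repetition_penalty` of 1.1 and `top_p` of 0.8 above apply at inference without being passed explicitly (`top_p` only takes effect when sampling is enabled in the saved config). A minimal sketch, again with the checkpoint path as a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./results"  # placeholder for this repo's checkout
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
# No sampling arguments passed: generate() reads them from generation_config.json.
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```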
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:64fe6c7cdb66c2a97ae7f185335d03bbe90a41190782e5efe87fb434555e51be
+oid sha256:17013c39761a50da0d7b809e2aabacecb4b94e2918ad5ba40d096fbc57af931b
 size 1855983640
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7c6f32fb0a832e7efb2c2de5e805c8aaaf43e933c191ffc8d7cb56b176e0f11b
-size 11418364
+oid sha256:bcfe42da0a4497e8b2b172c1f9f4ec423a46dc12907f4349c55025f670422ba9
+size 11418266
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c557ea3dc1b3966960f9c42696ad0ef976b9a6a4aa4aee26c6e09820beb0b941
-size 5240
+oid sha256:143216466ed2a323370837a0bd467cb9c9b7c526ef4cbd7580341e02ee9dab12
+size 5649
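The last three files are Git LFS pointers: each stores only the sha256 digest (`oid`) and byte size of the real artifact, which is why a ~1.9 GB weights file appears here as a three-line diff. A short sketch for checking a downloaded file against the new pointer values in this commit:

```python
import hashlib

def lfs_oid(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the sha256 digest that Git LFS records as the pointer's oid."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Expected value taken from the model.safetensors pointer above.
expected = "17013c39761a50da0d7b809e2aabacecb4b94e2918ad5ba40d096fbc57af931b"
assert lfs_oid("model.safetensors") == expected, "file does not match its LFS pointer"
```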