Taiwan Legal LLMs

This model belongs to the Taiwan Legal LLMs collection: models from our paper "Continual Pre-Training is (not) What You Need in Domain Adaptation."
This model is a fine-tuned version of lopentu/Llama-3-8B-Taiwan-Llawa-TCxYZL-Instruct on an unknown dataset. It achieves the evaluation-set results shown in the final row of the training results table below.

Model description: More information needed

Intended uses & limitations: More information needed

Training and evaluation data: More information needed
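For reference, a minimal inference sketch using the standard transformers chat API. Only the base model id above comes from this card; the fine-tuned model's repo id below is a placeholder you would replace with the actual id.

```python
# Minimal inference sketch (assumed usage; not from the model card).
# "your-org/llama-3-8b-taiwan-legal" is a placeholder repo id; the base model
# is lopentu/Llama-3-8B-Taiwan-Llawa-TCxYZL-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/llama-3-8b-taiwan-legal"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama-3 instruct checkpoints expect the chat template.
messages = [{"role": "user", "content": "Summarize the statute of limitations under Taiwan's Civil Code."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```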
Training hyperparameters: More information needed

Training results:
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.0532 | -0.1124 | -0.0806 | 0.1183 | -0.0318 | -0.8062 | -1.1242 | -0.7965 | -0.8056 | 0.9263 | -1.1339 | -0.6334 |
| 1.027 | 0.9937 | 76 | 1.0078 | -0.0900 | -0.0732 | 0.2985 | -0.0168 | -0.7319 | -0.9002 | -0.4705 | -0.4640 | 0.8982 | -0.9524 | -0.3610 |
| 0.9555 | 1.9873 | 152 | 0.9834 | -0.0812 | -0.0860 | 0.4560 | 0.0048 | -0.8598 | -0.8116 | -0.3296 | -0.3117 | 0.8906 | -0.7776 | 0.0042 |
| 0.9188 | 2.9941 | 229 | 0.9713 | -0.0782 | -0.1002 | 0.5555 | 0.0219 | -1.0017 | -0.7822 | -0.2840 | -0.2493 | 0.8883 | -0.6676 | 0.2898 |
| 0.9388 | 3.9877 | 305 | 0.9677 | -0.0784 | -0.1101 | 0.5971 | 0.0317 | -1.1007 | -0.7836 | -0.2566 | -0.2131 | 0.8887 | -0.6256 | 0.4299 |
| 0.9289 | 4.9683 | 380 | 0.9671 | -0.0787 | -0.1126 | 0.6191 | 0.0339 | -1.1257 | -0.7870 | -0.2509 | -0.2044 | 0.8891 | -0.6148 | 0.4628 |
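The metric columns above (Rewards/chosen, Rewards/rejected, Nll Loss, Log Odds Ratio) are the metrics logged by TRL's ORPOTrainer, which suggests an ORPO-style preference-tuning run. Below is a minimal sketch of such a run, assuming TRL's ORPO API; the dataset id, output directory, and split names are placeholders, and only the ~5 training epochs and per-epoch evaluation cadence are read from the table.

```python
# ORPO preference-tuning sketch (the training method is inferred from the
# logged metric names; the card itself does not state it).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "lopentu/Llama-3-8B-Taiwan-Llawa-TCxYZL-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# ORPO expects preference pairs with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("your-org/taiwan-legal-preferences")  # placeholder id

config = ORPOConfig(
    output_dir="llama-3-8b-taiwan-legal-orpo",  # placeholder
    num_train_epochs=5,        # the table logs ~5 epochs
    eval_strategy="epoch",     # the table shows one evaluation row per epoch
    logging_strategy="epoch",
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
)
trainer.train()
```

Under this reading, Rewards/margins is simply Rewards/chosen minus Rewards/rejected (e.g. the first row: -0.1124 - (-0.0806) = -0.0318), and the final row doubles as the evaluation result referenced at the top of the card.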