Trained on 2.6M samples for 7 epochs with a linearly decaying learning rate and a warmup of 2,000 steps.
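As a rough illustration of the schedule described above, the following is a minimal sketch of a linear warmup followed by a linear decay to zero. The sample count, epoch count, and warmup length come from the description; the batch size and peak learning rate are not stated and are purely hypothetical placeholders, as is the function name `linear_warmup_decay`. The actual training code may differ.

```python
import math

# Values taken from the training description.
NUM_SAMPLES = 2_600_000
EPOCHS = 7
WARMUP_STEPS = 2000

# Hypothetical values: the description gives neither of these.
BATCH_SIZE = 64
PEAK_LR = 2e-5


def linear_warmup_decay(step, total_steps, warmup_steps=WARMUP_STEPS):
    """Return a multiplier in [0, 1] to apply to the peak learning rate."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to 1 over the warmup phase.
        return step / max(1, warmup_steps)
    # Decay linearly from 1 to 0 over the remaining steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Total number of optimizer steps under the assumed batch size.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

if __name__ == "__main__":
    for step in (0, 1000, 2000, total_steps // 2, total_steps):
        lr = PEAK_LR * linear_warmup_decay(step, total_steps)
        print(f"step {step:>7}: lr = {lr:.3e}")
```

With a different batch size, only `total_steps` changes; the warmup still covers the first 2,000 steps, so it takes up a larger or smaller share of training accordingly.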