clem HF Staff commited on
Commit
5743740
·
verified ·
1 Parent(s): ac9c66c

Adding mention of Tinker and TRL support

Browse files

cc

@devendrachaplot


@cdq10131


@sshleifer


@qgallouedec

Files changed (1) hide show
  1. README.md +5 -0
README.md CHANGED
@@ -317,6 +317,11 @@ We test the model on an 1M version of the [RULER](https://arxiv.org/abs/2404.066
317
  * All models are evaluated with Dual Chunk Attention enabled.
318
  * Since the evaluation is time-consuming, we use 260 samples for each length (13 sub-tasks, 20 samples for each).
319
 
 
 
 
 
 
320
  ## Best Practices
321
 
322
  To achieve optimal performance, we recommend the following settings:
 
317
  * All models are evaluated with Dual Chunk Attention enabled.
318
  * Since the evaluation is time-consuming, we use 260 samples for each length (13 sub-tasks, 20 samples for each).
319
 
320
+ ## Fine Tuning
321
+
322
+ Qwen 3 is compatible with [TRL](https://github.com/huggingface/trl) and [Tinker](https://thinkingmachines.ai/tinker/).
323
+
324
+
325
  ## Best Practices
326
 
327
  To achieve optimal performance, we recommend the following settings: