Oren Data Distillation Experiment Collection
Two identical d10 models (100M params) trained to validate the hypothesis that quality-filtered data enables more efficient training. • 2 items • Updated Nov 1