we've just added several example scripts to TRL showing how to train models with GRPO using some of the new OpenEnv environments
train a model to interact with a browser (🎮 BrowserGym Env), play Wordle (🎮 Wordle Env) and moooore!
TRL (GRPO + vLLM) + OpenEnv! ⚡️
go play with them: https://github.com/huggingface/trl/tree/main/examples/scripts/openenv
examples list: https://huggingface.co/docs/trl/main/en/example_overview#scripts