TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24 • 80
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization Paper • 2403.17031 • Published Mar 24, 2024 • 6
TIGERScore Collection List of model variants of TIGERScore checkpoints and the associated dataset • 8 items • Updated Sep 26, 2024 • 5
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29, 2024 • 48
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2, 2024 • 107
justin6667/vit-base-patch16-224-in21k-finetuned-lora-food101 Image Classification • 85.9M • Updated Feb 15, 2024 • 3