- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 28
- Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
  Paper • 2406.02900 • Published • 14
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
  Paper • 2406.04151 • Published • 24
- Understanding and Diagnosing Deep Reinforcement Learning
  Paper • 2406.16979 • Published • 10
Yuquan Xie
xieyuquan
AI & ML interests
LLM, multi-modal
Recent Activity
authored a paper 5 days ago
Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
authored a paper 5 days ago
Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts
authored a paper 5 days ago
Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills