xieyuquan's Collections

learning

updated Oct 22, 2024

  • Law of Vision Representation in MLLMs

    Paper • 2408.16357 • Published Aug 29, 2024 • 95

  • CogVLM2: Visual Language Models for Image and Video Understanding

    Paper • 2408.16500 • Published Aug 29, 2024 • 57

  • Learning to Move Like Professional Counter-Strike Players

    Paper • 2408.13934 • Published Aug 25, 2024 • 23

  • Building and better understanding vision-language models: insights and future directions

    Paper • 2408.12637 • Published Aug 22, 2024 • 133

  • Towards a Unified View of Preference Learning for Large Language Models: A Survey

    Paper • 2409.02795 • Published Sep 4, 2024 • 72

  • Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

    Paper • 2409.08264 • Published Sep 12, 2024 • 48

  • Qwen2.5-Coder Technical Report

    Paper • 2409.12186 • Published Sep 18, 2024 • 152

  • Training Language Models to Self-Correct via Reinforcement Learning

    Paper • 2409.12917 • Published Sep 19, 2024 • 141

  • RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning

    Paper • 2409.14674 • Published Sep 23, 2024 • 43

  • Baichuan Alignment Technical Report

    Paper • 2410.14940 • Published Oct 19, 2024 • 51