KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 20
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 271
SLiC-HF: Sequence Likelihood Calibration with Human Feedback Paper • 2305.10425 • Published May 17, 2023 • 6
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11, 2025 • 48
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 178
panda-gym: Open-source goal-conditioned environments for robotic learning Paper • 2106.13687 • Published Jun 25, 2021 • 3
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning Paper • 2402.03046 • Published Feb 5, 2024 • 7
Distributional Preference Alignment of LLMs via Optimal Transport Paper • 2406.05882 • Published Jun 9, 2024 • 2