Ariel Kwiatkowski
RedTachyon
AI & ML interests
RL, MARL, Crowd Simulation
Recent Activity
upvoted a paper about 2 months ago: Soft Tokens, Hard Truths
upvoted a paper 10 months ago: PILAF: Optimal Human Preference Sampling for Reward Modeling
authored a paper 10 months ago: PILAF: Optimal Human Preference Sampling for Reward Modeling