Tulu3 with distraction mitigation data

updated 14 days ago

LLMs and LRMs can be easily distracted by hidden instructions or irrelevant tasks. We curated SFT and DPO data on which a model can be fine-tuned to avoid such distraction; a loading and fine-tuning sketch follows the list below.


  • groupfairnessllm/tulu-3-preference-data-with-distraction

    Viewer • Updated 17 days ago • 1.5k • 27

  • groupfairnessllm/tulu-3-sft-with-distraction

    Viewer • Updated 17 days ago • 5.1k • 22 • 1

  • Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

    Paper • 2510.16259 • Published 26 days ago • 3

  • allenai/tulu-3-sft-personas-instruction-following

    Viewer • Updated Nov 21, 2024 • 30k • 1.32k • 54

  • allenai/llama-3.1-tulu-3-8b-preference-mixture

    Viewer • Updated Feb 4 • 273k • 2.41k • 23
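As a minimal, hedged sketch of how these datasets might be used: the snippet below loads the two groupfairnessllm datasets with the Hugging Face datasets library and runs a short DPO pass with TRL's DPOTrainer. The split name, the assumed prompt/chosen/rejected column layout, the base model, and all hyperparameters are illustrative assumptions, not the collection authors' recipe; check the dataset viewer for the authoritative schema.

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Load the collection's preference and SFT datasets (the "train" split is an assumption).
pref = load_dataset("groupfairnessllm/tulu-3-preference-data-with-distraction", split="train")
sft = load_dataset("groupfairnessllm/tulu-3-sft-with-distraction", split="train")
print(pref.column_names)  # inspect the actual schema before training
print(sft.column_names)

# Illustrative base model; any causal LM compatible with the preference format works.
model_name = "allenai/Llama-3.1-Tulu-3-8B-SFT"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# One DPO pass over the preference pairs; hyperparameters are placeholders.
config = DPOConfig(output_dir="tulu3-distraction-dpo", beta=0.1,
                   per_device_train_batch_size=1, num_train_epochs=1)
trainer = DPOTrainer(model=model, args=config,
                     train_dataset=pref, processing_class=tokenizer)
trainer.train()

The SFT split loaded above would instead feed a standard supervised fine-tuning loop (e.g. TRL's SFTTrainer) before the DPO stage; that ordering mirrors the usual SFT-then-DPO pipeline rather than anything stated by the collection itself.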