Tulu3 with distraction mitigation data

updated 14 days ago

LLMs and LRMs can be easily distracted by hidden instructions or irrelevant tasks. We curated SFT and DPO data on which a model can be fine-tuned to avoid such distraction; a loading and fine-tuning sketch follows the list below.


  • groupfairnessllm/tulu-3-preference-data-with-distraction

    Viewer • Updated 17 days ago • 1.5k • 27

  • groupfairnessllm/tulu-3-sft-with-distraction

    Viewer • Updated 17 days ago • 5.1k • 22 • 1

  • Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

    Paper • 2510.16259 • Published 26 days ago • 3

  • allenai/tulu-3-sft-personas-instruction-following

    Viewer • Updated Nov 21, 2024 • 30k • 1.32k • 54

  • allenai/llama-3.1-tulu-3-8b-preference-mixture

    Viewer • Updated Feb 4 • 273k • 2.41k • 23
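As a minimal, hedged sketch of how these datasets might be used: the snippet below loads the two groupfairnessllm datasets with the Hugging Face datasets library and runs a short DPO pass with TRL's DPOTrainer. The split name, the assumed prompt/chosen/rejected column layout, the base model, and all hyperparameters are illustrative assumptions, not the collection authors' recipe; check the dataset viewer for the authoritative schema.

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Load the collection's preference and SFT datasets (the "train" split is an assumption).
pref = load_dataset("groupfairnessllm/tulu-3-preference-data-with-distraction", split="train")
sft = load_dataset("groupfairnessllm/tulu-3-sft-with-distraction", split="train")
print(pref.column_names)  # inspect the actual schema before training
print(sft.column_names)

# Illustrative base model; any causal LM compatible with the preference format works.
model_name = "allenai/Llama-3.1-Tulu-3-8B-SFT"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# One DPO pass over the preference pairs; hyperparameters are placeholders.
config = DPOConfig(output_dir="tulu3-distraction-dpo", beta=0.1,
                   per_device_train_batch_size=1, num_train_epochs=1)
trainer = DPOTrainer(model=model, args=config,
                     train_dataset=pref, processing_class=tokenizer)
trainer.train()

The SFT split loaded above would instead feed a standard supervised fine-tuning loop (e.g. TRL's SFTTrainer) before the DPO stage; that ordering mirrors the usual SFT-then-DPO pipeline rather than anything stated by the collection itself.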