Chenyan Xiong Research Group at CMU

university
https://www.cs.cmu.edu/~cx/

Recent Activity

SingularityHJY  updated a dataset 15 days ago
cx-cmu/ClueWeb-Reco
yuzc19  updated a dataset 21 days ago
cx-cmu/repro-organic-data-72B
yuzc19  updated a collection 21 days ago
RePro

Papers

RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

Members: Chenyan Xiong, Cassandra Cohen, Zichun Yu, Jingyuan He, Mahima Jagadeesh Patel, zhihan zhang, Kira Jones

cx-cmu's collections (1)

RePro
Space for RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
  • cx-cmu/repro-rephraser-4B

    Text Generation • 196k • Updated 21 days ago • 44 • 1
  • cx-cmu/repro-rl-data

    Viewer • Updated 21 days ago • 41k • 27
  • RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

    Paper • 2510.10681 • Published 26 days ago • 5
  • cx-cmu/repro-rephrased-data-72B

    Viewer • Updated 21 days ago • 39M • 733