Yizhe Xiong

Bostoncake

AI & ML interests

None yet

Organizations

None yet

authored 7 papers 10 months ago

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Paper • 2403.09192 • Published Mar 14, 2024

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Paper • 2404.17808 • Published Apr 27, 2024

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Paper • 2407.09816 • Published Jul 13, 2024 • 1

LBPE: Long-token-first Tokenization to Improve Large Language Models

Paper • 2411.05504 • Published Nov 8, 2024 • 1

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Paper • 2410.16077 • Published Oct 21, 2024 • 1

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Paper • 2412.07171 • Published Dec 10, 2024 • 1

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 58