Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers Paper • 2404.02684 • Published Apr 3, 2024
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset Paper • 2109.07679 • Published Sep 16, 2021
AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph Paper • 2311.09174 • Published Nov 15, 2023
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation Paper • 2402.10646 • Published Feb 16, 2024
CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population Paper • 2304.10392 • Published Apr 20, 2023
Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging Paper • 2504.18580 • Published Apr 23, 2025
MMLU-Pro Leaderboard 🥇 More advanced and challenging multi-task evaluation
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection Paper • 2310.09044 • Published Oct 13, 2023
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots