Papers - University - Carnegie Mellon University
Can large language models explore in-context?
Paper
• 2403.15371
• Published • 33
Long-context LLMs Struggle with Long In-context Learning
Paper
• 2404.02060
• Published • 37
PIQA: Reasoning about Physical Commonsense in Natural Language
Paper
• 1911.11641
• Published • 5
AQuA: A Benchmarking Tool for Label Quality Assessment
Paper
• 2306.09467
• Published • 1
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation
Paper
• 2404.04256
• Published • 6
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Paper
• 2404.07972
• Published • 52
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper
• 2404.08801
• Published • 66
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Paper
• 2404.11912
• Published • 17
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Paper
• 2305.09781
• Published • 4
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
Paper
• 2305.13571
• Published • 2
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Paper
• 2403.03234
• Published • 14
Stylus: Automatic Adapter Selection for Diffusion Models
Paper
• 2404.18928
• Published • 15
Are Sixteen Heads Really Better than One?
Paper
• 1905.10650
• Published • 2
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
• 2405.01535
• Published • 124
An Empirical Evaluation of Columnar Storage Formats
Paper
• 2304.05028
• Published • 1
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Paper
• 1707.02968
• Published • 1