DynaGuard: A Dynamic Guardrail Model With User-Defined Policies Paper • 2509.02563 • Published Sep 2, 2025 • 20
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31, 2025 • 84
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22, 2025 • 35
PHTest Collection Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models • 3 items • Updated Sep 24, 2024 • 1
Easy2Hard-Bench Collection Easy2Hard-Bench offers six datasets with continuous difficulty ratings, enabling profiling of LLM performance and generalization across difficulties. • 7 items • Updated Jul 3, 2024 • 1
TraceVLA Collection TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies • 4 items • Updated Jan 7, 2025 • 3
WAVES Collection Benchmarking the Robustness of Image Watermarks. Under development. Data will be released soon. • 2 items • Updated Jan 24, 2024 • 3
Recurrent Models Collection These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space. • 15 items • Updated May 21, 2025 • 11
ARGUS: Hallucination and Omission Evaluation in Video-LLMs Paper • 2506.07371 • Published Jun 9, 2025 • 8
Cross-Modal Safety Alignment: Is textual unlearning all you need? Paper • 2406.02575 • Published May 27, 2024 • 1
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities Paper • 2502.05209 • Published Feb 3, 2025 • 1
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning Paper • 2506.05523 • Published Jun 5, 2025 • 34
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference Paper • 2502.09974 • Published Feb 14, 2025 • 9
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 151
Gemstones: A Model Suite for Multi-Faceted Scaling Laws Paper • 2502.06857 • Published Feb 7, 2025 • 24