SleepWalk: A Three-Tier Benchmark for Stress-Testing Instruction-Guided Vision-Language Navigation Paper • 2605.10376 • Published 11 days ago • 1
Prediction Bottlenecks Don't Discover Causal Structure (But Here's What They Actually Do) Paper • 2605.09169 • Published 13 days ago
Moral Sensitivity in LLMs: A Tiered Evaluation of Contextual Bias via Behavioral Profiling and Mechanistic Interpretability Paper • 2605.03217 • Published 18 days ago • 1
VISTA: Video Interaction Spatio-Temporal Analysis Benchmark Paper • 2605.01391 • Published 20 days ago
PermaFrost-Attack: Stealth Pretraining Seeding (SPS) for planting Logic Landmines During LLM Training Paper • 2604.22117 • Published 24 days ago
Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation Paper • 2604.23600 • Published 26 days ago • 2
CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation Paper • 2604.09746 • Published Apr 10 • 1
Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models Paper • 2603.21854 • Published Mar 23 • 3
SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement Paper • 2603.06333 • Published Mar 6 • 1
The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness Paper • 2603.09200 • Published Mar 10 • 5