D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models Paper • 2509.17938 • Published Sep 22
Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework Paper • 2507.06260 • Published Jul 7