arxiv:2503.05731
Satya
skrishna
AI & ML interests
Safe A(G)I
Recent Activity
upvoted a paper about 1 month ago
D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models
commented on a paper about 1 month ago
D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models
liked a dataset about 1 month ago
google/frames-benchmark