
Satya

skrishna

https://satyapriyakrishna.com/

AI & ML interests

Safe A(G)I

Recent Activity



Organizations

upvoted a paper about 1 month ago

D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models

Paper • 2509.17938 • Published Sep 22 • 3

commented on a paper about 1 month ago

D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models

Paper • 2509.17938 • Published Sep 22 • 3

liked a dataset about 2 months ago

google/frames-benchmark

Viewer • Updated Oct 15, 2024 • 824 • 10.9k • 229

updated a model 3 months ago

skrishna/smolm-toxicity-classifier

Text Classification • 0.1B • Updated Aug 15 • 1

upvoted a paper 3 months ago

Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

Paper • 2507.06260 • Published Jul 7 • 5

commented on a paper 4 months ago

Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

Paper • 2507.06260 • Published Jul 7 • 5

updated a model 5 months ago

skrishna/sft-ref-policy-copy

Text Generation • 0.1B • Updated Jun 18

published a model 5 months ago

skrishna/sft-ref-policy-copy

Text Generation • 0.1B • Updated Jun 18

updated a model 5 months ago

skrishna/sft-model-copy

Text Generation • 0.1B • Updated Jun 18

published 2 models 5 months ago

skrishna/sft-model-copy

Text Generation • 0.1B • Updated Jun 18

skrishna/smolm-toxicity-classifier

Text Classification • 0.1B • Updated Aug 15 • 1

updated a dataset 5 months ago

skrishna/toxigen_annotated_mod

Viewer • Updated May 25 • 8.96k • 8

published a dataset 5 months ago

skrishna/toxigen_annotated_mod

Viewer • Updated May 25 • 8.96k • 8

updated a dataset 5 months ago

skrishna/toy-toxicity-dataset

Viewer • Updated May 22 • 40k • 3

updated a dataset 6 months ago

skrishna/toxicity-reward-dataset

Viewer • Updated May 16 • 40k • 4

published a dataset 6 months ago

skrishna/toxicity-reward-dataset

Viewer • Updated May 16 • 40k • 4

updated a model 6 months ago

skrishna/gpt2-toxicity-classifier

Updated May 16

published a model 6 months ago

skrishna/gpt2-toxicity-classifier

Updated May 16

published a dataset 6 months ago

skrishna/toy-toxicity-dataset

Viewer • Updated May 22 • 40k • 3

updated a model 6 months ago

skrishna/gpt2-fineweb-soap-20250422_112211

Text Generation • 0.1B • Updated Apr 22 • 2
