amanda

amandasa

AI & ML interests

None yet

Recent Activity

upvoted a paper 18 days ago

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

upvoted a paper 18 days ago

DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent

updated a Space almost 3 years ago

amandasa/TM-TKO-Model-UI

Organizations

upvoted 2 papers 18 days ago

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Paper • 2406.18510 • Published Jun 26, 2024 • 10

DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent

Paper • 2502.12575 • Published Feb 18, 2025 • 2