TianshengHuang

https://huangtiansheng.github.io/

AI & ML interests

LLM safety

Recent Activity

new activity about 2 months ago

AnonymousUser000/JALMBench: Advwave split is now accessible

liked a dataset about 2 months ago

xinykou/EduHarm

upvoted a paper 3 months ago

AgentReview: Exploring Peer Review Dynamics with LLM Agents

New activity in AnonymousUser000/JALMBench about 2 months ago

Advwave split is now accessible

#2 opened about 2 months ago by

commented on 2 papers 11 months ago

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

Paper • 2501.17433 • Published Jan 29 • 10
