Lingyu Li

LingyuLi

lingyuli-cogs

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored a paper about 2 months ago

Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?

Paper • 2506.14805 • Published Jun 3 • 3

liked a dataset about 2 months ago

EVIGBYEN/RigorousBench

Viewer • Updated Oct 8 • 214 • 98 • 3

upvoted 3 papers about 2 months ago

A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos

Paper • 2502.15806 • Published Feb 19 • 2

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2 • 18

Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?

Paper • 2506.14805 • Published Jun 3 • 3

liked a dataset 5 months ago

NousResearch/Hermes-3-Dataset

Viewer • Updated Jul 11 • 959k • 1.1k • 293

liked a Space 7 months ago

Qwen3 Demo

Space • 808 • Generate responses to text prompts in a chat interface

New activity in meta-llama/Meta-Llama-3-8B 8 months ago

Are all accounts from the China region rejected?

#124 opened over 1 year ago by HLearning

liked a dataset 8 months ago

TIGER-Lab/MMLU-Pro

Viewer • Updated Oct 25 • 12.1k • 57.6k • 396

liked a model 8 months ago

CaasiHUANG/flames-scorer

Text Classification • Updated Apr 22, 2024 • 33 • 5

upvoted a paper 9 months ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

liked a dataset 9 months ago

CCLV/CausalBench

Preview • Updated Jun 13, 2024 • 105 • 6

upvoted a paper 11 months ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 87

liked a dataset about 1 year ago

neuralwork/arxiver

Viewer • Updated Nov 1, 2024 • 63.4k • 715 • 364

upvoted a paper about 1 year ago

Reflection-Bench: probing AI intelligence with reflection

Paper • 2410.16270 • Published Oct 21, 2024 • 6

authored a paper about 1 year ago

ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

Paper • 2406.14952 • Published Jun 21, 2024

commented on a paper about 1 year ago

Reflection-Bench: probing AI intelligence with reflection

Paper • 2410.16270 • Published Oct 21, 2024 • 6

authored a paper about 1 year ago

Reflection-Bench: probing AI intelligence with reflection

Paper • 2410.16270 • Published Oct 21, 2024 • 6
