Prasad Thammineni
prasadt2
				AI & ML interests
None yet
		
		Organizations
None yet
Generative UI
Screen agents
- MobA: A Two-Level Agent System for Efficient Mobile Task Automation • Paper • 2410.13757 • Published • 33
- Agent S: An Open Agentic Framework that Uses Computers Like a Human • Paper • 2410.08164 • Published • 26
- WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration • Paper • 2408.15978 • Published
- Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents • Paper • 2409.17140 • Published
LAMs
Trained models
Memory
Voice agents
- Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant • Paper • 2410.15316 • Published • 12
- MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark • Paper • 2410.19168 • Published • 23
- LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects • Paper • 2504.19838 • Published • 22
Reasoning
- A Comparative Study on Reasoning Patterns of OpenAI's o1 Model • Paper • 2410.13639 • Published • 19
- MobA: A Two-Level Agent System for Efficient Mobile Task Automation • Paper • 2410.13757 • Published • 33
- Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions • Paper • 2411.14405 • Published • 61
- Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS • Paper • 2411.18478 • Published • 37
Agents
- τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains • Paper • 2406.12045 • Published • 9
- Natural Language Reinforcement Learning • Paper • 2411.14251 • Published • 31
- SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion • Paper • 2412.04301 • Published • 41
- IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems • Paper • 2501.11067 • Published • 13
Datasets
RAG