Collections
Collections including paper arXiv:2502.06589
						
					
- LLM Pruning and Distillation in Practice: The Minitron Approach (Paper • 2408.11796 • Published • 57)
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering (Paper • 2408.09174 • Published • 52)
- To Code, or Not To Code? Exploring Impact of Code in Pre-training (Paper • 2408.10914 • Published • 43)
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications (Paper • 2408.11878 • Published • 63)
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models (Paper • 2404.01197 • Published • 31)
- CosmicMan: A Text-to-Image Foundation Model for Humans (Paper • 2404.01294 • Published • 17)
- mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus (Paper • 2406.08707 • Published • 17)
- DataComp-LM: In search of the next generation of training sets for language models (Paper • 2406.11794 • Published • 54)
- AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning (Paper • 2402.15506 • Published • 18)
- AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent (Paper • 2404.03648 • Published • 29)
- Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts (Paper • 2405.19893 • Published • 33)
- Parrot: Efficient Serving of LLM-based Applications with Semantic Variable (Paper • 2405.19888 • Published • 7)
- Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training (Paper • 2502.06589 • Published • 20)
- LIMO: Less is More for Reasoning (Paper • 2502.03387 • Published • 62)
- Control LLM: Controlled Evolution for Intelligence Retention in LLM (Paper • 2501.10979 • Published • 6)
- LLMQuoter: Enhancing RAG Capabilities Through Efficient Quote Extraction From Large Contexts (Paper • 2501.05554 • Published • 1)
- FLAME: Factuality-Aware Alignment for Large Language Models (Paper • 2405.01525 • Published • 28)
- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data (Paper • 2405.14333 • Published • 41)
- Transformers Can Do Arithmetic with the Right Embeddings (Paper • 2405.17399 • Published • 54)
- EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture (Paper • 2405.18991 • Published • 12)
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks (Paper • 2412.14161 • Published • 51)
- Training Software Engineering Agents and Verifiers with SWE-Gym (Paper • 2412.21139 • Published • 24)
- OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis (Paper • 2412.19723 • Published • 87)
- AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation (Paper • 2408.00764 • Published • 1)
