stereoplegic's Collections

Context compression

- In-Context Learning Creates Task Vectors (Paper • 2310.15916 • Published • 43)
- When can transformers reason with abstract symbols? (Paper • 2310.09753 • Published • 4)
- Improving Length-Generalization in Transformers via Task Hinting (Paper • 2310.00726 • Published • 1)
- In-context Autoencoder for Context Compression in a Large Language Model (Paper • 2307.06945 • Published • 28)
- Adapting Language Models to Compress Contexts (Paper • 2305.14788 • Published • 1)
- Context Compression for Auto-regressive Transformers with Sentinel Tokens (Paper • 2310.08152 • Published • 1)
- Learning to Compress Prompts with Gist Tokens (Paper • 2304.08467 • Published • 3)
- Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers (Paper • 2305.15805 • Published • 1)
- Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt (Paper • 2305.11186 • Published • 1)
- Self-slimmed Vision Transformer (Paper • 2111.12624 • Published • 1)
- Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time (Paper • 2310.17157 • Published • 14)
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation (Paper • 2310.04408 • Published • 1)
- Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation (Paper • 2308.01045 • Published • 1)
- Adaptive Token Sampling For Efficient Vision Transformers (Paper • 2111.15667 • Published • 1)
- Dynamic Token-Pass Transformers for Semantic Segmentation (Paper • 2308.01944 • Published • 1)
- Multi-Scale And Token Mergence: Make Your ViT More Efficient (Paper • 2306.04897 • Published • 1)
- Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers (Paper • 2303.13755 • Published • 1)
- Nugget 2D: Dynamic Contextual Compression for Scaling Decoder-only Language Models (Paper • 2310.02409 • Published • 1)
- LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning (Paper • 2305.18169 • Published • 1)
- ComputeGPT: A computational chat model for numerical problems (Paper • 2305.06223 • Published • 1)
- XPrompt: Exploring the Extreme of Prompt Tuning (Paper • 2210.04457 • Published • 1)
- Reducing Sequence Length by Predicting Edit Operations with Large Language Models (Paper • 2305.11862 • Published • 1)
- Diet Code Is Healthy: Simplifying Programs for Pre-trained Models of Code (Paper • 2206.14390 • Published • 1)
- Split, Encode and Aggregate for Long Code Search (Paper • 2208.11271 • Published • 1)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models (Paper • 2308.15022 • Published • 3)
- Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens (Paper • 2305.04241 • Published • 1)
- Latency Adjustable Transformer Encoder for Language Understanding (Paper • 2201.03327 • Published • 1)
- Block-Skim: Efficient Question Answering for Transformer (Paper • 2112.08560 • Published • 1)
- Learned Token Pruning for Transformers (Paper • 2107.00910 • Published • 1)
- Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers (Paper • 2305.17328 • Published • 2)
- Learned Thresholds Token Merging and Pruning for Vision Transformers (Paper • 2307.10780 • Published • 1)
- Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts? (Paper • 2304.01083 • Published • 1)
- System 2 Attention (is something you might need too) (Paper • 2311.11829 • Published • 44)
- CoLT5: Faster Long-Range Transformers with Conditional Computation (Paper • 2303.09752 • Published • 2)
- Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers (Paper • 2211.11586 • Published • 1)
- TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction (Paper • 2310.15556 • Published • 1)
- Extending Context Window of Large Language Models via Semantic Compression (Paper • 2312.09571 • Published • 16)
- LLoCO: Learning Long Contexts Offline (Paper • 2404.07979 • Published • 22)
- SelfCP: Compressing Long Prompt to 1/12 Using the Frozen Large Language Model Itself (Paper • 2405.17052 • Published • 2)
- Equipping Transformer with Random-Access Reading for Long-Context Understanding (Paper • 2405.13216 • Published • 1)