add AIBOM
#18 opened 5 months ago by RiccardoDav

ds-v2-chat
#17 opened 6 months ago by Elon7111

NAN issue using FP16 to load the model
#15 opened about 1 year ago by joeltseng
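
A NaN-when-loading-in-fp16 symptom is commonly caused by fp16's narrow dynamic range when a checkpoint was released in bfloat16. Below is a minimal sketch of the usual workaround (loading in bf16 via the standard transformers API); the model id is an assumption and this is not necessarily the thread's actual resolution.

```python
# Minimal sketch: load in bfloat16 instead of fp16 to avoid overflow-to-NaN.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V2-Chat"  # assumed repo id; substitute the one you are loading

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # bf16 keeps fp32's exponent range; fp16 can overflow to NaN/inf
    device_map="auto",
    trust_remote_code=True,      # the repo ships custom modeling code
)
```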

ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn` (👍 3)
#14 opened over 1 year ago by kang1
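
This error is raised when transformers inspects the imports of the repo's remote modeling file and finds `flash_attn` missing; the usual fix is simply to install that package. A small pre-flight check is sketched below (an assumed workflow, not the thread's resolution; the PyPI package is named `flash-attn` and building it requires a CUDA toolchain).

```python
# Minimal sketch: verify flash_attn is importable before loading the remote modeling code.
import importlib.util

if importlib.util.find_spec("flash_attn") is None:
    raise SystemExit(
        "flash_attn not found; install it with "
        "`pip install flash-attn --no-build-isolation` before loading the model"
    )
```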

How much memory is needed if you make the 128k context length
#13 opened over 1 year ago by ggbondcxk

Implement MLA inference optimizations to DeepseekV2Attention (🤗 🔥 7)
#12 opened over 1 year ago by sy-chen

Can you provide a sample code for training with DeepSpeed ZeRO3?
#10 opened over 1 year ago by SupercarryNg
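
This thread asks for a ZeRO-3 training sample. Below is a minimal, hedged sketch of the standard transformers + DeepSpeed wiring; the config values, file names, and batch sizes are placeholders I am assuming, not anything provided in the thread. A typical launch would be `deepspeed train.py ...`, where `train.py` is a hypothetical script that builds a Trainer with these arguments.

```python
# Minimal sketch (assumed values throughout): a ZeRO-3 config handed to the HF Trainer.
import json
from transformers import TrainingArguments

ds_zero3 = {
    "zero_optimization": {
        "stage": 3,                                        # ZeRO stage 3: partition params, grads, optimizer state
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True, # gather a full 16-bit checkpoint on save
    },
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",              # "auto" lets the HF integration fill these in
    "gradient_accumulation_steps": "auto",
}
with open("ds_zero3.json", "w") as f:
    json.dump(ds_zero3, f, indent=2)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,
    deepspeed="ds_zero3.json",  # the Trainer picks up the ZeRO-3 settings from this file
)
```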

Ollama support (👍 1)
#9 opened over 1 year ago by Dao3

MoE offloading strategy?
#8 opened over 1 year ago by Minami-su

Update README.md
#7 opened over 1 year ago by VanishingPsychopath

kv cache (👀 2)
#6 opened over 1 year ago by FrankWu

function/tool calling support
#5 opened over 1 year ago by kaijietti

fail to run the example
#4 opened over 1 year ago by Leymore

GPTQ plz
#3 opened over 1 year ago by Parkerlambert123

vllm support
#2 opened over 1 year ago by Sihangli

llama.cpp support (👍 ➕ 18)
#1 opened over 1 year ago by cpumaxx