GGUFs
Collection of usable GGUFs for running LLMs on the edge or consumer devices like phones & laptops!
Contains Q4 & Q8 quantized GGUFs for google/gemma
| Variant | Device | Throughput |
|---|---|---|
| Q4 | M1 Pro 10-core GPU | 90 tok/s |
| Q4 | Snapdragon 778G CPU | 10 tok/s |
| Q4 | RTX 2070S | 40 tok/s |
| Q8 | M1 Pro 10-core GPU | 54 tok/s |
| Q8 | Snapdragon 778G CPU | 6 tok/s |
| Q8 | RTX 2070S | 25 tok/s |
| F16 | M1 Pro 10-core GPU | 30 tok/s |
| F16 | Snapdragon 778G CPU | <1 tok/s |
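
To make the trade-off between variants easier to read, the figures in the table above can be turned into relative speedups and rough generation times. The following is a minimal sketch that only does arithmetic on the listed numbers; the names `THROUGHPUT`, `speedup`, and `seconds_for` are hypothetical helpers introduced for illustration, not part of any library. The F16 Snapdragon entry is left out because the table gives it only as an upper bound ("<1 tok/s").

```python
# Throughput figures (tokens per second) copied from the table above.
# The F16 Snapdragon 778G entry is listed only as "<1 tok/s", so it is omitted.
THROUGHPUT = {
    "M1 Pro 10-core GPU": {"Q4": 90, "Q8": 54, "F16": 30},
    "Snapdragon 778G CPU": {"Q4": 10, "Q8": 6},
    "RTX 2070S": {"Q4": 40, "Q8": 25},
}


def speedup(device: str, fast: str = "Q4", slow: str = "Q8") -> float:
    """Return how many times faster `fast` is than `slow` on `device`."""
    rates = THROUGHPUT[device]
    return rates[fast] / rates[slow]


def seconds_for(device: str, variant: str, n_tokens: int) -> float:
    """Estimate the time to generate `n_tokens` at the listed rate."""
    return n_tokens / THROUGHPUT[device][variant]


if __name__ == "__main__":
    for device in THROUGHPUT:
        print(f"{device}: Q4 is {speedup(device):.2f}x faster than Q8")
```

On these numbers, Q4 runs roughly 1.6 to 1.7 times faster than Q8 on every listed device, and a 100-token reply at the Q4 rate on the Snapdragon 778G would take about 10 seconds. Real timings will vary with prompt length and runtime settings, which the table does not specify.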