Part of the "finetuned smol 220M" collection (smol_llama 220M fine-tunes we did; 6 items).
This is BEE-spoke-data/smol_llama-220M-GQA fine-tuned for code generation on:
This model (and the base model) were both trained with a context length of 2048.
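As a quick sanity check, the trained context window can be read off the model config. A minimal sketch, assuming the llama-style config field `max_position_embeddings`; the model id below is the base model named in this card, so substitute the fine-tuned checkpoint's repo id:

```python
from transformers import AutoConfig

# Base model id from this card; swap in the fine-tuned checkpoint's repo id.
cfg = AutoConfig.from_pretrained("BEE-spoke-data/smol_llama-220M-GQA")

# Llama-style configs record the trained context window here.
print(cfg.max_position_embeddings)  # expected: 2048
```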
Example script for inference testing: here
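In case the linked script is unavailable, here is a minimal inference sketch using the standard transformers text-generation pipeline. Again, the model id is the base model named in this card; substitute the fine-tune's repo id, and adjust the prompt to taste:

```python
from transformers import pipeline

# Substitute the fine-tuned checkpoint's repo id for the base model id here.
generator = pipeline(
    "text-generation",
    model="BEE-spoke-data/smol_llama-220M-GQA",
)

# Prompt with a function signature and an opening docstring quote,
# matching the single-line / docstring use case described below.
prompt = 'def fibonacci(n: int) -> int:\n    """'

result = generator(
    prompt,
    max_new_tokens=64,  # keep completions short for a quick test
    do_sample=False,    # greedy decoding for reproducible output
)
print(result[0]["generated_text"])
```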
It has its limitations at 220M parameters, but it seems decent for single-line or docstring completion, and as a draft model for speculative decoding in those settings (see the sketch below).
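For the speculative-decoding use, transformers supports this via assisted generation: pass the small model as `assistant_model` to `generate()`. A sketch under assumptions: the target model id below is a placeholder, and standard assisted generation requires the draft and target to share a tokenizer/vocabulary.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder target: any larger causal LM that shares a tokenizer/vocabulary
# with the 220M draft model (a requirement for standard assisted generation).
target_id = "your-org/larger-code-model"
draft_id = "BEE-spoke-data/smol_llama-220M-GQA"  # base model named in this card

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype=torch.float32)
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype=torch.float32)

inputs = tokenizer("def quicksort(arr):", return_tensors="pt")
output = target.generate(
    **inputs,
    assistant_model=draft,  # draft proposes tokens; target verifies them in one pass
    max_new_tokens=64,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```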
(Screenshot: inference running on CPU on a laptop.)