A collection of training corpora and models for "Multilingual Language Model Pretraining using Machine-translated Data".
			
	
BritLLM community
						
						
						
	
Datasets (18)

britllm/TransWebEdu • 503 • 2
britllm/TransWeb-Edu-English • Viewer • 36M • 943
britllm/TransWeb-Edu-Spanish • Viewer • 35.2M • 1.06k • 3
britllm/TransWeb-Edu-French • Viewer • 36M • 3.28k
britllm/TransWeb-Edu-German • Viewer • 36M • 3.96k • 1
britllm/xnli_brit • Viewer • 9.69k • 47
britllm/piqa_scottish_gaelic • 3
britllm/piqa_welsh • 16
britllm/piqa_irish • 2
britllm/arc_scottish_gaelic • Viewer • 7.56k • 58
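
As a minimal sketch, one of the TransWeb-Edu language subsets listed above could be streamed with the Hugging Face `datasets` library roughly as follows. The helper names `repo_id` and `stream_subset` are hypothetical, and the `train` split name is an assumption that does not come from this page, so check each dataset card before relying on it.

```python
# Sketch only: helper names and the "train" split are assumptions,
# not something this listing confirms.
LANGUAGES = ["English", "Spanish", "French", "German"]


def repo_id(language: str) -> str:
    """Build the Hub repository id for one TransWeb-Edu language subset."""
    return f"britllm/TransWeb-Edu-{language}"


def stream_subset(language: str, split: str = "train"):
    """Return a streaming dataset so the full subset is not downloaded up front.

    The import is local so the id helper above works without `datasets` installed.
    """
    from datasets import load_dataset

    return load_dataset(repo_id(language), split=split, streaming=True)


if __name__ == "__main__":
    # Only prints the ids; call stream_subset("French") to actually open a stream.
    for lang in LANGUAGES:
        print(repo_id(lang))
```

Streaming is used here because the subsets are large; iterating with `for row in stream_subset("French")` would fetch records lazily, assuming the split name above is correct.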