AI & ML interests
finding your community
 
					
	
		
nroggendorff posted an update about 22 hours ago
 
					
	
		
Reality123b posted an update 5 days ago
Made a new blog post about releasing my new AI (hope I don't get banned for writing an article about my own model): https://huggingface.co/blog/Reality123b/introducing-xylaria-2-exempted
		
	
		
	 
					
	
		
prithivMLmods posted an update 5 days ago
Let's have the comparison again with Multimodal OCR3:
nanonets/Nanonets-OCR2-3B vs allenai/olmOCR-2-7B-1025 vs rednote-hilab/dots.ocr vs datalab-to/chandra
Try it here @ prithivMLmods/Multimodal-OCR3
Collection: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
 
					
	
		
nroggendorff posted an update 8 days ago
I love getting emails telling me when there's somebody else's active access token in one of my commit SHAs. HF should really only tell you if it is your token; otherwise I could just make a dataset with a bunch of random strings and wait for a valid token.
user,permission,token
nroggendorff,write,hf_...
pepper13,finegrained,hf_...
...,...,...
...
Also, don't comment about how unlikely this is. I've gotten a warning email about a token I 'leaked' at least four times.
In all cases, it has been in the digest hash.
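The suggestion above (only alert a user when the matched string is their own token) could look roughly like the sketch below. It is not how the Hub's scanner works. The token pattern, the `fingerprint` helper, and the `tokens_owned_by_user` function are all assumptions made up for illustration.

```python
import hashlib
import re

# Assumed token shape, for illustration only: "hf_" followed by 34 alphanumerics.
TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{34}")


def fingerprint(token: str) -> str:
    """Return a SHA-256 hex digest so raw tokens never need to be stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def tokens_owned_by_user(text: str, own_fingerprints: set) -> list:
    """Return token-shaped strings in `text` whose fingerprint belongs to the user."""
    candidates = TOKEN_PATTERN.findall(text)
    return [c for c in candidates if fingerprint(c) in own_fingerprints]


# Hypothetical tokens, invented for this example.
my_token = "hf_" + "a" * 34
someone_elses_token = "hf_" + "b" * 34

commit_text = f"config: {my_token}\nfixture: {someone_elses_token}\n"
own = {fingerprint(my_token)}

alerts = tokens_owned_by_user(commit_text, own)
print(len(alerts))  # 1
```

With this shape, a repository full of random strings would never produce an alert for its owner unless one of those strings was actually the owner's token.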
 
					
	
		
prithivMLmods posted an update 10 days ago
Now you can try all the latest state-of-the-art multimodal vision-language models from the Qwen3-VL series demo on Hugging Face Spaces, including 4B, 8B, and 30B (Instruct, 4B-Thinking) variants. I've also uploaded the weights for the Abliterated variants of these models, up to 30B parameters. Check out the Spaces and model links below! 🤗🔥
✨ Qwen3-VL[4B,8B]: prithivMLmods/Qwen3-VL-Outpost
✨ Qwen3-VL-30B-A3B-Demo: prithivMLmods/Qwen3-VL-HF-Demo
✨ Collection: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
Qwen3-VL Abliterated Model Collection [ Version 1.0 ]
✨ Qwen3-VL-8B-Instruct-abliterated: prithivMLmods/Qwen3-VL-8B-Instruct-abliterated
✨ Qwen3-VL-4B-Instruct-abliterated: prithivMLmods/Qwen3-VL-4B-Instruct-abliterated
✨ Qwen3-VL-8B-Thinking-abliterated: prithivMLmods/Qwen3-VL-8B-Thinking-abliterated
✨ Qwen3-VL-4B-Thinking-abliterated: prithivMLmods/Qwen3-VL-4B-Thinking-abliterated
✨ Qwen3-VL-30B-A3B-Instruct-abliterated: prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated
✨ Qwen3-VL-30B-A3B-Thinking-abliterated: prithivMLmods/Qwen3-VL-30B-A3B-Thinking-abliterated
Collection: prithivMLmods/qwen3-vl-abliteration-oct-1625-68f0e3e567ef076594605fac
Note: This is version 1.0 of the Abliteration of the Qwen3-VL series of models. It may perform sub-optimally in some cases. If you encounter any issues, please open a discussion.
 
					
	
		
prithivMLmods posted an update 12 days ago
Introducing Image-Guard-2.0, an experimental, lightweight vision-language encoder model with a size of 0.1B (<100M parameters), trained on SigLIP2 (siglip2-base-patch16-224). Designed for multi-label image classification tasks, this model functions as an image safety system, serving as an image guard or moderator across a wide range of categories, from anime to realistic imagery.
blog-article: https://huggingface.co/blog/prithivMLmods/image-guard-models
It also performs strict moderation and filtering of artificially synthesized content, demonstrating strong detection and handling of explicit images. Image-Guard-2.0 delivers robust performance in streamlined scenarios, ensuring reliable and effective classification across diverse visual inputs.
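For readers unfamiliar with multi-label classification, here is a minimal sketch of the decision step: each label gets its own independent sigmoid score, and every label above a threshold is kept. This is not Image-Guard-2.0's code. The label names, the threshold, and the logits are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical label names and threshold, chosen only for illustration.
LABELS = ["anime", "realistic", "explicit"]
THRESHOLD = 0.5


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold: float = THRESHOLD):
    """Score each label independently and keep those at or above the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Made-up logits standing in for a model's output on one image.
print(predict_labels([-2.0, 1.5, 0.3]))  # ['realistic', 'explicit']
```

Unlike single-label (softmax) classification, several labels can be active at once, or none at all.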
 
					
	
		
prithivMLmods posted an update 15 days ago
Try the demo of Qwen3-VL-30B-A3B-Instruct, the next-generation, powerful vision-language model in the Qwen series. The model delivers comprehensive upgrades across the board, including superior text understanding and generation, deeper visual perception and reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities. 🤗🔥
Space / App: prithivMLmods/Qwen3-VL-HF-Demo
The model's demo supports a wide range of tasks, including:
Image Inference, Video Inference, PDF Inference, Image Captioning (VLA), GIF Inference.
Collection: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
Thanks for granting the blazing-fast Zero GPU access, @merve
Other Pages
> GitHub: https://github.com/prithivsakthiur/qwen3-vl-hf-demo
> Multimodal VLMs July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
> VL caption - < Sep 15 '25 : prithivMLmods/vl-caption-sep-15-25-68c7f6d737985c63c13e2391
> Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
To know more about it, visit the app page or the respective model page!!
 
					
	
		
prithivMLmods posted an update 18 days ago
Introducing the next-gen version of DeepCaption-VLA (v2.0), an advanced, multimodal model based on Qwen2.5-VL, specialized for Image Captioning and Vision Language Attribution (VLA). This enhanced release focuses on generating precise, attribute-rich captions that capture visual properties, object attributes, and scene details across diverse image types and aspect ratios. Version 2.0 introduces significant improvements in multilingual inference, delivering higher captioning quality and attribution accuracy in languages including Chinese (Zh), Thai (Th), and more.
🤗 DeepCaption-VLA (v2.0) : prithivMLmods/DeepCaption-VLA-V2.0-7B
🫱 Collection : prithivMLmods/vlm-20-oct-0825-68e606aa6e3993be8a3b1d51
GitHub (notebook) : https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/DeepCaption_VLA_V2_0_7B/DeepCaption_VLA_V2_0_7Bipynb.ipynb
Other Pages
Multimodal VLMs July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
VL caption - < Sep 15 '25 : prithivMLmods/vl-caption-sep-15-25-68c7f6d737985c63c13e2391
Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
To know more about it, visit the app page or the respective model page!!
 
					
	
		
prithivMLmods posted an update 19 days ago
I have built the new Image Studio with the Gemini Image Gen models for the following tasks: the imagen-4.0-fast-generate-001 model for Image Generation (Text-to-Image) and Multi-Image Editing (Image-to-Image), and Draw-to-Image powered by gemini-2.5-flash-image (aka Nano Banana).
Gemini-Image-Studio: prithivMLmods/Gemini-Image-Studio (Latest)
🤗 Old-App: prithivMLmods/Nano-Banana-AIO
GitHub: https://github.com/prithivsakthiur/gemini-image-studio-hf
To proceed, you need to add your Gemini API key. Your API key is stored only for the duration of your session and will be lost when you reload or exit the page. It will not be shared or exposed anywhere.
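As a rough illustration of session-scoped key handling like that described above, the sketch below keeps keys only in memory and forgets them when a session ends. It is not the Space's actual implementation. `SessionKeyStore` and its methods are hypothetical names, and the key string is a placeholder.

```python
import secrets
from typing import Dict, Optional


class SessionKeyStore:
    """In-memory, per-session API key storage; nothing is written to disk."""

    def __init__(self) -> None:
        self._keys: Dict[str, str] = {}

    def open_session(self, api_key: str) -> str:
        """Store the key under a fresh random session id and return that id."""
        session_id = secrets.token_urlsafe(16)
        self._keys[session_id] = api_key
        return session_id

    def get_key(self, session_id: str) -> Optional[str]:
        """Return the key for a live session, or None if the session is gone."""
        return self._keys.get(session_id)

    def close_session(self, session_id: str) -> None:
        """Forget the key, as would happen on page reload or exit."""
        self._keys.pop(session_id, None)


store = SessionKeyStore()
sid = store.open_session("example-key-not-real")  # placeholder, not a real key
print(store.get_key(sid) is not None)  # True
store.close_session(sid)
print(store.get_key(sid))  # None
```

Because the store lives only in process memory, a restart or a closed session leaves nothing behind to leak.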
Post
@lunarflu can you remove me from the Hugging Face Discord community? My old Discord email is gone, and all of my other email is gone too.
My dead accounts:
@Blane187
@Ryouko65777
Also, please delete them if you can.
 
					
	
		
prithivMLmods posted an update 23 days ago
Try the Hugging Face Space demo for Logics-MLLM/Logics-Parsing, the latest multimodal VLM from the Logics Team at Alibaba Group. It enables end-to-end document parsing with precise content extraction in markdown format, and it also generates a clean HTML representation of the document while preserving its logical structure. 🤗🔥
Additionally, I've integrated one of my recent works, prithivMLmods/Gliese-OCR-7B-Post1.0, which also excels at document comprehension.
Space / App : prithivMLmods/VLM-Parsing
Technical Report by the Logics Team, Alibaba Group : Logics-Parsing Technical Report (2509.19760)
MM: VLM-Parsing: prithivMLmods/mm-vlm-parsing-68e33e52bfb9ae60b50602dc
Collections : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
Other Pages:
Multimodal VLMs - July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
VL caption - < Sep 15 '25 : prithivMLmods/vl-caption-sep-15-25-68c7f6d737985c63c13e2391
To know more about it, visit the app page or the respective model page!!
 
					
	
		
nroggendorff posted an update 25 days ago
 
					
	
		
prithivMLmods posted an update 27 days ago
Try Banana Zoom, an advanced image enhancement web app that lets users select regions of an image for AI-powered upscaling and detail refinement. Using Google's model (nano banana), it analyzes selections, generates context-aware enhancements, and produces high-resolution outputs. Simply drag-and-drop or upload images, make precise or fixed-size selections, and watch improvements in real time with smooth zoom and pixel-dissolve effects.
Space / App: prithivMLmods/Banana-Zoom
Collection: https://huggingface.co/collections/prithivMLmods/image-gen-apps-diffusion-lastupdated-09-23-68a2f4c5ef3e5e394eacc20a
GitHub: https://github.com/prithivsakthiur/banana-zoom
Your API key is automatically discarded once you refresh or exit the app, so each user's key is cycled out this way.
 
					
	
		
Reality123b posted an update about 1 month ago
 
					
	
		
nroggendorff posted an update about 1 month ago
 
					
	
		
prithivMLmods posted an update about 1 month ago
Photo-Mate-i2i: a Space for experimenting with Kontext adapters for image manipulation, including Photo-Restore-i2i, PhotoCleanser-i2i, Polaroid-Warm-i2i, Yarn-Photo-i2i, Monochrome-Pencil, and more. Try out the demo, and to learn more, visit the app page or the respective model pages!
Demo: prithivMLmods/Photo-Mate-i2i
How to Use: prithivMLmods/Photo-Mate-i2i#2
i2i-Kontext (Experimental LoRAs): prithivMLmods/i2i-kontext-exp-68ce573b5c0623476b636ec7
 
					
	
		
prithivMLmods posted an update about 1 month ago
Dropping some experimental adapters for FLUX.1-Kontext-dev, including Photo-Restore-i2i, PhotoCleanser-i2i, Polaroid-Warm-i2i, Yarn-Photo-i2i, and Monochrome-Pencil. These were trained under various settings with minimal image pairs to achieve optimal results. The end images of the dataset pairs were synthesized using Gemini-2.5-Flash-Image-Preview and others. 🤗✨
prithivMLmods/PhotoCleanser-i2i: Remove objects while preserving the rest of the image.
prithivMLmods/Photo-Restore-i2i: Restore old photos into moderately colorized, detailed images.
prithivMLmods/Polaroid-Warm-i2i: Seamless vintage Polaroid-style images with warm, faded tones.
prithivMLmods/Yarn-Photo-i2i: Convert images into yarn-stitched artwork while retaining key details.
prithivMLmods/Monochrome-Pencil: Turn images into monochrome pencil sketches while keeping original features.
✨ Note: All the above models share the same auto-labeling multimodal VLM captioning model, prithivMLmods/DeepCaption-VLA-7B, which is used for refining edit instructions and accurately understanding attributions for the generations.
✨ Collection: prithivMLmods/i2i-kontext-exp-68ce573b5c0623476b636ec7
To know more about it, visit the app page or the respective model page!!
 
					
	
		
prithivMLmods posted an update about 1 month ago
Many of 'em pinged me asking to make nano-banana-aio available on hf.co/spaces, so I've transferred the app's tech stack to make it compatible for deployment on Spaces. (Can be accessed with your own Gemini API key.) 🤗
Yes, it is now available on Spaces: prithivMLmods/Nano-Banana-AIO
The Nano Banana AIO (All-in-One) app offers seamless image manipulation features, including single/multiple image adaptation, a canvas for turning free-style drawings into creative image generations, and standard text-to-image generation.
All in One Banana for you!
 
					
	
		
prithivMLmods posted an update about 1 month ago
I'm a Hugging Face Fellow now, guys! 🤗❤️
With the same passion, trust, and momentum to contribute to the community, I'm excited to do some amazing things to wrap up Q3 and Q4 of 2025. And importantly, I've been lucky enough to receive some knowledge and guidance from @merve to build open-source demos and stuff. Thank you for the belief.
Thank you, much love.
Long live open source!
- Prithiv
 
					
	
		
prithivMLmods posted an update about 1 month ago
Introducing Gliese-OCR-7B-Post1.0, a document content-structure retrieval VLM designed for content extraction (OCR) and summarization. This is the third model in the Camel Doc OCR VLM series, following Camel-Doc-OCR-062825. The new version fixes formal table reconstruction issues in both En and Zh, achieving optimal performance for long-context inference. This model also shows significant improvements in LaTeX and Markdown rendering for OCR tasks.
🤗 Gliese-OCR-7B-Post1.0 : prithivMLmods/Gliese-OCR-7B-Post1.0
Gliese-Post1.0 Collection : prithivMLmods/gliese-post10-68c52c4a6ca4935f5259a6d7
⬅️ Previous Versions : prithivMLmods/Camel-Doc-OCR-062825
🧨 Gliese-OCR-7B-Post1.0 (4-bit) Notebook Demo on T4 : prithivMLmods/Gliese-OCR-7B-Post1.0
GitHub [Gliese-OCR-7B-Post1.0(4-bit)-reportlab] : https://tinyurl.com/ys7zuerc
Other Collections:
Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
Multimodal VLMs - July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
To know more about it, visit the app page or the respective model page!!

 
					 
					 
					 
					 
					 
					 
					