AI & ML interests

Computer Vision

Recent Activity

prithivMLmodsΒ 
posted an update 3 days ago
view post
Post
3742
LTX-2 Camera-Control LoRA demo with dolly-in/out and dolly-left/right is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. It also includes dynamic GPU duration adjustments for long video generations. Click the related Space links below.

πŸ€—Try it now on : prithivMLmods/LTX-2-LoRAs-Camera-Control-Dolly
⭐Github: https://github.com/PRITHIVSAKTHIUR/LTX-2-LoRAs-Camera-Control-Dolly
πŸ•ΉοΈCollection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

To learn more, visit the app page or the respective model pages.
  • 2 replies
Β·
NymboΒ 
posted an update 6 days ago
view post
Post
1642
Genuine recommendation: You should really use this AutoHotKey macro. Save the file as macros.ahk and run it. Before sending a prompt to your coding agent, press Ctrl + Alt + 1 and paste your prompt to any regular chatbot. Then send the output to the agent. This is the actual, boring, real way to "10x your prompting". Use the other number keys to avoid repeating yourself over and over again. I use this macro prolly 100-200 times per day. AutoHotKey isn't as new or hype as a lot of other workflows, but there's a reason it's still widely used after 17 years. Don't overcomplicate it.

; Requires AutoHotkey v1.1+

; All macros are `Ctrl + Alt + <variable>`

^!1::
    Send, Please help me more clearly articulate what I mean with this message (write the message in a code block):
return

^!2::
    Send, Please make the following changes:
return

^!3::
    Send, It seems you got cut off by the maximum response limit. Please continue by picking up where you left off.
return


In my experience the past few months, Ctrl + Alt + 1 works best with Instruct models (non-thinking). Reasoning causes some models to ramble and miss the point. I've just been using GPT-5.x for this.
prithivMLmodsΒ 
posted an update 10 days ago
view post
Post
2381
Dropping Image Edit (Object Manipulator): Add or remove specified objects/designs, with flexible support for both single-image and multi-image modes.

πŸ€— Demo: prithivMLmods/Qwen-Image-Edit-Object-Manipulator

Qwen-Image-Edit-2511-Object-Remover is an adapter (LoRA) developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model. It is specifically designed for precise object removal from images.

⭐ Model: prithivMLmods/Qwen-Image-Edit-2511-Object-Remover

Qwen-Image-Edit-2511-Object-Adder is an adapter (LoRA) developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model. It is specifically designed for precise object addition to images.

⭐ Model: prithivMLmods/Qwen-Image-Edit-2511-Object-Adder

πŸ•ΉοΈ Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-object-manipulator
πŸ•ΉοΈ github: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-Object-Manipulator

To learn more, visit the app page or the respective model pages.
prithivMLmodsΒ 
posted an update 17 days ago
view post
Post
4116
Update: TRELLIS.2 (Text to 3D, Image to 3D) Gradio with Rerun Embedded demo with improved visualization of the 3D model previewer is now available on Hugging Face. Generate assets and view them in the 3D viewer, powered and streamlined with Microsoft’s TRELLIS.2 and Tongyi-MAI’s Z-Image-Turbo models.

πŸ€— TRELLIS.2 (Demo): prithivMLmods/TRELLIS.2-Text-to-3D
πŸ•ΉοΈ GitHub: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D-RERUN
πŸ•ΉοΈ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!
prithivMLmodsΒ 
posted an update 18 days ago
view post
Post
4200
Introducing the Qwen-Image-Edit-2511-LoRAs-Fast demo, featuring image property comparison and contrast, built on top of Gradio and the combined Rerun SDK. It supports single and multi-image edits with existing LoRAs that are lazily loaded. (Note: This is still an experimental Space for Qwen-Image-Edit-2511.)

⭐ Space Demo: prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
⭐ GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-2511-LoRAs-Fast-Multi-Image-Rerun
⭐ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

To know more about it, visit the app page or the respective model page!
  • 2 replies
Β·
prithivMLmodsΒ 
posted an update 25 days ago
view post
Post
3703
Introducing demos for new SOTA models from AI2: SAGE-MM (Smart Any-Horizon Agents for Long-Video Reasoning) and Molmo-2, an open vision-language model that supports multi-image (QA and pointing) and video (QA, pointing, and tracking). The respective demo-related collections are listed below. πŸŽƒπŸ”₯

✨ SAGE-MM [Video-Reasoning]: prithivMLmods/SAGE-MM-Video-Reasoning
✨ Molmo2 [Demo]: prithivMLmods/Molmo2-HF-Demo

πŸŽƒ GitHub[SAGE-MM]: https://github.com/PRITHIVSAKTHIUR/SAGE-MM-Video-Reasoning
πŸŽƒ GitHub[Molmo2]: https://github.com/PRITHIVSAKTHIUR/Molmo2-HF-Demo
πŸŽƒ Multimodal Implementations: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!
  • 1 reply
Β·
vansinΒ 
posted an update 25 days ago
view post
Post
263
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management
prithivMLmodsΒ 
posted an update 26 days ago
view post
Post
2065
Introducing TRELLIS.2 Text-to-3D. The demo for the TRELLIS.2-4B (Image-to-3D) model is streamlined with the Z-Image Turbo image generation model to enable Text-to-3D functionality. There is no need for input assets, making a small leap forward for ideation. Optionally, it also includes default support for Image-to-3D inference using direct image assets. Find the demo and related collections below... πŸ€—πŸ”₯

✨ TRELLIS.2-Text-to-3D [Demo]: prithivMLmods/TRELLIS.2-Text-to-3D
✨ Multimodal Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Github: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D

To know more about it, visit the app page or the respective model page!
NymboΒ 
posted an update 26 days ago
view post
Post
2074
🚨 New tool for the Nymbo/Tools MCP server: The new Agent_Skills tool provides full support for Agent Skills (Claude Skills but open-source).

How it works: The tool exposes the standard discover/info/resources/validate actions. Skills live in /Skills under the same File_System root, and any bundled scripts run through Shell_Command, no new infrastructure required.

Agent_Skills(action="discover")  # List all available skills
Agent_Skills(action="info", skill_name="music-downloader")  # Full SKILL.md
Agent_Skills(action="resources", skill_name="music-downloader")  # Scripts, refs, assets


I've included a music-downloader skill as a working demo, it wraps yt-dlp for YouTube/SoundCloud audio extraction.

Caveat: On HF Spaces, Shell_Command works for most tasks, but some operations (like YouTube downloads) are restricted due to the container environment. For full functionality, run the server locally on your machine.

Try it out ~ https://www.nymbo.net/nymbot
prithivMLmodsΒ 
posted an update 28 days ago
view post
Post
2025
Demo for Molmo2 on Hugging Face is live now, including Single/Multi-Image VQA, Visual Pointing/Grounding, Video VQA, and Video Point Tracking. Find the demo and related collections below. πŸ”₯πŸ€—

● Molmo2 HF DemoπŸ–₯️: prithivMLmods/Molmo2-HF-Demo
● Model Collection: https://huggingface.co/collections/allenai/molmo2
● Related Multimodal Space Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!
prithivMLmodsΒ 
posted an update 29 days ago
view post
Post
5551
Introducing the Z Image Turbo LoRA DLC App, a gallery space for plug-and-play Z-Image-Turbo LoRAs. It features a curated collection of impressive LoRAs for generating high-quality images. By default, it runs on the base model. Simply choose a LoRA, type your prompt, and generate images. You can find the app and more details below. πŸ€—πŸ§ͺ

● Space [Demo]: prithivMLmods/Z-Image-Turbo-LoRA-DLC
● Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
● Check the list of Z-Image LoRA's: https://huggingface.co/models?other=base_model:adapter:Tongyi-MAI/Z-Image-Turbo
● Github: https://github.com/PRITHIVSAKTHIUR/Z-Image-Turbo-LoRA-DLC

Other related image gen spaces:-

● FLUX-LoRA-DLC2: prithivMLmods/FLUX-LoRA-DLC2
● FLUX-LoRA-DLC: prithivMLmods/FLUX-LoRA-DLC
● Qwen-Image-LoRA-DLC: prithivMLmods/Qwen-Image-LoRA-DLC
● Qwen-Image-Edit-2509-LoRAs-Fast: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast
● Qwen-Image-Edit-2509-LoRAs-Fast-Fusion: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast-Fusion

& more...

To know more about it, visit the app page or the respective model page!
  • 2 replies
Β·