🤗 Sentence Transformers is joining Hugging Face! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face! Details:
Today, the Ubiquitous Knowledge Processing (UKP) Lab is transferring the project to Hugging Face. Sentence Transformers will remain a community-driven, open-source project, with the same open-source license (Apache 2.0) as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged. The project will continue to prioritize transparency, collaboration, and broad accessibility.
We see a growing desire from companies to move from large LLM APIs to local models for better control and privacy, reflected in the library's growth: in the last 30 days alone, Sentence Transformer models were downloaded more than 270 million times, second only to transformers.
I would like to thank the UKP Lab, and especially Nils Reimers and Iryna Gurevych, both for their dedication to the project and for their trust in me, both now and two years ago. Back then, neither of you knew me well, yet you trusted me to take the project to new heights. That choice ended up being very valuable for the embedding & Information Retrieval community, and I think the choice to grant Hugging Face stewardship will be similarly successful.
I'm very excited about the future of the project, and for the world of embeddings and retrieval at large!
deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane that it can parse and re-render charts in HTML
> it concatenates CLIP and SAM features, so better grounding
> very efficient vision-tokens-to-performance ratio
> covers ~100 languages
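The "CLIP and SAM features concatenated" point boils down to channel-wise fusion of two encoders' per-token features. A minimal sketch with dummy NumPy arrays (the token count and feature dimensions here are illustrative placeholders, not DeepSeek-OCR's actual sizes):

```python
import numpy as np

# Dummy per-token features standing in for the two vision encoders.
# Shapes are assumptions for illustration only.
clip_feats = np.random.rand(256, 1024).astype(np.float32)  # (tokens, clip_dim)
sam_feats = np.random.rand(256, 256).astype(np.float32)    # (tokens, sam_dim)

# Fuse by concatenating along the feature axis: each token now carries
# both semantic (CLIP-style) and spatial/grounding (SAM-style) information.
fused = np.concatenate([clip_feats, sam_feats], axis=-1)   # (tokens, clip_dim + sam_dim)
```

The appeal of this kind of fusion is that the downstream decoder sees one token stream whose features mix semantics and localization, which is the claimed source of the better grounding.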
We just released [BiRefNet_HR](ZhengPeng7/BiRefNet_HR) for general use on higher-resolution images, trained on 2048x2048 images. If your images are mostly larger than 1024x1024, use BiRefNet_HR for better results! Thanks to @Freepik for kindly providing H200s for this large training run.
HF Model: ZhengPeng7/BiRefNet_HR. HF Demo: ZhengPeng7/BiRefNet_demo, where you need to choose General-HR and set a high resolution. PyTorch weights & ONNX: available on Google Drive and in the GitHub release.
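To take advantage of the HR checkpoint, inputs should be fed at the resolution it was trained on. A minimal preprocessing sketch, assuming the common resize-to-square plus ImageNet normalization recipe (check the model card for the exact pipeline):

```python
from PIL import Image
import numpy as np

def preprocess(img: Image.Image, size: int = 2048) -> np.ndarray:
    """Resize to the assumed HR training resolution and normalize.

    Uses standard ImageNet mean/std — an assumption, not confirmed
    from the BiRefNet_HR model card.
    """
    img = img.convert("RGB").resize((size, size), Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    arr = (arr - mean) / std
    # HWC -> 1xCxHxW, the layout segmentation models typically expect.
    return arr.transpose(2, 0, 1)[None]

batch = preprocess(Image.new("RGB", (1200, 800)))
```

The resulting `(1, 3, 2048, 2048)` tensor can then be passed to whichever runtime you load the weights into (PyTorch or ONNX).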
Here is a comparison between the results of the original model and the new HR one on HR inputs:
And here is the performance of the new HR model versus the previous one trained at 1024x1024, on the validation set: