---
tags:
- vision
- dinov2
- hematology
- cytomorphology
- foundation-model
license: apache-2.0
citation: |
  @inproceedings{koch2024dinobloom,
    title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
    author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
    booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
    pages={520--530},
    year={2024},
    organization={Springer}
  }
---
# DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology
**DinoBloom** builds upon [DINOv2](https://arxiv.org/abs/2304.07193) (Meta AI) and is trained on **13 diverse publicly available datasets** of single cells from peripheral blood and bone marrow.
📄 [Paper](https://arxiv.org/abs/2404.05022) • 💻 [GitHub](https://github.com/MarrLab/DinoBloom) • 📦 Zenodo
---
## 🧠 Model Variants
DinoBloom is available in **four sizes**:
| Model | Feature Dim | Parameters | Checkpoint |
|-------|-------------|------------|------------|
| **DinoBloom-S** | 384 | 22M | `pytorch_model_s.bin` |
| **DinoBloom-B** | 768 | 86M | `pytorch_model_b.bin` |
| **DinoBloom-L** | 1024 | 304M | `pytorch_model_l.bin` |
| **DinoBloom-G** | 1536 | 1136M | `pytorch_model_g.bin` |
---
## 🚀 Usage
```python
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
from PIL import Image
from torchvision import transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Choose variant: "s", "b", "l", or "g"
variant = "b"

# Map each variant to its DINOv2 backbone and feature dimension
variant_config = {
    "s": ("dinov2_vits14", 384),
    "b": ("dinov2_vitb14", 768),
    "l": ("dinov2_vitl14", 1024),
    "g": ("dinov2_vitg14", 1536),
}
dinov2_model, embed_dim = variant_config[variant]

# Load the base DINOv2 architecture
model = torch.hub.load("facebookresearch/dinov2", dinov2_model)

# Download the DinoBloom weights
ckpt_path = hf_hub_download(
    repo_id="MarrLab/DinoBloom",
    filename=f"pytorch_model_{variant}.bin",
)
ckpt = torch.load(ckpt_path, map_location="cpu")

# DinoBloom was trained at 224x224 with patch size 14, so the positional
# embedding has (224 // 14)**2 + 1 = 257 tokens; resize it before loading
num_tokens = (224 // 14) ** 2 + 1
model.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, embed_dim))
model.load_state_dict(ckpt, strict=True)
model.to(device)
model.eval()

# Preprocessing: resize to 224x224, normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load a single-cell image; convert to RGB in case of grayscale or RGBA input
img = Image.open("path/to/cell_image").convert("RGB")
img_tensor = transform(img).unsqueeze(0).to(device)

# Extract the embedding
with torch.no_grad():
    features = model(img_tensor)
print(f"Features shape: {features.shape}")  # [1, 768] for DinoBloom-B
```
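For larger numbers of images, it is more efficient to extract embeddings in batches. The snippet below is a minimal sketch rather than official repo code: the `CellImageDataset` class, the file paths, and the batch size are illustrative assumptions, and it reuses `model`, `transform`, and `device` from the example above.
```python
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset

class CellImageDataset(Dataset):
    """Minimal dataset over a list of image file paths (hypothetical)."""
    def __init__(self, paths, transform):
        self.paths = paths
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Convert to RGB so grayscale or RGBA scans also yield 3 channels
        img = Image.open(self.paths[idx]).convert("RGB")
        return self.transform(img)

paths = ["cell_001.png", "cell_002.png"]  # replace with your own image files
loader = DataLoader(CellImageDataset(paths, transform), batch_size=64)

embeddings = []
with torch.no_grad():
    for batch in loader:
        embeddings.append(model(batch.to(device)).cpu())
embeddings = torch.cat(embeddings)  # [num_images, embed_dim]
```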
---
## 📊 Model Performance
DinoBloom outperforms existing medical and non-medical vision models in:
1. **Linear probing** and **k-nearest neighbor** evaluations for cell-type classification
2. **Weakly supervised multiple-instance learning (MIL)** for acute myeloid leukemia subtyping
See our [paper](https://arxiv.org/abs/2404.05022) for detailed benchmarks.
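For reference, the first protocol amounts to fitting a linear classifier and a k-nearest-neighbor classifier on frozen embeddings. The sketch below is not the paper's evaluation code: it assumes scikit-learn is installed (it is not in the requirements below), reuses `embeddings` from the batch-extraction sketch above, and presumes a matching array of cell-type `labels`; the choice of k = 20 is likewise illustrative.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X = embeddings.numpy()   # frozen DinoBloom embeddings, [num_images, embed_dim]
y = np.asarray(labels)   # integer cell-type labels, one per image
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Linear probing: a logistic-regression head on top of frozen features
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Linear probe accuracy:", probe.score(X_test, y_test))

# k-nearest-neighbor classification directly in embedding space
knn = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)
print("20-NN accuracy:", knn.score(X_test, y_test))
```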
---
## 🔧 Requirements
```bash
pip install torch torchvision huggingface_hub
```
---
## 📚 Citation
If you use DinoBloom in your research, please cite:
```bibtex
@inproceedings{koch2024dinobloom,
title={DinoBloom: a foundation model for generalizable cell embeddings in hematology},
author={Koch, Valentin and Wagner, Sophia J and Kazeminia, Salome and Sancar, Ece and Hehr, Matthias and Schnabel, Julia A and Peng, Tingying and Marr, Carsten},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages={520--530},
year={2024},
organization={Springer}
}
```
---
## 📖 Related Work
DinoBloom builds upon:
- [DINOv2](https://arxiv.org/abs/2304.07193) - Self-supervised vision transformers
- [Original DinoBloom Paper](https://arxiv.org/abs/2404.05022) - MICCAI 2024
---
## 📄 License
Apache 2.0 - see the [LICENSE](LICENSE) file for details.
---
For questions or issues, please open an issue on [GitHub](https://github.com/MarrLab/DinoBloom) or contact the authors.