---
license: apache-2.0
tags:
- vision
- image-segmentation
- hierarchical
- encoder
- sam2
- meta
- pytorch
- huggingface-compatible
---

# Hiera Encoder from Meta's SAM2.1 (Segment Anything Model)

Meta's [SAM2 (Segment Anything Model v2)](https://github.com/facebookresearch/sam2) demonstrates state-of-the-art video segmentation capabilities. A core component enabling this is the **Hiera** module, which has learned strong hierarchical visual features through supervised training on object segmentation.

While Meta has released the full SAM2 models and their weights, these releases are plain **PyTorch** code and are **not integrated with Hugging Face Transformers** or with common training tooling such as `Trainer` and `DeepSpeed`.

This repository extracts the **Hiera** module from SAM2.1 and **wraps it for Hugging Face compatibility**, including integration with `PretrainedConfig` and `PreTrainedModel`, so that it can be used directly in Hugging Face-style training and inference workflows.

---

## Model Details

- **Original Model**: [facebook/sam2.1-hiera-base-plus](https://huggingface.co/facebook/sam2.1-hiera-base-plus)
- **This Model**: [`nkkbr/hiera-base-plus-in-sam2.1`](https://huggingface.co/nkkbr/hiera-base-plus-in-sam2.1)

This model exposes only the **Hiera encoder** extracted from SAM2.1, wrapped for Hugging Face usage.

---

## Installation

You first need to install Meta's original SAM2 code:

```bash
git clone https://github.com/facebookresearch/sam2.git && cd sam2
pip install -e .
```

---

## Usage

```python
from hiera_encoder import HieraVisionModel

# Load the Hugging Face wrapper around the Hiera module
model = HieraVisionModel.from_pretrained("nkkbr/hiera-base-plus-in-sam2.1")

# Get the raw Hiera model
model = model.hiera

# Print each parameter's name and shape
for name, param in model.named_parameters():
    print(f"{name:50} {param.shape}")
```

---

## Weight Consistency Check

To verify that the weights are identical to those in Meta's original SAM2.1 Hiera module, run the following after the Usage snippet (it reuses `model` from that snippet):

```python
import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Load SAM2.1 predictor from Meta's official release
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2.1-hiera-base-plus")
hiera_model_in_predictor = predictor.model.image_encoder.trunk

# Build the reference state dict once instead of once per parameter
reference_state = hiera_model_in_predictor.state_dict()

# Compare on CPU so the check does not depend on where each model was loaded
for name, param in model.named_parameters():
    if not torch.equal(param.detach().cpu(), reference_state[name].cpu()):
        print(f"The parameter {name} has different weights in the two models.")

print("Comparison complete!")
```

---

## License

Please refer to the [SAM2 repository](https://github.com/facebookresearch/sam2) for license and usage terms.
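## Stricter Weight Check (Optional)

The loop in the Weight Consistency Check section raises a `KeyError` if a parameter name is missing from the reference model, and it never reports parameters that exist only on the reference side. The sketch below is one possible way to report all three cases by name. `diff_state_dicts` and its `equal` argument are hypothetical names introduced here for illustration; they are not part of this repository or of SAM2. The helper works on plain mappings, so it can be tried without loading either model.

```python
from typing import Any, Callable, Dict, List, Mapping


def diff_state_dicts(
    ours: Mapping[str, Any],
    reference: Mapping[str, Any],
    equal: Callable[[Any, Any], bool],
) -> Dict[str, List[str]]:
    """Compare two name -> value mappings (hypothetical helper, not part of SAM2)."""
    # Names present on only one side would otherwise raise a KeyError or go unnoticed.
    only_in_ours = sorted(name for name in ours if name not in reference)
    only_in_reference = sorted(name for name in reference if name not in ours)
    # Only names present on both sides can be compared value by value.
    mismatched = sorted(
        name
        for name in ours
        if name in reference and not equal(ours[name], reference[name])
    )
    return {
        "only_in_ours": only_in_ours,
        "only_in_reference": only_in_reference,
        "mismatched": mismatched,
    }


# Toy demonstration with plain lists standing in for tensors.
report = diff_state_dicts(
    {"a.weight": [1, 2], "b.bias": [0]},
    {"a.weight": [1, 3], "c.bias": [0]},
    equal=lambda x, y: x == y,
)
print(report)
```

With the objects from the sections above, the call would presumably be `diff_state_dicts(model.state_dict(), hiera_model_in_predictor.state_dict(), equal=lambda a, b: torch.equal(a.cpu(), b.cpu()))`. Three empty lists would indicate that the two modules have the same entries with the same values.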