ONNX Quantized versions of ibm-granite/granite-embedding-reranker-english-r2

This repository contains an ONNX export and several quantized versions of ibm-granite/granite-embedding-reranker-english-r2.

Usage

from sentence_transformers import CrossEncoder

# Load the INT8-quantized model (ARM64 build shown as an example)
model = CrossEncoder(
    "jrc2139/granite-embedding-reranker-english-r2-onnx",
    backend="onnx",
    model_kwargs={"file_name": "onnx/model_qint8_arm64.onnx"},
    trust_remote_code=True
)

# Score each (query, document) pair; one relevance score is returned per pair
scores = model.predict([("Query", "Document")])
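
Since `predict` returns one score per (query, document) pair, reranking a candidate list comes down to sorting the documents by score. The following is a minimal sketch of that step only. The helper name `rank_documents` and the example scores are hypothetical placeholders, not part of this repository or of sentence-transformers, and the numbers are not real model outputs.

```python
from typing import List, Optional, Sequence, Tuple


def rank_documents(
    documents: Sequence[str],
    scores: Sequence[float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Pair each document with its score and sort from highest to lowest.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    # float() also accepts NumPy scalars, which is what predict() typically yields
    pairs = [(doc, float(score)) for doc, score in zip(documents, scores)]
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Placeholder data. In real use the scores would come from something like:
#   scores = model.predict([(query, doc) for doc in docs])
docs = ["Paris is in France.", "Bananas are yellow.", "France borders Spain."]
example_scores = [0.87, 0.12, 0.45]  # made-up values, not model output

for doc, score in rank_documents(docs, example_scores, top_k=2):
    print(f"{score:.2f}  {doc}")
```

Keeping the sort separate from the model call makes it easy to swap between the quantized files without touching the ranking logic.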