CrossEncoder based on mixedbread-ai/mxbai-rerank-large-v2

This is a Cross Encoder model fine-tuned from mixedbread-ai/mxbai-rerank-large-v2 using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Type: Cross Encoder
Base model: mixedbread-ai/mxbai-rerank-large-v2
Maximum Sequence Length: 32768 tokens
Number of Output Labels: 1 label

Model Sources

Documentation: Sentence Transformers Documentation
Documentation: Cross Encoder Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Cross Encoders on Hugging Face

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross-encoder-testing/mxbai-rerank-large-v2-v6")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
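
The rank output shown above refers to documents only by their index (corpus_id). The following is a minimal sketch of mapping those results back to the original texts. It does not load the model: attach_documents is a hypothetical helper, and the scores in example_results are made up for illustration, not real outputs of this model.

```python
from typing import Dict, List, Optional, Tuple, Union

RankResult = Dict[str, Union[int, float]]


def attach_documents(
    results: List[RankResult],
    documents: List[str],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Pair each rank result with its source text, highest score first."""
    # Sort defensively so the helper also works on unsorted input.
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    if top_k is not None:
        ordered = ordered[:top_k]
    return [(documents[int(r["corpus_id"])], float(r["score"])) for r in ordered]


documents = [
    "There are on average between 55 and 80 calories in an egg depending on its size.",
    "Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.",
    "Most of the calories in an egg come from the yellow yolk in the center.",
]

# Hypothetical scores for illustration only; real values come from model.rank(...).
example_results = [
    {"corpus_id": 0, "score": 0.9},
    {"corpus_id": 2, "score": 0.4},
    {"corpus_id": 1, "score": 0.1},
]

top_two = attach_documents(example_results, documents, top_k=2)
for text, score in top_two:
    print(f"{score:.2f}  {text}")
```

In recent Sentence Transformers releases, rank also accepts top_k and return_documents arguments that cover the same need directly; check the Cross Encoder documentation for the version you have installed.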

Training Details

Framework Versions

Python: 3.11.6
Sentence Transformers: 5.3.0.dev0
Transformers: 4.57.3
PyTorch: 2.9.1+cu126
Accelerate: 1.6.0
Datasets: 4.2.0
Tokenizers: 0.22.1
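
To compare a local environment against the versions listed above, a sketch along the following lines may help. It uses only the standard library. The PyPI distribution names in CARD_VERSIONS (for example "torch" for PyTorch) and the compare_versions helper are assumptions introduced here, not part of the original card, and an exact match is not necessarily required to run the model.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions copied from the list above; the distribution names are assumptions.
CARD_VERSIONS = {
    "sentence-transformers": "5.3.0.dev0",
    "transformers": "4.57.3",
    "torch": "2.9.1+cu126",
    "accelerate": "1.6.0",
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def compare_versions(
    expected: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Return (card version, installed version or None, exact match) per package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


report = compare_versions(CARD_VERSIONS)
for package, (wanted, installed, matches) in report.items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed or 'not installed'} [{status}]")
```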

Citation

BibTeX

Model size (Safetensors): 2B params
Tensor type: BF16
