# CRAG-ModernBert
CRAG-ModernBert is a fine-tuned version of answerdotai/ModernBERT-base trained for CRAG grading: a binary sequence classification task that determines whether a given document is relevant or not relevant to a particular query or context. It achieves the following results on the evaluation set:
- Loss: 0.2827
- Accuracy: 0.8721
- F1: 0.8723
- Precision: 0.8728
- Recall: 0.8721
- Mcc: 0.7421
## Model description
This model leverages the ModernBERT architecture to classify documents by relevance, using data from the skshmjn/CRAG-EVAL dataset. It is trained to perform well on information retrieval and filtering tasks, particularly in applications that require automated document triage or validation.
## Intended uses & limitations
- CRAG Evaluation Pipelines: automated assessment of whether retrieved documents match expected query criteria.
- Information Retrieval Re-ranking: as a re-ranker to filter relevant documents out of the top-N retrieved candidates (a minimal re-ranking sketch follows this list).
- Enterprise Search and Document Classification: classifying internal documentation, compliance papers, or customer feedback for relevance.
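The re-ranking use case can be sketched roughly as follows. This is an illustrative snippet, not part of the released model: the `filter_relevant` helper, the 0.5 threshold, and the assumption that label index 1 corresponds to the "relevant" class are choices made for the example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned classifier
model_name = "skshmjn/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def filter_relevant(query, documents, threshold=0.5):
    """Keep only the candidate documents the classifier scores as relevant to the query."""
    # Encode each (query, document) pair as a sentence pair in one batch
    inputs = tokenizer([query] * len(documents), documents,
                       return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    # probs[:, 1] is the probability of the "relevant" class (assumed label mapping)
    return [doc for doc, p in zip(documents, probs[:, 1].tolist()) if p >= threshold]
```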
## Training and evaluation data
The model was trained and evaluated using the CRAG-EVAL dataset, which contains documents labeled for relevance classification.
- Training Split: 80% of the dataset
- Validation Split: 10% of the dataset
- Test Split: 10% of the dataset
The dataset contains pairs of queries (or questions) and documents (or contexts), each labeled as relevant (1) or not relevant (0). The model learns to classify relevance from these sentence pairs.
To better understand the distribution of the similarity_score across relevance classes, a box plot was created comparing relevant and not relevant documents. This visualization helps illustrate the score patterns and outliers for each class, providing further insight into the data and model performance.
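For reference, a split in the proportions listed above could be reproduced with the datasets library along the lines of the sketch below. The split name, column layout, seed, and two-step splitting procedure are assumptions for illustration, not the exact procedure used for training.

```python
from datasets import load_dataset

# Load the relevance-labeled query/document pairs (assumes a single "train" split)
raw = load_dataset("skshmjn/CRAG-EVAL", split="train")

# First carve out 20% for evaluation, then split that portion half-and-half
# into validation and test (10% each of the original data).
train_rest = raw.train_test_split(test_size=0.2, seed=42)
val_test = train_rest["test"].train_test_split(test_size=0.5, seed=42)

train_ds = train_rest["train"]
val_ds = val_test["train"]
test_ds = val_test["test"]
```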
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
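These settings map onto the standard transformers Trainer configuration roughly as shown below. The output directory is a placeholder, and any argument not listed above is assumed to keep its default value.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="crag-modernbert",      # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch",               # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```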
## Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Mcc |
|---|---|---|---|---|---|---|---|---|
| 0.2676 | 1.0 | 3294 | 0.2773 | 0.8758 | 0.8759 | 0.8762 | 0.8758 | 0.7492 |
## Framework versions
- Transformers 4.51.0.dev0
- Pytorch 2.4.1+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
## 🧾 Example Usage (Query + Document)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and tokenizer
model_name = "skshmjn/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example input: query and candidate document
query = "What are the effects of climate change on urban infrastructure?"
document = "This report outlines the critical impacts of climate change on water resources in urban areas."

# Tokenize as a sentence pair (query, document)
inputs = tokenizer(query, document, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = torch.argmax(logits, dim=1).item()
print("Relevant" if predicted_class == 1 else "Not Relevant")
```