CRAG-ModernBert

CRAG-ModernBert is a fine-tuned version of answerdotai/ModernBERT-base trained for CRAG grading: determining whether a given document is relevant or not relevant to a particular query or context. This is a binary sequence-classification task. The model achieves the following results on the evaluation set:

  • Loss: 0.2827
  • Accuracy: 0.8721
  • F1: 0.8723
  • Precision: 0.8728
  • Recall: 0.8721
  • MCC: 0.7421
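
For reference, a metric function along these lines reproduces the numbers above with scikit-learn. The choice of average="weighted" is an assumption (consistent with recall matching accuracy here), not a confirmed detail of the training script.

# Hedged sketch: computing the reported metrics with scikit-learn.
# average="weighted" is an assumption, not confirmed by this card.
from sklearn.metrics import (
    accuracy_score,
    matthews_corrcoef,
    precision_recall_fscore_support,
)

def compute_metrics(labels, preds):
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
        "mcc": matthews_corrcoef(labels, preds),
    }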

Model description

This model leverages the ModernBERT architecture to classify documents by relevance, using data from the skshmjn/CRAG-EVAL dataset. It is trained to perform well on information retrieval and filtering tasks, particularly in applications that require automated document triage or validation.

Intended uses & limitations

  • CRAG Evaluation Pipelines: automated assessment of whether retrieved documents match expected query criteria.
  • Information Retrieval Re-ranking: as a re-ranker model to filter relevant documents from the top-N retrieved candidates (see the sketch after this list).
  • Enterprise Search and Document Classification: classifying internal documentation, compliance papers, or customer feedback for relevance.
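
As a concrete illustration of the re-ranking use case, the sketch below scores one query against several candidate documents and sorts them by the model's relevance probability. Only the repo id comes from this card; the query and candidate texts are made up.

# Hedged sketch: re-ranking top-N candidates by P(relevant).
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "skshmjn/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "What are the effects of climate change on urban infrastructure?"
candidates = [
    "Heat waves increasingly stress bridges, roads, and power grids in cities.",
    "A history of 19th-century railway expansion in Europe.",
    "This report outlines climate change impacts on urban water systems.",
]

# Batch-tokenize the query paired with every candidate document.
inputs = tokenizer(
    [query] * len(candidates), candidates,
    return_tensors="pt", truncation=True, padding=True,
)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[:, 1]  # P(relevant)

# Highest-probability documents first.
for p, doc in sorted(zip(probs.tolist(), candidates), reverse=True):
    print(f"{p:.3f}  {doc}")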

Training and evaluation data

[Figure: box plot of similarity_score for relevant vs. not relevant documents, described below]

The model was trained and evaluated using the CRAG-EVAL dataset, which contains documents labeled for relevance classification.

  • Training Split: 80% of the dataset
  • Validation Split: 10% of the dataset
  • Test Split: 10% of the dataset

The dataset includes pairs of queries (or questions) and documents (or contexts), each labeled as either relevant (1) or not relevant (0). The model learns to classify relevance from these sentence pairs. To better understand the distribution of similarity_score across relevance classes, a box plot was created comparing relevant and not relevant documents. This visualization illustrates the score patterns and outliers for each class, providing further insight into the data and model performance.
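
For orientation, the split described above can be reproduced along these lines with the datasets library. The single train split, the seed, and the split mechanics are assumptions; check the details against the skshmjn/CRAG-EVAL dataset card.

# Hedged sketch: loading CRAG-EVAL and carving out an 80/10/10 split.
# Split names and seed are illustrative, not confirmed by this card.
from datasets import load_dataset

dataset = load_dataset("skshmjn/CRAG-EVAL", split="train")

# Hold out 20%, then split the holdout evenly into validation and test.
split = dataset.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)

train_ds = split["train"]    # 80%
val_ds = holdout["train"]    # 10%
test_ds = holdout["test"]    # 10%
print(len(train_ds), len(val_ds), len(test_ds))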

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 1
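
A minimal Trainer configuration mirroring these values might look like the sketch below; the output path is illustrative, and anything not listed above (warmup, weight decay) is left at library defaults as an assumption.

# Hedged sketch: TrainingArguments mirroring the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="crag-modernbert",  # illustrative path, not from the card
    learning_rate=2e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch",           # AdamW, betas=(0.9, 0.999), eps=1e-8 (defaults)
    lr_scheduler_type="linear",
    num_train_epochs=1,
)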

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy  F1      Precision  Recall  MCC
0.2676         1.0    3294  0.2773           0.8758    0.8759  0.8762     0.8758  0.7492

Framework versions

  • Transformers 4.51.0.dev0
  • Pytorch 2.4.1+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1

🧾 Example Usage (Query + Document)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "your-username/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for deterministic inference

# Example input: context/query and document
query = "What are the effects of climate change on urban infrastructure?"
document = "This report outlines the critical impacts of climate change on water resources in urban areas."

# Tokenize as sentence pair
inputs = tokenizer(query, document, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
    predicted_class = torch.argmax(logits, dim=1).item()

print("Relevant" if predicted_class == 1 else "Not Relevant")
Model details

  • Model size: 0.1B parameters (Safetensors)
  • Tensor type: BF16