# CRAG-ModernBert
CRAG-ModernBert is a fine-tuned version of answerdotai/ModernBERT-base trained for CRAG grading: a binary sequence classification task that determines whether a given document is relevant or not relevant to a particular query or context. It achieves the following results on the evaluation set:
- Loss: 0.2827
- Accuracy: 0.8721
- F1: 0.8723
- Precision: 0.8728
- Recall: 0.8721
- Mcc: 0.7421
## Model description
This model leverages the ModernBERT architecture to classify documents by relevance, using data from the skshmjn/CRAG-EVAL dataset. It is trained to perform well on information retrieval and filtering tasks, particularly in applications that require automated document triage or validation.
## Intended uses & limitations
- CRAG Evaluation Pipelines: automated assessment of whether retrieved documents match expected query criteria.
- Information Retrieval Re-ranking: as a re-ranker to filter relevant documents out of the top-N retrieved candidates (a minimal re-ranking sketch follows this list).
- Enterprise Search and Document Classification: classifying internal documentation, compliance papers, or customer feedback for relevance.
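The re-ranking use case can be sketched roughly as follows. This is an illustrative snippet, not part of the released model: the `filter_relevant` helper, the 0.5 threshold, and the assumption that label index 1 corresponds to the "relevant" class are choices made for the example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned classifier
model_name = "skshmjn/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def filter_relevant(query, documents, threshold=0.5):
    """Keep only the candidate documents the classifier scores as relevant to the query."""
    # Encode each (query, document) pair as a sentence pair in one batch
    inputs = tokenizer([query] * len(documents), documents,
                       return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    # probs[:, 1] is the probability of the "relevant" class (assumed label mapping)
    return [doc for doc, p in zip(documents, probs[:, 1].tolist()) if p >= threshold]
```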
## Training and evaluation data
The model was trained and evaluated using the CRAG-EVAL dataset, which contains documents labeled for relevance classification.
- Training Split: 80% of the dataset
- Validation Split: 10% of the dataset
- Test Split: 10% of the dataset
The dataset contains pairs of queries (or questions) and documents (or contexts), each labeled as relevant (1) or not relevant (0). The model learns to classify relevance from these sentence pairs.
To better understand the distribution of the similarity_score across relevance classes, a box plot was created comparing relevant and not relevant documents. This visualization helps illustrate the score patterns and outliers for each class, providing further insight into the data and model performance.
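For reference, a split in the proportions listed above could be reproduced with the datasets library along the lines of the sketch below. The split name, column layout, seed, and two-step splitting procedure are assumptions for illustration, not the exact procedure used for training.

```python
from datasets import load_dataset

# Load the relevance-labeled query/document pairs (assumes a single "train" split)
raw = load_dataset("skshmjn/CRAG-EVAL", split="train")

# First carve out 20% for evaluation, then split that portion half-and-half
# into validation and test (10% each of the original data).
train_rest = raw.train_test_split(test_size=0.2, seed=42)
val_test = train_rest["test"].train_test_split(test_size=0.5, seed=42)

train_ds = train_rest["train"]
val_ds = val_test["train"]
test_ds = val_test["test"]
```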
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
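These settings map onto the standard transformers Trainer configuration roughly as shown below. The output directory is a placeholder, and any argument not listed above is assumed to keep its default value.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="crag-modernbert",      # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch",               # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```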
## Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Mcc |
|---|---|---|---|---|---|---|---|---|
| 0.2676 | 1.0 | 3294 | 0.2773 | 0.8758 | 0.8759 | 0.8762 | 0.8758 | 0.7492 |
## Framework versions
- Transformers 4.51.0.dev0
- Pytorch 2.4.1+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
## 🧾 Example Usage (Query + Document)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and tokenizer
model_name = "skshmjn/CRAG-ModernBert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example input: query and candidate document
query = "What are the effects of climate change on urban infrastructure?"
document = "This report outlines the critical impacts of climate change on water resources in urban areas."

# Tokenize as a sentence pair (query, document)
inputs = tokenizer(query, document, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = torch.argmax(logits, dim=1).item()
print("Relevant" if predicted_class == 1 else "Not Relevant")
```