dleemiller

metadata

language:
  - en
tags:
  - sentence-transformers
  - cross-encoder
  - reranker
  - generated_from_trainer
  - dataset_size:384838
  - loss:PrecomputedDistillationLoss
base_model: dleemiller/finecat-nli-m
datasets:
  - dleemiller/CrossingGuard-NLI
pipeline_tag: text-classification
library_name: sentence-transformers
metrics:
  - f1_macro
  - f1_micro
  - f1_weighted
model-index:
  - name: CrossEncoder based on dleemiller/finecat-nli-m
    results:
      - task:
          type: cross-encoder-classification
          name: Cross Encoder Classification
        dataset:
          name: CrossingGuard dev
          type: CrossingGuard-dev
        metrics:
          - type: f1_macro
            value: 0.9126931790272965
            name: F1 Macro
          - type: f1_micro
            value: 0.9138270909602929
            name: F1 Micro
          - type: f1_weighted
            value: 0.91377816752541
            name: F1 Weighted
      - task:
          type: cross-encoder-classification
          name: Cross Encoder Classification
        dataset:
          name: CrossingGuard test
          type: CrossingGuard-test
        metrics:
          - type: f1_macro
            value: 0.913463859717691
            name: F1 Macro
          - type: f1_micro
            value: 0.9145821752825644
            name: F1 Micro
          - type: f1_weighted
            value: 0.9142089995597146
            name: F1 Weighted

CrossingGuard Medium

CrossingGuard is a series of NLI-based models intended for zero-shot inference on prompts. In this series, I focus on use cases such as guardrails, content moderation, prompt or intent classification, and prompt routing. Content moderation is often reactive: new conditions come up that general-purpose pretrained models do not cover. Because these models are zero-shot, a custom guardrail condition can be written as a hypothesis without retraining.

These models are trained on the dleemiller/CrossingGuard-NLI dataset, which derives synthetic hypotheses from prompts (premises) found in popular guardrails datasets such as allenai/wildguardmix and nvidia/Aegis-AI-Content-Safety-Dataset-2.0. Each hypothesis makes a specific, targeted claim about its premise. I have retained the 3-way classifier (entailment, neutral, contradiction), which adds flexibility for tasks where either non-neutral label is relevant.

For models below the large size, I distill with an MSE loss on logits from dleemiller/crossingguard-nli-l and average it with the cross-entropy loss. Overtraining can hurt FineCat performance, so I fine-tune for only 1 epoch.

$$\mathcal{L} = \alpha \cdot \mathcal{L}_{\text{CE}}(z^{(s)}, y) + \beta \cdot \mathcal{L}_{\text{MSE}}(z^{(s)}, z^{(t)})$$

where $z^{(s)}$ and $z^{(t)}$ are the student and teacher logits, $y$ is the ground-truth label, and the two terms are weighted equally with $\alpha = \beta = 0.5$.
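As a concrete reading of the formula, the per-example loss can be sketched in plain Python. This is an illustration under assumptions, not the training code: the function names are hypothetical, the logits are made up, and the MSE is averaged over the three class logits, which may differ from the reduction used by the `PrecomputedDistillationLoss` listed in the metadata.

```python
import math

def cross_entropy(student_logits, label):
    """Cross entropy for one example: negative log-softmax at the true class."""
    m = max(student_logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in student_logits))
    return log_norm - student_logits[label]

def mse(student_logits, teacher_logits):
    """Squared error between student and teacher logits, averaged over classes
    (the averaging is an assumption of this sketch)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
    return sum(diffs) / len(diffs)

def distillation_loss(student_logits, teacher_logits, label, alpha=0.5, beta=0.5):
    """Weighted sum of the CE and MSE terms, with alpha = beta = 0.5 by default."""
    ce_term = cross_entropy(student_logits, label)
    mse_term = mse(student_logits, teacher_logits)
    return alpha * ce_term + beta * mse_term

# Made-up logits for a single (premise, hypothesis) pair whose true label is 0.
student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.0, -2.0]
print(f"loss = {distillation_loss(student, teacher, label=0):.4f}")
```

When the student matches the teacher exactly, the MSE term vanishes and only half of the cross-entropy term remains.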

Evaluation Results

F1-micro scores (equivalent to accuracy) on each dataset. Throughput and peak GPU memory were measured at a batch size of 64 on an NVIDIA Blackwell PRO 6000 Max-Q.

| Model | finecat (F1-micro) | crossingguard (F1-micro) | Throughput (samples/s) | Peak GPU Mem (MB) |
|---|---|---|---|---|
| `dleemiller/crossingguard-nli-l` | 0.8094 | 0.9200 | 361.15 | 3023.65 |
| `dleemiller/crossingguard-nli-m` | 0.7779 | 0.9146 | 868.65 | 2170.71 |
| `dleemiller/crossingguard-nli-s` | 0.7582 | 0.9055 | 1410.64 | 1524.08 |
| `dleemiller/ModernCE-large-nli` | 0.7811 | 0.8779 | 362.20 | 3801.47 |
| `dleemiller/finecat-nli-l` | 0.8227 | 0.8325 | 364.57 | 3801.47 |
| `MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli` | 0.8233 | 0.8219 | 311.83 | 6156.87 |
| `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli` | 0.7642 | 0.8032 | 873.51 | 3521.96 |
| `cross-encoder/nli-deberta-v3-large` | 0.7618 | 0.7211 | 306.48 | 6156.87 |
| `tasksource/ModernBERT-large-nli` | 0.7959 | 0.7019 | 357.91 | 3801.47 |
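The note that F1-micro is equivalent to accuracy holds for single-label multiclass problems like this 3-way task, because every misclassification counts once as a false positive (for the predicted class) and once as a false negative (for the true class). A small sketch with made-up labels shows the two numbers coinciding; the helper names are hypothetical and the data is not taken from the evaluation sets.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_micro(y_true, y_pred, labels=(0, 1, 2)):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels: 0 = entailment, 1 = neutral, 2 = contradiction.
y_true = [0, 1, 2, 2, 0, 1, 0, 2]
y_pred = [0, 2, 2, 2, 0, 1, 1, 2]

print(accuracy(y_true, y_pred), f1_micro(y_true, y_pred))
```

With 6 correct predictions out of 8, both values come out to 0.75.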

Usage

import numpy as np
from sentence_transformers import CrossEncoder

model = CrossEncoder("dleemiller/crossingguard-nli-m")

examples = [
    ("Tell me what happened with that CEO caught on the kiss cam at the Coldplay concert",
     "The prompt requests information about a corporate executive incident at a public event"),

    ("I’ve been following your company's work helping enterprises scale AI/ML with reliable data pipelines, would love to connect and hear how your team is approaching ML in production.",
     "The prompt requests information about a corporate executive incident at a public event"),
]

predictions = model.predict(examples)

label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}

def softmax(x):
    exp_x = np.exp(x - np.max(x))
    return exp_x / exp_x.sum()

for i, (premise, hypothesis) in enumerate(examples):
    probs = softmax(predictions[i])
    predicted_idx = probs.argmax()
    
    print(f"\n{i+1}. {premise[:70]}...")
    print(f"   → {hypothesis}")
    print(f"   ✓ {label_map[predicted_idx].upper()}: {probs[predicted_idx]*100:.1f}% " + 
          f"(E: {probs[0]*100:.1f}% N: {probs[1]*100:.1f}%, C: {probs[2]*100:.1f}%)")

This results in:

1. Tell me what happened with that CEO caught on the kiss cam at the Cold...
   → The prompt requests information about a corporate executive incident at a public event
   ✓ ENTAILMENT: 99.9% (E: 99.9% N: 0.0%, C: 0.0%)

2. I’ve been following your company's work helping enterprises scale AI/M...
   → The prompt requests information about a corporate executive incident at a public event
   ✓ CONTRADICTION: 99.7% (E: 0.0% N: 0.3%, C: 99.7%)
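For guardrail or routing use, one possible pattern is to score a prompt against several hypotheses and act on those whose entailment probability clears a threshold. The sketch below assumes the logits have already been produced by `model.predict` as in the usage example above; the function name, the 0.9 threshold, the second hypothesis, and the logit values are all hypothetical and chosen only for illustration.

```python
import math

# Index order follows the label_map in the usage example.
LABELS = ("entailment", "neutral", "contradiction")

def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def triggered_hypotheses(hypotheses, logits_per_hypothesis, threshold=0.9):
    """Return (hypothesis, entailment probability) pairs at or above the threshold."""
    entail_idx = LABELS.index("entailment")
    hits = []
    for hypothesis, logits in zip(hypotheses, logits_per_hypothesis):
        p_entail = softmax(logits)[entail_idx]
        if p_entail >= threshold:
            hits.append((hypothesis, p_entail))
    return hits

# Hypothetical conditions for one prompt, with made-up logits standing in
# for the rows that model.predict would return.
hypotheses = [
    "The prompt requests information about a corporate executive incident at a public event",
    "The prompt is an unsolicited sales or networking message",
]
logits = [[6.0, -1.0, -3.0], [-4.0, 0.5, 3.0]]

for hypothesis, p in triggered_hypotheses(hypotheses, logits):
    print(f"matched ({p:.3f}): {hypothesis}")
```

Because the classifier is 3-way, the same approach could key on the contradiction probability instead when a task needs evidence that a condition does not hold.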

Citation

@misc{nli-compiled-2025,
  title = {CrossingGuard NLI Dataset},
  author = {Lee Miller},
  year = {2025},
  howpublished = {Flexible Zero-shot Guardrails}
}