
	---
	language:
	- en
	tags:
	- sentence-transformers
	- cross-encoder
	- reranker
	- generated_from_trainer
	- dataset_size:384838
	- loss:PrecomputedDistillationLoss
	base_model: dleemiller/finecat-nli-m
	datasets:
	- dleemiller/CrossingGuard-NLI
	pipeline_tag: text-classification
	library_name: sentence-transformers
	metrics:
	- f1_macro
	- f1_micro
	- f1_weighted
	model-index:
	- name: CrossEncoder based on dleemiller/finecat-nli-m
	  results:
	  - task:
	      type: cross-encoder-classification
	      name: Cross Encoder Classification
	    dataset:
	      name: CrossingGuard dev
	      type: CrossingGuard-dev
	    metrics:
	    - type: f1_macro
	      value: 0.9126931790272965
	      name: F1 Macro
	    - type: f1_micro
	      value: 0.9138270909602929
	      name: F1 Micro
	    - type: f1_weighted
	      value: 0.91377816752541
	      name: F1 Weighted
	  - task:
	      type: cross-encoder-classification
	      name: Cross Encoder Classification
	    dataset:
	      name: CrossingGuard test
	      type: CrossingGuard-test
	    metrics:
	    - type: f1_macro
	      value: 0.913463859717691
	      name: F1 Macro
	    - type: f1_micro
	      value: 0.9145821752825644
	      name: F1 Micro
	    - type: f1_weighted
	      value: 0.9142089995597146
	      name: F1 Weighted
	---

	# CrossingGuard Medium

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65ff92ea467d83751a727538/GwBakCe4PPGk9mM88r1QC.png" style="width: 400px;">
	</p>


	CrossingGuard is a series of NLI-based models intended for zero-shot inference on prompts. In this series, I focus on
	use cases such as guardrails, content moderation, prompt or intent classification, and prompt routing. Because content moderation
	is often a reactive task, zero-shot models are a good fit: custom guardrail conditions can be tailored as needed, including
	ones that general-purpose pretrained models may not cover.

	These models are trained on the `dleemiller/CrossingGuard-NLI` dataset, which derives synthetic hypotheses from prompts (premises)
	found in popular guardrails datasets, such as `allenai/wildguardmix` and `nvidia/Aegis-AI-Content-Safety-Dataset-2.0`. The hypotheses
	make specific, targeted claims about the premises. I have retained the 3-way classifier (entailment, neutral, contradiction)
	for additional flexibility, since either non-neutral label may be the relevant one for a given task.
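
As a rough illustration of why both non-neutral labels can be useful, the sketch below turns one 3-way logit vector into a guardrail decision. The label order follows the `label_map` in the Usage section. The function name `triggers`, the `trigger_label` argument, the 0.5 threshold, and the example logits are all hypothetical choices for illustration; they are not part of the model or of `sentence-transformers`.

```python
import math

# Label order as given by label_map in the Usage section.
LABELS = ("entailment", "neutral", "contradiction")

def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def triggers(logits, trigger_label="entailment", threshold=0.5):
    """Return True when the chosen label's probability reaches the (assumed) threshold."""
    probs = softmax(logits)
    return probs[LABELS.index(trigger_label)] >= threshold

# Hypothesis phrased as a violation ("The prompt asks for ..."): fire on entailment.
print(triggers([4.0, 0.5, -3.0], trigger_label="entailment"))

# Hypothesis phrased as a requirement the prompt must meet: fire on contradiction.
print(triggers([-3.0, 0.5, 4.0], trigger_label="contradiction"))
```

Which label to act on depends on how the hypothesis is worded, and the threshold would normally be tuned on held-out examples for the specific condition.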


	For models smaller than the large size, I distill from the logits of `dleemiller/crossingguard-nli-l` with an MSE loss,
	averaged with the cross-entropy loss on the ground truth labels. Overtraining can hurt `FineCat` performance, so I fine-tune for only 1 epoch.

	$$
	\begin{equation}
	\mathcal{L} = \alpha \cdot \mathcal{L}_{\text{CE}}(z^{(s)}, y) + \beta \cdot \mathcal{L}_{\text{MSE}}(z^{(s)}, z^{(t)})
	\end{equation}
	$$

	where $z^{(s)}$ and $z^{(t)}$ are the student and teacher logits, $y$ are the ground truth labels,
	and $\alpha$ and $\beta$ are equally weighted at 0.5.
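
The combined objective above can be sketched for a single example in plain Python. This is a framework-free illustration of the formula, not the training code used for these models: the function names are hypothetical, and how the loss is reduced over a batch (assumed here to be left to the caller) is not stated in this card.

```python
import math

def cross_entropy(logits, label):
    """Cross entropy for one example: -log softmax(logits)[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def mse(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits."""
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n

def distillation_loss(student_logits, teacher_logits, label, alpha=0.5, beta=0.5):
    """alpha * CE(student, label) + beta * MSE(student, teacher), per example."""
    return (alpha * cross_entropy(student_logits, label)
            + beta * mse(student_logits, teacher_logits))

# Made-up logits for one (premise, hypothesis) pair with true label 0 (entailment).
student = [2.0, 0.1, -1.5]
teacher = [3.0, 0.0, -2.0]
print(distillation_loss(student, teacher, label=0))
```

With both weights at 0.5, the result is simply the mean of the two terms, which matches the "average" described in the text.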


	# Evaluation Results

	F1-Micro scores (equivalent to accuracy) for each dataset. Throughput and peak GPU memory were measured at batch size 64 on an NVIDIA Blackwell PRO 6000 Max-Q.

	| Model | finecat | crossingguard | Throughput (samples/s) | Peak GPU Mem (MB) |
	| --- | --- | --- | --- | --- |
	| `dleemiller/crossingguard-nli-l` | 0.8094 | 0.9200 | 361.15 | 3023.65 |
	| `dleemiller/crossingguard-nli-m` | 0.7779 | <u>0.9146</u> | 868.65 | 2170.71 |
	| `dleemiller/crossingguard-nli-s` | 0.7582 | 0.9055 | 1410.64 | 1524.08 |
	| `dleemiller/ModernCE-large-nli` | 0.7811 | 0.8779 | 362.20 | 3801.47 |
	| `dleemiller/finecat-nli-l` | 0.8227 | 0.8325 | 364.57 | 3801.47 |
	| `MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli` | 0.8233 | 0.8219 | 311.83 | 6156.87 |
	| `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli` | 0.7642 | 0.8032 | 873.51 | 3521.96 |
	| `cross-encoder/nli-deberta-v3-large` | 0.7618 | 0.7211 | 306.48 | 6156.87 |
	| `tasksource/ModernBERT-large-nli` | 0.7959 | 0.7019 | 357.91 | 3801.47 |
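
The "equivalent to accuracy" remark holds because each example receives exactly one predicted label, so every error counts once as a false positive (for the predicted class) and once as a false negative (for the true class). A small self-contained check follows, using made-up labels rather than any real evaluation data; the helper names are illustrative only.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once.

    Assumes at least one correct prediction, otherwise the divisions are undefined.
    """
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in labels:
        tp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if p != c and t == c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up 3-class labels: 5 of 8 predictions are correct.
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 0, 2]
print(micro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```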


	# Usage

	```python
	import numpy as np
	from sentence_transformers import CrossEncoder

	model = CrossEncoder("dleemiller/crossingguard-nli-m")

	# Each example is a (premise, hypothesis) pair: the prompt and a claim about it.
	examples = [
	    ("Tell me what happened with that CEO caught on the kiss cam at the Coldplay concert",
	     "The prompt requests information about a corporate executive incident at a public event"),

	    ("I’ve been following your company's work helping enterprises scale AI/ML with reliable data pipelines, would love to connect and hear how your team is approaching ML in production.",
	     "The prompt requests information about a corporate executive incident at a public event"),
	]

	# One row of three logits per pair.
	predictions = model.predict(examples)

	label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}

	def softmax(x):
	    # Subtract the max before exponentiating for numerical stability.
	    exp_x = np.exp(x - np.max(x))
	    return exp_x / exp_x.sum()

	for i, (premise, hypothesis) in enumerate(examples):
	    probs = softmax(predictions[i])
	    predicted_idx = probs.argmax()

	    print(f"\n{i+1}. {premise[:70]}...")
	    print(f" → {hypothesis}")
	    print(f" ✓ {label_map[predicted_idx].upper()}: {probs[predicted_idx]*100:.1f}% " +
	          f"(E: {probs[0]*100:.1f}% N: {probs[1]*100:.1f}%, C: {probs[2]*100:.1f}%)")
	```

	This results in:
	```
	1. Tell me what happened with that CEO caught on the kiss cam at the Cold...
	→ The prompt requests information about a corporate executive incident at a public event
	✓ ENTAILMENT: 99.9% (E: 99.9% N: 0.0%, C: 0.0%)

	2. I’ve been following your company's work helping enterprises scale AI/M...
	→ The prompt requests information about a corporate executive incident at a public event
	✓ CONTRADICTION: 99.7% (E: 0.0% N: 0.3%, C: 99.7%)
	```
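
For prompt routing or intent classification, one option is to score a single prompt against several candidate hypotheses and keep the one with the highest entailment probability. The sketch below is an assumption about how this could be wired up, not a documented recipe: `route`, the route names, the hypothesis texts, and `min_prob` are hypothetical, and `score_fn` stands in for `model.predict` so that the logic can be read and tested without downloading the model.

```python
import math

def entailment_prob(logits):
    """Softmax probability of index 0 (entailment in the label_map above)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return exps[0] / sum(exps)

def route(prompt, routes, score_fn, min_prob=0.5):
    """Pick the route whose hypothesis is most strongly entailed by the prompt.

    routes:   dict mapping a route name to a hypothesis string.
    score_fn: callable taking a list of (premise, hypothesis) pairs and returning
              one 3-way logit vector per pair, for example `model.predict`.
    Returns (route_name, probability), or (None, probability) when no
    hypothesis reaches min_prob (an assumed cutoff).
    """
    names = list(routes)
    pairs = [(prompt, routes[name]) for name in names]
    probs = [entailment_prob(row) for row in score_fn(pairs)]
    best = max(range(len(names)), key=probs.__getitem__)
    if probs[best] < min_prob:
        return None, probs[best]
    return names[best], probs[best]

# Stand-in scorer with made-up logits; replace with model.predict in practice.
def fake_score_fn(pairs):
    return [[-2.0, 0.0, 3.0], [5.0, 0.0, -2.0]]

hypothetical_routes = {
    "news": "The prompt requests information about a corporate executive incident at a public event",
    "sales": "The prompt is a sales or networking outreach message",
}
print(route("example prompt", hypothetical_routes, fake_score_fn))
```

Returning `None` when nothing clears the cutoff leaves room for a fallback route, which matters because the hypotheses are not guaranteed to cover every prompt.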


	## Citation

	```bibtex
	@misc{nli-compiled-2025,
	  title = {CrossingGuard NLI Dataset},
	  author = {Lee Miller},
	  year = {2025},
	  howpublished = {Flexible Zero-shot Guardrails}
	}
	```