MiniLM-L12-v2 Embed base Legal Public Deed - Matryoshka (ONNX Quantized)
This is an ONNX quantized version of the MiniLM-L12-v2 Embed base Legal Public Deed - Matryoshka model. It has been optimized for faster inference and a smaller file size, while maintaining comparable performance to the original model.
This version is recommended for production deployments and applications where latency and memory footprint are critical, especially when running inference on CPUs or resource-constrained devices.
Model Details
Quantization Architecture
- Quantization Type: int8 (8-bit integer quantization).
- Model Size: roughly a quarter of the original float32 model, since 8-bit integer weights replace 32-bit floats.
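As a rough illustration of how such a model can be produced, the sketch below exports the base model to ONNX and applies dynamic int8 quantization with Hugging Face Optimum. The quantization configuration and output paths are assumptions chosen for illustration, not necessarily the exact settings used for this model.

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the float32 base model to ONNX.
model = ORTModelForFeatureExtraction.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", export=True
)
model.save_pretrained("minilm-l12-onnx")

# Apply dynamic int8 quantization (this configuration is only an example).
quantizer = ORTQuantizer.from_pretrained("minilm-l12-onnx")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="minilm-l12-onnx-int8", quantization_config=qconfig)
```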
Model Description
- Model Type: Sentence Transformer (ONNX)
- Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Language: Spanish (es)
- License: apache-2.0
Model Sources
- Documentation: [Sentence Transformers Documentation](https://www.sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/sentence-transformers)
- Optimum: [Hugging Face Optimum](https://huggingface.co/docs/optimum)
Usage
To use this quantized ONNX model with the sentence-transformers library, make sure you have the necessary dependencies installed:
```bash
pip install -U sentence-transformers optimum onnx onnxruntime
```
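Once the dependencies are installed, the model can be loaded with the ONNX backend of sentence-transformers. A minimal sketch follows; the `file_name` of the quantized ONNX file inside the repository is an assumption and should be adjusted to the actual file:

```python
from sentence_transformers import SentenceTransformer

# Load the quantized ONNX model. The ONNX file name below is an assumption;
# adjust it to the quantized file actually shipped in the repository.
model = SentenceTransformer(
    "luiggy2620/modernbert-onnx",
    backend="onnx",
    model_kwargs={"file_name": "onnx/model_quantized.onnx"},
)

# Illustrative Spanish legal sentences (the model targets Spanish public deeds).
sentences = [
    "El compareciente otorga poder general de administración.",
    "La escritura pública fue inscrita en el registro correspondiente.",
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384) -> 384-dimensional embeddings

# Cosine similarity between the two embeddings.
similarity = model.similarity(embeddings[0], embeddings[1])
print(similarity)
```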
Evaluation results
| Metric (dim 384, self-reported) | Value |
|---|---|
| Cosine Accuracy@1 | 0.013 |
| Cosine Accuracy@3 | 0.066 |
| Cosine Accuracy@5 | 0.171 |
| Cosine Accuracy@10 | 0.289 |
| Cosine Precision@1 | 0.013 |
| Cosine Precision@3 | 0.022 |
| Cosine Precision@5 | 0.034 |
| Cosine Precision@10 | 0.029 |
| Cosine Recall@1 | 0.013 |
| Cosine Recall@3 | 0.066 |
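The naming of these metrics suggests they were produced with sentence-transformers' InformationRetrievalEvaluator. The sketch below shows how comparable metrics could be computed on a custom query/corpus split; the queries, corpus, and relevance judgments here are hypothetical placeholders, not the actual evaluation data:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("luiggy2620/modernbert-onnx", backend="onnx")

# Hypothetical retrieval data; replace with the real evaluation split.
queries = {"q1": "¿Quién otorga el poder en la escritura?"}
corpus = {
    "d1": "El compareciente otorga poder general de administración.",
    "d2": "Se protocoliza el acta de asamblea de la sociedad.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_384",
)
results = evaluator(model)
print(results)  # cosine accuracy@k, precision@k, recall@k, among other metrics
```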