metadata
library_name: sentence-transformers
metrics:
- negative_mse
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:25095
- loss:MSELoss
widget:
- source_sentence: mariknak pay ketdi a naabrasaak iti kulonganda
sentences:
- >-
Nakuha nako ang usa ka kuptanan sa istorya ug nagsugod kini sa pagbati
ug porma nga akong gusto
- >-
Ang kasarangang pag-ulan sa London, nga adunay kataas nga 10°C ug ang
ubos nga 6°C. #LondonWeather #RainyDay
- Controversial religious text causes uproar among community members
- source_sentence: >
JUAN COLE: Ang Pagduso sa Islamic State sa Baghdad 'Usa ka Pagsulay
Aron Mabawi ang Gikuha sa Bush Administration'
sentences:
- >-
Ang Touchdown nga Selebrasyon ni Antonio Brown Sexy Gihapon Alang sa NFL
Bisan ang duha ka pagduso makapasilo kanimo.
- >-
Natuklasan ng mga siyentipiko ang mga bagong species ng nilalang sa
malalim na dagat
- i feel so glad doing this
- source_sentence: New Curriculum Standards to Be Implemented in All Schools Next Year
sentences:
- |
Climate Change This Week: Mega Methane, Tidal Power, and More
- >-
@lilomatic Only in Zimbabwe where u find Opposition party for another
Opposition party.
- >
Ang mamumuno nga si Mike namulong sa Ferguson: 'Ang Hustisya Dili
Kanunay Gisilbi'
- source_sentence: i am so blessed and feel blessed to be able to share my creations with you
sentences:
- |
Ania ang Buhaton Sa World Cup Host Cities Gawas sa Pagtan-aw sa Soccer
- |
Hillary Clinton's 'Super Volunteers' Are Back And Ready For 2016
- >-
Awan pay ti koriente para kadagiti paset ti Joburg kalpasan ti uram ti
kable iti uneg ti daga https://t.co/szuZa380Lr
- source_sentence: |
3 Napateg nga Addang (iti Aniaman nga Edad) tapno Agsagana iti Matay
sentences:
- >-
EPIC! RAND PAUL Laughs at CNN’s Climate Hysteria…Schools Jake Tapper on
Climate Truth [Video]
- im feeling horrible
- 'Image: WC Provincial Disaster Management Centre https://t.co/EcNgpBhjcV'
model-index:
- name: SentenceTransformer
results:
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: Unknown
type: unknown
metrics:
- type: negative_mse
value: -0.2521140966564417
name: Negative Mse
SentenceTransformer
This is a sentence-transformers model trained with MSELoss on 25,095 samples. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
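The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is the average of the token vectors, with padding positions excluded. A minimal NumPy sketch of that idea follows. The function name `mean_pool` and the toy shapes are illustrative assumptions, not the library's implementation.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Broadcast the mask over the embedding dimension
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy input: 2 sentences, 3 token slots, 4 dimensions (the model itself uses 768)
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

The first toy sentence has one padded slot, so only its first two token vectors contribute to the average.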
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'3 Napateg nga Addang (iti Aniaman nga Edad) tapno Agsagana iti Matay \n',
'EPIC! RAND PAUL Laughs at CNN’s Climate Hysteria…Schools Jake Tapper on Climate Truth [Video]',
'Image: WC Provincial Disaster Management Centre https://t.co/EcNgpBhjcV',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
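The similarity function listed for this model is cosine similarity. As a rough illustration of what the 3 x 3 score matrix contains, the sketch below computes pairwise cosine similarity in plain NumPy. The helper `cosine_similarity_matrix` and the random stand-in embeddings are assumptions made for the example, not output from the model.

```python
import numpy as np


def cosine_similarity_matrix(a, b) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Scale every row to unit length, then take dot products
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Random stand-ins with the same shape as the embeddings in the usage example
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(3, 768))
sims = cosine_similarity_matrix(fake_embeddings, fake_embeddings)
print(sims.shape)
```

Each diagonal entry is 1 because every vector is compared with itself, and the matrix is symmetric.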
Evaluation
Metrics
Knowledge Distillation
- Evaluated with MSEEvaluator
| Metric | Value |
|---|---|
| negative_mse | -0.2521 |
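For orientation, the sketch below shows one way a negative MSE score between teacher and student embeddings can be computed. The factor of 100 reflects my understanding of how MSEEvaluator scales its result and should be treated as an assumption. The function `negative_mse` and the toy arrays are hypothetical.

```python
import numpy as np


def negative_mse(teacher, student, scale: float = 100.0) -> float:
    """Negated, scaled mean squared error between two embedding matrices.

    The scale of 100 is an assumption about the evaluator's convention.
    """
    teacher = np.asarray(teacher, dtype=np.float64)
    student = np.asarray(student, dtype=np.float64)
    mse = np.mean((teacher - student) ** 2)
    return float(-scale * mse)


# Toy 2-dimensional embeddings; real embeddings from this model have 768 dimensions
teacher_emb = np.array([[0.0, 1.0], [1.0, 0.0]])
student_emb = np.array([[0.0, 0.9], [1.1, 0.0]])
score = negative_mse(teacher_emb, student_emb)
print(score)
```

Values closer to zero indicate that the student embeddings are closer to the teacher's. If the scaling assumption holds, a score of -0.2521 corresponds to a raw MSE of roughly 0.0025.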
Training Details
Training Dataset
Unnamed Dataset
- Size: 25,095 training samples
- Columns: sentence_0 and label
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | label |
|---|---|---|
| type | string | list |
| details | min: 4 tokens, mean: 23.49 tokens, max: 50 tokens | size: 768 elements |

- Samples:

| sentence_0 | label |
|---|---|
| A suicide bomber targeting a crowded market resulting in numerous fatalities | [-0.05337272211909294, -0.296869158744812, -0.005234384443610907, -0.017071111127734184, 0.01954558491706848, ...] |
| Jeb Bush To Meet With Charleston Pastors | [-0.025684779509902, 0.2293000966310501, -0.005389949772506952, 0.09448838979005814, 0.017471183091402054, ...] |
| New scientific research suggests link between air pollution and lung disease | [-0.12967786192893982, 0.19541345536708832, -0.0044404976069927216, -0.06291326135396957, -0.03776596114039421, ...] |

- Loss: MSELoss
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- num_train_epochs: 20
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 20
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
| Epoch | Step | Training Loss | negative_mse |
|---|---|---|---|
| 0.5089 | 200 | - | -0.3720 |
| 1.0 | 393 | - | -0.3428 |
| 1.0178 | 400 | - | -0.3437 |
| 1.2723 | 500 | 0.0024 | - |
| 1.5267 | 600 | - | -0.3262 |
| 2.0 | 786 | - | -0.3153 |
| 2.0356 | 800 | - | -0.3156 |
| 2.5445 | 1000 | 0.0018 | -0.3070 |
| 3.0 | 1179 | - | -0.3004 |
| 3.0534 | 1200 | - | -0.3005 |
| 3.5623 | 1400 | - | -0.2959 |
| 3.8168 | 1500 | 0.0015 | - |
| 4.0 | 1572 | - | -0.2907 |
| 4.0712 | 1600 | - | -0.2924 |
| 4.5802 | 1800 | - | -0.2863 |
| 5.0 | 1965 | - | -0.2831 |
| 5.0891 | 2000 | 0.0013 | -0.2841 |
| 5.5980 | 2200 | - | -0.2792 |
| 6.0 | 2358 | - | -0.2765 |
| 6.1069 | 2400 | - | -0.2774 |
| 6.3613 | 2500 | 0.0012 | - |
| 6.6158 | 2600 | - | -0.2734 |
| 7.0 | 2751 | - | -0.2716 |
| 7.1247 | 2800 | - | -0.2722 |
| 7.6336 | 3000 | 0.0011 | -0.2700 |
| 8.0 | 3144 | - | -0.2684 |
| 8.1425 | 3200 | - | -0.2683 |
| 8.6514 | 3400 | - | -0.2665 |
| 8.9059 | 3500 | 0.001 | - |
| 9.0 | 3537 | - | -0.2645 |
| 9.1603 | 3600 | - | -0.2649 |
| 9.6692 | 3800 | - | -0.2639 |
| 10.0 | 3930 | - | -0.2625 |
| 10.1781 | 4000 | 0.0009 | -0.2619 |
| 10.6870 | 4200 | - | -0.2615 |
| 11.0 | 4323 | - | -0.2594 |
| 11.1959 | 4400 | - | -0.2598 |
| 11.4504 | 4500 | 0.0009 | - |
| 11.7048 | 4600 | - | -0.2587 |
| 12.0 | 4716 | - | -0.2582 |
| 12.2137 | 4800 | - | -0.2586 |
| 12.7226 | 5000 | 0.0008 | -0.2573 |
| 13.0 | 5109 | - | -0.2568 |
| 13.2316 | 5200 | - | -0.2567 |
| 13.7405 | 5400 | - | -0.2564 |
| 13.9949 | 5500 | 0.0008 | - |
| 14.0 | 5502 | - | -0.2558 |
| 14.2494 | 5600 | - | -0.2560 |
| 14.7583 | 5800 | - | -0.2551 |
| 15.0 | 5895 | - | -0.2548 |
| 15.2672 | 6000 | 0.0008 | -0.2552 |
| 15.7761 | 6200 | - | -0.2540 |
| 16.0 | 6288 | - | -0.2534 |
| 16.2850 | 6400 | - | -0.2538 |
| 16.5394 | 6500 | 0.0008 | - |
| 16.7939 | 6600 | - | -0.2529 |
| 17.0 | 6681 | - | -0.2532 |
| 17.3028 | 6800 | - | -0.2530 |
| 17.8117 | 7000 | 0.0008 | -0.2528 |
| 18.0 | 7074 | - | -0.2525 |
| 18.3206 | 7200 | - | -0.2527 |
| 18.8295 | 7400 | - | -0.2521 |
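The Epoch and Step columns can be cross-checked against the dataset size and batch size reported in this card. The short calculation below assumes training on a single device. The card does list gradient_accumulation_steps: 1 and dataloader_drop_last: False, so the last partial batch of each epoch counts as a step.

```python
import math

num_samples = 25_095   # training samples listed in this card
batch_size = 64        # per_device_train_batch_size
num_epochs = 20        # num_train_epochs

# Assumption: one device, so per-device batch size equals the global batch size
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
epoch_at_step_200 = round(200 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at_step_200)
# 393 7860 0.5089
```

This matches the first row of the table (epoch 0.5089 at step 200) and the end-of-epoch rows at multiples of 393. The table's last row is step 7400, which is short of the 7,860 steps that 20 full epochs would take under these assumptions.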
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.0
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MSELoss
@inproceedings{reimers-2020-multilingual-sentence-bert,
title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2020",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2004.09813",
}