SentenceTransformer based on thebajajra/RexBERT-base-embed-pf-v0.2
This is a sentence-transformers model finetuned from thebajajra/RexBERT-base-embed-pf-v0.2 on the nomic-embed-supervised-data dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: thebajajra/RexBERT-base-embed-pf-v0.2
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: nomic-embed-supervised-data
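The similarity function listed above is cosine similarity. As a minimal sketch of what that score measures, the helper below computes it in plain Python; the function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of the model's API, and real embeddings from this model have 768 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors stand in for the model's 768-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 3.0, 0.0]))  # orthogonal -> 0.0
```

Because the score depends only on direction, scaling either vector leaves it unchanged.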
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
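The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings. The sketch below illustrates that idea in plain Python under the assumption that padding tokens are excluded via an attention mask; `mean_pool` is a hypothetical helper, not the library's implementation, and the 2-dimensional vectors are toy data.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [s / count for s in summed]


# Two real tokens followed by one padding token (mask 0), which is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```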
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
queries = [
"how much does a vending machine cost",
]
documents = [
"Confidence votes 5.7K. A vending machine costs anywhere from $100 (very compact, 4-8 small selections) to $20,000 (large, varied, many selections, refrigerated). It depends what vending machines. If it's dirty or in poor shape, then the price will drop down. And it depends on what style.",
"2018 Idaho gubernatorial election enter the race.</nowiki> 2018 Idaho gubernatorial election The 2018 Idaho gubernatorial election took place on November 6 to elect the next Governor of Idaho. Incumbent Republican Governor Butch Otter chose not to run for a fourth term, and the state's primaries were held on May 15. Former state representative Paulette Jordan was the Democratic Party's nominee, who lost to incumbent lieutenant governor Brad Little by a wide margin for a seventh consecutive Republican victory. A record 605,131 votes were cast for governor in 2018, a 37.6% increase over the previous election in 2014 was 452,535 votes in 2010. <nowiki>*–Denotes candidates",
"Review: With all due respect to ambient music enthusiasts, I was really disappointed that there was no guitar work whatsoever on this album. Hillage fans of L and Fish Rising be forewarned.Steve Hillage was a pretty darn good guitarist. Maybe L was his showcase with members of Todd Rundgren's Utopia backing him up.Noting that other reviewers have rated this highly, I will give it another listen. However, I am dissapointed in the direction Steve has taken his music. \n Polarity: Negative",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.8921, 0.0941, -0.0030]])
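For semantic search, the similarity row is typically sorted to rank the documents for a query. The following is a minimal sketch in plain Python that reuses the three scores printed above; `rank_documents` is a hypothetical helper, and the short labels are shorthand for the three example documents rather than text from the dataset.

```python
def rank_documents(scores, documents, top_k=None):
    """Return (score, document) pairs sorted from most to least similar."""
    pairs = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    return pairs[:top_k]  # top_k=None keeps every pair


# Scores copied from the example output above; labels are illustrative shorthand.
scores = [0.8921, 0.0941, -0.0030]
labels = ["vending machine answer", "Idaho election article", "album review"]

for score, label in rank_documents(scores, labels, top_k=2):
    print(f"{score:+.4f}  {label}")
```

With real embeddings, `scores` would come from one row of the similarity matrix converted to a Python list.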
Training Details
Training Dataset
nomic-embed-supervised-data
- Dataset: nomic-embed-supervised-data at 13eef8a
- Size: 1,687,337 training samples
- Columns: query, document, and negative
- Approximate statistics based on the first 1000 samples:

| | query | document | negative |
|---|---|---|---|
| type | string | string | list |
| details | min: 4 tokens; mean: 38.9 tokens; max: 1024 tokens | min: 5 tokens; mean: 96.45 tokens; max: 1024 tokens | min: 20 elements; mean: 159.79 elements; max: 209 elements |
- Samples:

| query | document | negative |
|---|---|---|
| I think I used to live in a twenty-one B. | I used to live in 21-B, if I remember correctly. | ['I never lived in 21-B, ever.', 'I did not much care for the Total Recall remake. ', 'A young couple kissing on a park bench.', 'It is possible that the pay caps on management salaries can be removed.', 'His name is Robertson', ...] |
| the holder of this passport is not entitled to travel to occupied palestine | Iranian passport Persian and near all of them are also provided in English. As Iran (the Islamic Republic of) does not recognize nor have diplomatic relations with the state of Israel (like some other Muslim countries), people using an Iranian passport are not permitted to travel to Israel under Iranian law (although Israel itself does admit Iranian citizens holding a visa). On the inside of the back-cover, Iranian passports bear the inscription: "The holder of this passport is not entitled to travel to occupied Palestine", referring to Israel. As of 26 August 2017, Iranian citizens had visa-free or visa on arrival access | ['Israeli nationality law from an administrative court to cancel it. A 2008 amendment to the "Nationality Law of 1952" designated nine countries as enemy states: Afghanistan, Iran, Iraq, Lebanon, Libya, Pakistan, Sudan, Syria, and Yemen, as well as the Gaza Strip. Per article 10 of the citizenship act, Israeli citizens living abroad renounce their Israeli citizenship by filing an application with an Israeli embassy. The application is transferred to the Administration of Border Crossings, Population and Immigration, acting on behalf of the Minister of Interior, which reviews and either grants or rejects the request. The request may be denied for any reason, such', 'Non-visa travel restrictions country. In non-diplomatic use, the authorities of a country may also declare a foreigner "persona non grata" permanently or temporarily, usually because of unlawful activity. Attempts to enter the Gaza strip by sea may attract a 10-year ban on entering Israel. Several countries mandate that all ... |
| Which magazine was published first, La Belle Assemblée or Mademoiselle? | Mademoiselle (magazine) Mademoiselle was a women's magazine first published in 1935 by Street and Smith and later acquired by Condé Nast Publications. | ["La Belle Assemblée La Belle Assemblée (in full La Belle Assemblée or, Bell's Court and Fashionable Magazine Addressed Particularly to the Ladies) was a British women's magazine published from 1806 to 1837, founded by John Bell (1745–1831).", 'La Semaine de Suzette La Semaine de Suzette was a French magazine aimed at girls, which appeared from 1905 until 1960. It contained early comics like "Bécassine".', 'Mademoiselle Marie Mademoiselle Marie (often shortened to Mlle. Marie) is the name of two fictional characters appearing in comic books published by DC Comics. She first appeared in "Star Spangled War Stories" #84 (August 1959), and was created by Robert Kanigher and Jerry Grandenetti. She was based in part on several actual members of the French resistance, most notably Simone Segouin.', 'Le Rire Le Rire (] , "Laughter") was a successful French humor magazine published from October 1894 through the 1950s. Founded in Paris during the Belle Époque by Felix Juven, "Le Rire" appeared a... |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
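To make the loss parameters concrete, here is a rough plain-Python sketch of a multiple-negatives ranking objective with `scale` 20.0 and cosine similarity: each query is scored against every candidate, and cross-entropy pushes its own positive to the top. The function names and toy vectors are assumptions for illustration, not the library's code. The sketch assumes the candidate list holds the positives first (one per query, in order), optionally followed by extra hard negatives such as those in the negative column.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(query_vecs, candidate_vecs, scale=20.0):
    """Mean cross-entropy where candidate_vecs[i] is the positive for query_vecs[i].

    Candidates beyond len(query_vecs) act as additional negatives for every query.
    """
    losses = []
    for i, query in enumerate(query_vecs):
        logits = [scale * cos_sim(query, cand) for cand in candidate_vecs]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its query
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each positive matches the other query

print(mnr_loss(queries, aligned))  # close to 0
print(mnr_loss(queries, swapped))  # close to 20
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity produce larger differences in loss.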
Evaluation Dataset
nomic-embed-supervised-data
- Dataset: nomic-embed-supervised-data at 13eef8a
- Size: 8,482 evaluation samples
- Columns: query, document, and negative
- Approximate statistics based on the first 1000 samples:

| | query | document | negative |
|---|---|---|---|
| type | string | string | list |
| details | min: 4 tokens; mean: 40.11 tokens; max: 992 tokens | min: 4 tokens; mean: 100.61 tokens; max: 1024 tokens | min: 20 elements; mean: 156.57 elements; max: 209 elements |
- Samples:

| query | document | negative |
|---|---|---|
| cost allocation methodology | July 01, 2013/. Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which youwant to separately measure costs. Examples of cost objects are a product, a research project, a customer, a sales region, and a department. | ['Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which you want to separately measure costs.Examples of cost objects are a product, a research project, a customer, a sales region, and a department.Cost allocation is used for financial reporting purposes, to spread costs among departments or inventory items.Cost allocation is also used in the calculation of profitability at the department or subsidiary level, which in turn may be used as the basis for bonuses or the funding of additional activities.xamples of cost objects are a product, a research project, a customer, a sales region, and a department. Cost allocation is used for financial reporting purposes, to spread costs among departments or inventory items.', 'Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which you want to separately measure costs. Ex... |
| Trump made them sign hardcore gag orders though. They'd have to do it anonymously | I didn't write the book. Rusty Shackleford did! | ['Uh, hrm. Man, people really need to learn how to write what they intend to convey. Is it me or are people just getting worse at this over time?', 'COMMENT LINKING TO "ORIGINAL" WHICH WAS MADE AFTER THIS POST', 'Not to mention all the other toys like bass boats that so many have', 'Ah. So this was just a coincidence.', 'They were filming the ass load of ducks right there in front of your face. Looks like they were all crossing the sidewalk. I would’ve been filming too. Why is this even top comment?', ...] |
| Russell Crowe's only film role was Scrooge McDuck. | Russell Crowe Russell Ira Crowe ( born 7 April 1964 ) is an actor , film producer and musician . Although a New Zealand citizen , he has lived most of his life in Australia . He came to international attention for his role as the Roman General Maximus Decimus Meridius in the 2000 historical epic film Gladiator , directed by Ridley Scott , for which Crowe won an Academy Award for Best Actor , a Broadcast Film Critics Association Award for Best Actor , an Empire Award for Best Actor and a London Film Critics Circle Award for Best Actor and 10 further nominations for best actor . Crowe appeared as the tobacco firm whistle blower Jeffrey Wigand in the 1999 film The Insider , for which he received five awards as best actor and seven nominations in the same category . In 2001 , Crowe 's portrayal of mathematician and Nobel Prize winner John F. Nash in the biopic A Beautiful Mind brought him numerous awards , including a BAFTA Award for Best Actor in a Leading Role , a Golden Globe Award fo... | ['Russell Crowe filmography This is the complete filmography of Russell Crowe throughout his entire life . Crowe has acted in blockbuster films like Gladiator , a 2000 historical epic film , for which he won the Academy Award for Best Actor . He is also a BAFTA Award winner for his role in a 2001 biographical drama A Beautiful Mind .', "Kurt Russell Kurt Vogel Russell ( born March 17 , 1951 ) is an American actor . He began acting on television in the western series The Travels of Jaimie McPheeters ( 1963 -- 64 ) . In the late 1960s , he signed a ten-year contract with The Walt Disney Company where , according to Robert Osborne , he became the studio 's top star of the 1970s . Russell was nominated for a Golden Globe Award for Best Supporting Actor -- Motion Picture for his performance in Silkwood ( 1983 ) . During the 1980s , he starred in several films by director John Carpenter , including anti-hero roles such as army hero-turned-robber Snake Plissken in the futuristic action film... |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 128
- learning_rate: 0.0001
- num_train_epochs: 10
- warmup_steps: 1000
- bf16: True
- dataloader_num_workers: 20
- dataloader_prefetch_factor: 4
- ddp_find_unused_parameters: False
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0001
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 1000
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 20
- dataloader_prefetch_factor: 4
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: False
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
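The hyperparameters above combine `learning_rate: 0.0001`, `warmup_steps: 1000`, and `lr_scheduler_type: linear`. The sketch below shows how a linear warmup followed by linear decay would shape the learning rate; `linear_schedule_lr` is a hypothetical helper. The default `total_steps=8230` is an assumption: it takes roughly 823 steps per epoch (the training log reports step 824 at epoch 1.0012) times `num_train_epochs: 10`, and the card does not confirm that the run completed all 10 epochs.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=8230):
    """Learning rate at a given optimizer step under linear warmup, then linear decay.

    total_steps is an assumed value, not one stated in the model card.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 500, 1000, 4615, 8230):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 1000, a little after the first epoch ends.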
Training Logs
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.1215 | 100 | 0.6721 | - |
| 0.2430 | 200 | 0.231 | - |
| 0.2503 | 206 | - | 0.1449 |
| 0.3645 | 300 | 0.2047 | - |
| 0.4860 | 400 | 0.1959 | - |
| 0.5006 | 412 | - | 0.1316 |
| 0.6075 | 500 | 0.1872 | - |
| 0.7290 | 600 | 0.1834 | - |
| 0.7509 | 618 | - | 0.1271 |
| 0.8505 | 700 | 0.187 | - |
| 0.9721 | 800 | 0.1799 | - |
| 1.0012 | 824 | - | 0.1216 |
| 1.0936 | 900 | 0.1566 | - |
| 1.2151 | 1000 | 0.1529 | - |
| 1.2515 | 1030 | - | 0.1197 |
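The Epoch and Step columns in the log can be used to estimate how many optimizer steps make up one epoch. The sketch below does that arithmetic on a few rows copied from the table. The follow-on comparison with a global batch of 2048 (for example 8 devices at the per-device batch size of 256) is an inference that happens to be consistent with these numbers, not something the card states.

```python
# (epoch, step) pairs copied from the training log table above.
log_rows = [(0.1215, 100), (0.2503, 206), (0.5006, 412), (1.0012, 824), (1.2515, 1030)]


def steps_per_epoch(rows):
    """Average of step / epoch across log rows."""
    estimates = [step / epoch for epoch, step in rows]
    return sum(estimates) / len(estimates)


estimate = round(steps_per_epoch(log_rows))
print(estimate)  # 823

# Assumption: with dataloader_drop_last enabled, full batches per epoch are
# floor(training samples / global batch size). 2048 is a guess, not a stated value.
train_samples = 1_687_337
assumed_global_batch = 2048
print(train_samples // assumed_global_batch)  # 823
```

The validation rows at steps 206, 412, 618, and 824 then correspond to roughly every quarter epoch.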
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 5.1.2
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu129
- Accelerate: 1.11.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for thebajajra/RexBERT-base-embed-pf-v0.3
Base model
thebajajra/RexBERT-base
Finetuned
thebajajra/RexBERT-base-embed-pf-v0.2