---
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- loss:ContrastiveLoss
base_model: FacebookAI/xlm-roberta-large
pipeline_tag: sentence-similarity
datasets:
- gabrielloiseau/CALE-SPCD
---

# CALE-XLM-R

This is a [sentence-transformers](https://www.SBERT.net) model. It maps occurrences of a word to a 1024-dimensional dense vector space and can be used for tasks like clustering or semantic search.

## Usage (Sentence-Transformers)

First install the Sentence Transformers library:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer

# 1. Load CALE model
model = SentenceTransformer("gabrielloiseau/CALE-XLM-R")

sentences = [
    "the boy could easily distinguish the different note values",
    "he patient’s ability to recognize forms and shapes",
    "the government had refused to recognize their autonomy and existence as a state",
]

# 2. Calculate embeddings (returns a NumPy array by default)
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9332, 0.5331],
#         [0.9332, 1.0000, 0.5619],
#         [0.5331, 0.5619, 1.0000]])
```

## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
```
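
The `Pooling` configuration above sets `pooling_mode_mean_tokens` to `True`, so each embedding is the average of the token vectors produced by the Transformer module, with padding positions left out. The sketch below illustrates that idea, together with the cosine similarity that `model.similarity` uses unless configured otherwise. It is a plain-Python illustration on toy 3-dimensional vectors, not the library's implementation; the helper names `mean_pool` and `cosine_similarity` and the sample values are assumptions made for this example.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vector)]
            count += 1
    # Guard against an all-padding input to avoid dividing by zero.
    return [t / max(count, 1) for t in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 1.0, 1.0]
print(cosine_similarity(pooled, [1.0, 0.0, 0.0]))
```

In the real model the same averaging runs over 1024-dimensional token vectors, with the mask taken from the tokenizer's attention mask.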