Lettria/debug_finetuning_model
Files changed:
- README.md +29 -69
- eval/binary_classification_evaluation_BinaryClassifEval_results.csv +3 -2
- eval/similarity_evaluation_EmbeddingSimEval_results.csv +3 -2
- model.safetensors +1 -1
- runs/Feb26_12-20-27_algo-1/events.out.tfevents.1740572428.algo-1.63.0 +3 -0
- runs/Feb26_12-20-27_algo-1/events.out.tfevents.1740572446.algo-1.63.1 +3 -0
- training_args.bin +2 -2
README.md
CHANGED
@@ -1,9 +1,6 @@
 ---
 base_model: BAAI/bge-base-en-v1.5
-language:
-- en
 library_name: sentence-transformers
-license: apache-2.0
 metrics:
 - pearson_cosine
 - spearman_cosine
@@ -23,44 +20,8 @@ tags:
 - generated_from_trainer
 - dataset_size:3696
 - loss:MultipleNegativesRankingLoss
-widget:
-- source_sentence: Quel est le montant du cofinancement que la Région IDF propose
-    pour une allocation doctorale ?
-  sentences:
-  - sur des projets comportant une dimension numérique sur les thématiques ci-dessous
-    détaillées dans le texte de l'appel à projets :A - Économie circulaire,B - Cancer
-    pédiatrique,C - Autisme,D - Santé environnementale,E - Vieillissement
-  - 'bénéficiaires: Le dispositif est ouvert aux réseaux structurants qui fédèrent
-    des professionnels et des acteurs du secteur du patrimoine : associations et fondations.
-    Les effectifs d’adhérents doivent être représentatifs à l’échelle du territoire
-    francilien soit sur le plan géographique avec une présence significative (de départements
-    franciliens, de nombre d’adhérents). Peuvent être bénéficiaires les personnes
-    morales de droit privé ayant au moins 1 an d’existence'
-  - La Région cofinance entre 100.000€ et 120.000€ maximum des allocations de recherche
-    doctorale de 36 mois sur des projets comportant une dimension numérique
-- source_sentence: Quel type de projets la Région Île-de-France subventionne-t-elle
-    pour valoriser le patrimoine culturel ?
-  sentences:
-  - 'Le dispositif est accessible à tous les OFA sous réserve de remplir les 5 conditions
-    suivantes : Dispenser une activité apprentissage ayant obtenu une certification,Dispenser
-    des formations en apprentissage sur le territoire francilien depuis au moins 1
-    an en qualité de CFA, d’OFA ou d’UFA,Présenter un projet d’investissement prévu
-    pour la dispense de formations en apprentissage sur le territoire francilien,Être
-    propriétaire du bien pour lequel une subvention est sollicitée ou titulaire d’un
-    bail récemment renouvelé (ou engagement du propriétaire à renouveler le bail),
-    en propre ou sous la forme de SCI, et assurant la maîtrise d’ouvrage des travaux
-    d’investissement,Présenter un besoin de financement sur le projet d’investissement
-    ne pouvant être pris en charge au titre des fonds propres de la structure et de
-    tiers financeurs'
-  - Jeunes scientifiques réalisant leur doctorat partagé entre un établissement d'enseignement
-    supérieur de recherche et une structure du monde socio-économique établis en Île-de-France
-  - 'Type de project: Actions de valorisation du patrimoine (expos physiques ou virtuelles,
-    journées d’étude, site Internet, publications, documentaires…),Outils de médiation (cartes
-    et itinéraires papier ou numériques, livrets de visite, multimédia, parcours d’interprétation…),Dispositifs
-    pédagogiques (mallettes pédagogiques, Moocs, supports de visite pour les jeunes…),Événements
-    avec forte dimension patrimoniale, rayonnants à l’échelle de l’Île-de-France'
 model-index:
-- name:
+- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
   results:
   - task:
       type: semantic-similarity
@@ -83,22 +44,22 @@ model-index:
       type: BinaryClassifEval
     metrics:
     - type: cosine_accuracy
-      value: 0.
+      value: 0.8
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
-      value: 0.
+      value: 0.652718186378479
      name: Cosine Accuracy Threshold
     - type: cosine_f1
-      value: 0.
+      value: 0.888888888888889
      name: Cosine F1
     - type: cosine_f1_threshold
-      value: 0.
+      value: 0.652718186378479
      name: Cosine F1 Threshold
     - type: cosine_precision
      value: 1.0
      name: Cosine Precision
     - type: cosine_recall
-      value: 0.
+      value: 0.8
      name: Cosine Recall
     - type: cosine_ap
      value: 1.0
@@ -108,7 +69,7 @@ model-index:
      name: Cosine Mcc
 ---
 
-#
+# SentenceTransformer based on BAAI/bge-base-en-v1.5
 
 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
@@ -122,8 +83,8 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [B
 - **Similarity Function:** Cosine Similarity
 - **Training Dataset:**
     - json
-- **Language:**
-- **License:**
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
 
 ### Model Sources
 
@@ -159,9 +120,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("model")
 # Run inference
 sentences = [
-    '
-    '
-    '
+    'The weather is lovely today.',
+    "It's so sunny outside!",
+    'He drove to the stadium.',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -218,12 +179,12 @@ You can finetune this model on your own dataset.
 
 | Metric                    | Value   |
 |:--------------------------|:--------|
-| cosine_accuracy           | 0.
-| cosine_accuracy_threshold | 0.
-| cosine_f1                 | 0.
-| cosine_f1_threshold       | 0.
+| cosine_accuracy           | 0.8     |
+| cosine_accuracy_threshold | 0.6527  |
+| cosine_f1                 | 0.8889  |
+| cosine_f1_threshold       | 0.6527  |
 | cosine_precision          | 1.0     |
-| cosine_recall             | 0.
+| cosine_recall             | 0.8     |
 | **cosine_ap**             | **1.0** |
 | cosine_mcc                | 0.0     |
 
@@ -249,10 +210,10 @@ You can finetune this model on your own dataset.
 * Size: 3,696 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | sentence1 | sentence2
-  |
-  | type    | string | string
-  | details | <ul><li>min:
+  |         | sentence1                                                                          | sentence2                                                                          | label                        |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
+  | type    | string                                                                             | string                                                                             | int                          |
+  | details | <ul><li>min: 37 tokens</li><li>mean: 40.4 tokens</li><li>max: 44 tokens</li></ul> | <ul><li>min: 49 tokens</li><li>mean: 62.2 tokens</li><li>max: 85 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
 * Samples:
   | sentence1 | sentence2 | label |
   |:---------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
@@ -278,7 +239,7 @@ You can finetune this model on your own dataset.
   |         | sentence1 | sentence2 | label |
   |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
   | type    | string | string | int |
-  | details | <ul><li>min:
+  | details | <ul><li>min: 24 tokens</li><li>mean: 33.6 tokens</li><li>max: 42 tokens</li></ul> | <ul><li>min: 37 tokens</li><li>mean: 90.4 tokens</li><li>max: 257 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
 * Samples:
   | sentence1 | sentence2 | label |
   |:----------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
@@ -299,12 +260,11 @@ You can finetune this model on your own dataset.
 - `eval_strategy`: epoch
 - `per_device_train_batch_size`: 2
 - `per_device_eval_batch_size`: 2
-- `num_train_epochs`:
+- `num_train_epochs`: 2
 - `lr_scheduler_type`: cosine
 - `warmup_ratio`: 0.1
 - `bf16`: True
 - `tf32`: True
-- `load_best_model_at_end`: True
 - `optim`: adamw_torch_fused
 - `batch_sampler`: no_duplicates
 
@@ -328,7 +288,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
-- `num_train_epochs`:
+- `num_train_epochs`: 2
 - `max_steps`: -1
 - `lr_scheduler_type`: cosine
 - `lr_scheduler_kwargs`: {}
@@ -368,7 +328,7 @@ You can finetune this model on your own dataset.
 - `disable_tqdm`: False
 - `remove_unused_columns`: True
 - `label_names`: None
-- `load_best_model_at_end`:
+- `load_best_model_at_end`: False
 - `ignore_data_skip`: False
 - `fsdp`: []
 - `fsdp_min_num_params`: 0
@@ -430,11 +390,11 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
-| Epoch
-
-
+| Epoch | Step | Validation Loss | EmbeddingSimEval_spearman_cosine | BinaryClassifEval_cosine_ap |
+|:-----:|:----:|:---------------:|:--------------------------------:|:---------------------------:|
+| 1.0   | 3    | 0.2267          | nan                              | 1.0                         |
+| 2.0   | 6    | 0.2448          | nan                              | 1.0                         |
 
-* The bold row denotes the saved checkpoint.
 
 ### Framework Versions
 - Python: 3.11.9
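For reference, the updated usage snippet runs end to end as shown below. This is a minimal sketch assuming the checkpoint is saved locally under `model` (the path the card's own snippet uses) and a recent sentence-transformers release that provides `SentenceTransformer.similarity`; the pairwise-similarity step is an illustrative extension, not part of the commit.

```python
from sentence_transformers import SentenceTransformer

# Load the finetuned checkpoint from a local directory named "model",
# exactly as in the card's own snippet.
model = SentenceTransformer("model")

sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768): the card declares a 768-dimensional space

# Pairwise cosine similarities (cosine is the card's declared similarity function).
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```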
eval/binary_classification_evaluation_BinaryClassifEval_results.csv
CHANGED
@@ -1,3 +1,4 @@
 epoch,steps,cosine_accuracy,cosine_accuracy_threshold,cosine_f1,cosine_precision,cosine_recall,cosine_f1_threshold,cosine_ap,cosine_mcc
-1.0,
-
+1.0,3,0.8,0.6908704042434692,0.888888888888889,1.0,0.8,0.6908704042434692,1.0,0.0
+2.0,6,0.8,0.652718186378479,0.888888888888889,1.0,0.8,0.652718186378479,1.0,0.0
+2.0,6,0.8,0.652718186378479,0.888888888888889,1.0,0.8,0.652718186378479,1.0,0.0
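This CSV has the layout written by sentence-transformers' `BinaryClassificationEvaluator`: one row per evaluation call (epoch, step, then the cosine metrics that also appear in the card). A minimal sketch of how such a file is produced; the sentence pair and label below are illustrative placeholders, not the actual eval split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("model")  # local checkpoint path, as in the card

# Placeholder pair with a 0/1 label; the real eval set is not shown in the commit.
evaluator = BinaryClassificationEvaluator(
    sentences1=["Quel est le montant du cofinancement proposé ?"],
    sentences2=["La Région cofinance entre 100.000€ et 120.000€ maximum."],
    labels=[1],
    name="BinaryClassifEval",  # -> binary_classification_evaluation_BinaryClassifEval_results.csv
)
# Writes/appends a row to eval/binary_classification_evaluation_BinaryClassifEval_results.csv
evaluator(model, output_path="eval", epoch=1, steps=3)
```

Note that the card's dataset statistics show the label column is 100% `1`; with only positive pairs, precision is pinned at 1.0, average precision at 1.0, and MCC degenerates to 0.0, consistent with the row values above.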
eval/similarity_evaluation_EmbeddingSimEval_results.csv
CHANGED
@@ -1,3 +1,4 @@
 epoch,steps,cosine_pearson,cosine_spearman
-1.0,
-
+1.0,3,nan,nan
+2.0,6,nan,nan
+2.0,6,nan,nan
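The `nan` values here have a mechanical explanation: this is the CSV written by the `EmbeddingSimEval` similarity evaluator, and since the gold labels are all `1` (per the card's dataset statistics), the Pearson and Spearman correlations against a constant gold vector are undefined. A quick illustration with scipy; the gold/pred numbers are made up:

```python
from scipy.stats import pearsonr, spearmanr

gold = [1, 1, 1, 1]              # every label identical, as in the card's stats
pred = [0.91, 0.78, 0.66, 0.83]  # arbitrary model similarity scores

# Zero variance in `gold` makes both correlations undefined; scipy emits a
# ConstantInputWarning and returns nan for each statistic.
print(pearsonr(gold, pred))
print(spearmanr(gold, pred))
```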
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9d4df157ba6eca35889b5d0e0fd56134c741af1af6b5a6f3ac6ec5978c0e2f07
 size 437951328
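The three lines shown for model.safetensors (and for the other binary files below) are Git LFS pointer files, not the weights themselves: `oid sha256:` holds the SHA-256 of the real file and `size` its byte count. A small sketch for verifying a downloaded file against its pointer:

```python
import hashlib

# Recompute the Git LFS pointer fields from the actual downloaded file.
digest = hashlib.sha256()
size = 0
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        digest.update(chunk)
        size += len(chunk)

print(f"oid sha256:{digest.hexdigest()}")  # should match the pointer's oid
print(f"size {size}")                      # should match 437951328
```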
runs/Feb26_12-20-27_algo-1/events.out.tfevents.1740572428.algo-1.63.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d26553c38f72374fc2672775367989f0d8d2c1479a41225b33e037ae8aad02a
+size 6866
runs/Feb26_12-20-27_algo-1/events.out.tfevents.1740572446.algo-1.63.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d7713644fb9c4d6d476fb44e9a52bc4fbb49e2b5518cf947d717f0bd3f50639b
+size 1166
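The two `runs/...` additions are TensorBoard event files logged during training. Once the repo is cloned with LFS, they can be read back with TensorBoard's event-processing API; a minimal sketch, with the caveat that the available tag names depend on what the trainer actually logged:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point the accumulator at the run directory containing the event files.
acc = EventAccumulator("runs/Feb26_12-20-27_algo-1")
acc.Reload()

tags = acc.Tags()["scalars"]        # scalar series names, e.g. a train-loss tag
print(tags)
for event in acc.Scalars(tags[0]):  # inspect the first logged scalar series
    print(event.step, event.value)
```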
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3e46ac03d9e38e52891b24ed972ac9dae3fcb9a769a7733626aeb8f746221a10
+size 5560
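training_args.bin is the pickled transformers `TrainingArguments` object that the Trainer saves alongside checkpoints. A sketch of inspecting it to confirm the values listed in the card, assuming a trusted copy of the file, since unpickling executes code on load:

```python
import torch

# Recent PyTorch defaults to weights_only=True, which rejects arbitrary pickles,
# so loading a TrainingArguments object requires opting out; trusted files only.
args = torch.load("training_args.bin", weights_only=False)

print(args.num_train_epochs)             # 2, per the updated card
print(args.lr_scheduler_type)            # cosine schedule
print(args.per_device_train_batch_size)  # 2
```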