rvo committed on
Commit
f242030
·
verified ·
1 Parent(s): cdf0a7b

Upload README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -20,11 +20,13 @@ language:
20
  </div>
21
  </div>
22
 
 
 
23
  **mdbr-leaf-ir** is a compact high-performance text embedding model specifically designed for **information retrieval (IR)** tasks.
24
 
25
  Enabling even greater efficiency, `mdbr-leaf-ir` supports [flexible asymmetric architectures](#asymmetric-retrieval-setup) and is robust to [vector quantization](#vector-quantization) and [MRL truncation](#mrl).
26
 
27
- If you are looking to perform other tasks such as classification, clustering, semantic sentence similarity, summarization, please check out our [`mdb-leaf-mt`](https://huggingface.co/MongoDB/mdb-leaf-mt) model.
28
 
29
  **Note**: this model has been developed by MongoDB Research and is not part of MongoDB's commercial offerings.
30
 
@@ -39,8 +41,7 @@ A technical report detailing our proposed `LEAF` training procedure is [availabl
39
  * **MRL and quantization support**: embedding vectors generated by `mdbr-leaf-ir` compress well when truncated (MRL) and/or are stored using more efficient types like `int8` and `binary`. [See below](#mrl) for more information.
40
 
41
 
42
- ## Performance
43
-
44
  ### Benchmark Results
45
 
46
  * Values are nDCG@10
@@ -58,7 +59,8 @@ A technical report detailing our proposed `LEAF` training procedure is [availabl
58
  | `BM25` | -- | 40.8 | 23.8 | 31.8 | 15.0 | 67.6 | 78.7 | 58.9 | 30.5 | 63.8 | 16.2 | 31.9 | 62.9 | 43.5 |
59
  | `SPLADE v2` | 110M | 47.9 | 33.6 | 33.4 | 15.8 | 69.3 | 83.8 | 71.0 | 52.1 | 78.6 | 23.5 | 43.5 | **68.4** | 51.7 |
60
  | `ColBERT v2` | 110M | 46.3 | 35.6 | 33.8 | 15.4 | 69.3 | 85.2 | 73.8 | 56.2 | 78.5 | 17.6 | **44.6** | 66.7 | 51.9 |
61
-
 
62
  ## Quickstart
63
 
64
  ### Sentence Transformers
@@ -250,6 +252,10 @@ print(f"* Similarities:\n{similarities}")
250
  # [ 76174 99127]]
251
  ```
252
 
 
 
 
 
253
 
254
  ## Citation
255
 
 
20
  </div>
21
  </div>
22
 
23
+ ## Introduction
24
+
25
  **mdbr-leaf-ir** is a compact high-performance text embedding model specifically designed for **information retrieval (IR)** tasks.
26
 
27
  Enabling even greater efficiency, `mdbr-leaf-ir` supports [flexible asymmetric architectures](#asymmetric-retrieval-setup) and is robust to [vector quantization](#vector-quantization) and [MRL truncation](#mrl).
28
 
29
+ If you are looking to perform other tasks such as classification, clustering, semantic sentence similarity, summarization, please check out our [`mdbr-leaf-mt`](https://huggingface.co/MongoDB/mdbr-leaf-mt) model.
30
 
31
  **Note**: this model has been developed by MongoDB Research and is not part of MongoDB's commercial offerings.
32
 
 
41
  * **MRL and quantization support**: embedding vectors generated by `mdbr-leaf-ir` compress well when truncated (MRL) and/or are stored using more efficient types like `int8` and `binary`. [See below](#mrl) for more information.
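The bullet above mentions MRL truncation and storage as `int8` or `binary`. The sketch below illustrates what those operations usually mean, using only the Python standard library. It is a minimal illustration under stated assumptions, not the model's own pipeline: the helper names (`mrl_truncate`, `quantize_int8`, `quantize_binary`), the toy 8-dimensional vector, and the per-vector symmetric scaling are all hypothetical choices made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mrl_truncate(vec, dims):
    """MRL-style truncation: keep the first `dims` components, then re-normalize."""
    return l2_normalize(vec[:dims])


def quantize_int8(vec):
    """Symmetric linear quantization to integers in [-127, 127].

    Assumption: the scale is taken from this single vector; a real setup
    would more likely calibrate the range on a sample of embeddings.
    """
    max_abs = max(abs(x) for x in vec)
    if max_abs == 0:
        return [0] * len(vec)
    scale = 127.0 / max_abs
    return [max(-127, min(127, round(x * scale))) for x in vec]


def quantize_binary(vec):
    """One bit per dimension: 1 for positive components, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vec]


# Toy 8-dimensional "embedding" (hypothetical values, not model output).
embedding = l2_normalize([0.12, -0.40, 0.33, 0.05, -0.27, 0.18, -0.09, 0.51])

truncated = mrl_truncate(embedding, 4)   # keep the first 4 dimensions
int8_vec = quantize_int8(truncated)      # 1 byte per dimension
binary_vec = quantize_binary(truncated)  # 1 bit per dimension

print(truncated)
print(int8_vec)
print(binary_vec)
```

Truncating first and quantizing second combines both savings: fewer dimensions, and fewer bits per dimension.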
42
 
43
 
44
+ <!-- ## Performance
 
45
  ### Benchmark Results
46
 
47
  * Values are nDCG@10
 
59
  | `BM25` | -- | 40.8 | 23.8 | 31.8 | 15.0 | 67.6 | 78.7 | 58.9 | 30.5 | 63.8 | 16.2 | 31.9 | 62.9 | 43.5 |
60
  | `SPLADE v2` | 110M | 47.9 | 33.6 | 33.4 | 15.8 | 69.3 | 83.8 | 71.0 | 52.1 | 78.6 | 23.5 | 43.5 | **68.4** | 51.7 |
61
  | `ColBERT v2` | 110M | 46.3 | 35.6 | 33.8 | 15.4 | 69.3 | 85.2 | 73.8 | 56.2 | 78.5 | 17.6 | **44.6** | 66.7 | 51.9 |
62
+ -->
63
+
64
  ## Quickstart
65
 
66
  ### Sentence Transformers
 
252
  # [ 76174 99127]]
253
  ```
254
 
255
+ ## Evaluation
256
+
257
+ Please refer to this <span style="color:red">TBD</span> script to replicate results (standard and asymmetric mode).
258
+ The checkpoint used to produce the scores presented in the paper [is here](https://huggingface.co/MongoDB/mdbr-leaf-ir/commit/ea98995e96beac21b820aa8ad9afaa6fd29b243d).
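
The benchmark values in this README are reported as nDCG@10. Since the replication script is still marked TBD, the following is a minimal sketch of how that metric is commonly computed for a single query. It assumes linear gain (the relevance label is used directly); some implementations use exponential gain instead, so this should not be read as the exact evaluation code behind the reported scores. The function names and the toy relevance labels are hypothetical.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """nDCG@k for one query.

    ranked_relevances: relevance labels of the retrieved documents, in ranked order.
    all_relevances: relevance labels of every judged document for the query,
        used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


# Toy query: two relevant documents exist, retrieved at ranks 1 and 3.
score = ndcg_at_k([1, 0, 1, 0, 0], [1, 1, 0, 0, 0], k=10)
print(round(score, 4))
```

A benchmark score is then the mean of this per-query value over all queries in a dataset.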
259
 
260
  ## Citation
261