thxCode committed on
Commit c1696c3 · 0 Parent(s)

feat: first commit


Signed-off-by: thxCode <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,37 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,170 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-classification
+ tags:
+ - transformers
+ - sentence-transformers
+ - text-embeddings-inference
+ language:
+ - af
+ - ar
+ - az
+ - be
+ - bg
+ - bn
+ - ca
+ - ceb
+ - cs
+ - cy
+ - da
+ - de
+ - el
+ - en
+ - es
+ - et
+ - eu
+ - fa
+ - fi
+ - fr
+ - gl
+ - gu
+ - he
+ - hi
+ - hr
+ - ht
+ - hu
+ - hy
+ - id
+ - is
+ - it
+ - ja
+ - jv
+ - ka
+ - kk
+ - km
+ - kn
+ - ko
+ - ky
+ - lo
+ - lt
+ - lv
+ - mk
+ - ml
+ - mn
+ - mr
+ - ms
+ - my
+ - ne
+ - nl
+ - 'no'
+ - pa
+ - pl
+ - pt
+ - qu
+ - ro
+ - ru
+ - si
+ - sk
+ - sl
+ - so
+ - sq
+ - sr
+ - sv
+ - sw
+ - ta
+ - te
+ - th
+ - tl
+ - tr
+ - uk
+ - ur
+ - vi
+ - yo
+ - zh
+ ---
+
+ # gte-multilingual-reranker-base-GGUF
+
+ **!!! Experimental: supported by [gpustack/llama-box v0.0.72+](https://github.com/gpustack/llama-box) only !!!**<br/>
+
+ **Model creator**: [Alibaba-NLP](https://huggingface.co/Alibaba-NLP)<br/>
+ **Original model**: [gte-multilingual-reranker-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base)<br/>
+ **GGUF quantization**: based on llama.cpp [f4d2b](https://github.com/ggerganov/llama.cpp/commit/f4d2b8846a6b34419ff9e9491aee6cd95e444bfc), as patched by llama-box
+
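+ Since these GGUF files target llama-box, the sketch below shows one hedged way a client might score query/document pairs over HTTP. It assumes a llama-box instance is already serving one of the GGUF files from this repository and exposes a Jina-style `/v1/rerank` endpoint on `localhost:8080`; the port, endpoint path, and payload field names are assumptions, so check the llama-box documentation for the exact API.
+
+ ```
+ # Hypothetical client sketch, not an official llama-box example.
+ # Assumptions: llama-box is running locally with a reranker GGUF loaded and
+ # exposes a Jina-style /v1/rerank endpoint on port 8080.
+ import requests
+
+ payload = {
+     "model": "gte-multilingual-reranker-base",
+     "query": "what is the capital of China?",
+     "documents": ["北京", "Introduction of quick sort"],
+ }
+ response = requests.post("http://localhost:8080/v1/rerank", json=payload, timeout=30)
+ response.raise_for_status()
+ print(response.json())  # expected to contain a relevance score per document
+ ```
+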
+ ## gte-multilingual-reranker-base
+
+ The **gte-multilingual-reranker-base** model is the first reranker model in the [GTE](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469) family of models, featuring several key attributes:
+ - **High Performance**: Achieves state-of-the-art (SOTA) results in multilingual retrieval tasks and multi-task representation model evaluations when compared to reranker models of similar size.
+ - **Training Architecture**: Trained with an encoder-only transformer architecture, resulting in a smaller model size. Unlike previous models based on decoder-only LLM architectures (e.g., gte-qwen2-1.5b-instruct), this model has lower hardware requirements for inference and offers a 10x increase in inference speed.
+ - **Long Context**: Supports text lengths up to **8192** tokens.
+ - **Multilingual Capability**: Supports over **70** languages.
+
+
+ ## Model Information
+ - Model Size: 306M
+ - Max Input Tokens: 8192
+
+
+ ### Usage
+ - **It is recommended to install xformers and enable unpadding for acceleration;
+ refer to [enable-unpadding-and-xformers](https://huggingface.co/Alibaba-NLP/new-impl#recommendation-enable-unpadding-and-acceleration-with-xformers).**
+ - **How to use it offline: [new-impl/discussions/2](https://huggingface.co/Alibaba-NLP/new-impl/discussions/2#662b08d04d8c3d0a09c88fa3) (a generic download sketch follows below).**
+
+
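+ As referenced in the offline-usage item above, a minimal sketch for fetching the model files ahead of time with `huggingface_hub`; this is a generic Hub download pattern, not necessarily the exact approach described in the linked discussion.
+
+ ```
+ # Generic offline-preparation sketch (assumption: huggingface_hub is installed).
+ # Download the repository once while online, then point transformers at the
+ # returned local directory so later loads do not hit the network.
+ from huggingface_hub import snapshot_download
+
+ local_dir = snapshot_download("Alibaba-NLP/gte-multilingual-reranker-base")
+ print(local_dir)  # pass this path to from_pretrained(...) instead of the repo id
+ ```
+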
+ Using Hugging Face Transformers (transformers>=4.36.0):
+ ```
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ model_name_or_path = "Alibaba-NLP/gte-multilingual-reranker-base"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+ model = AutoModelForSequenceClassification.from_pretrained(
+     model_name_or_path, trust_remote_code=True,
+     torch_dtype=torch.float16
+ )
+ model.eval()
+
+ pairs = [["中国的首都在哪儿", "北京"], ["what is the capital of China?", "北京"], ["how to implement quick sort in python?", "Introduction of quick sort"]]
+ with torch.no_grad():
+     inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
+     scores = model(**inputs, return_dict=True).logits.view(-1, ).float()
+     print(scores)
+
+ # tensor([1.2315, 0.5923, 0.3041])
+ ```
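+
+ The tags above also list sentence-transformers; a minimal cross-encoder sketch under that assumption (it relies on recent sentence-transformers releases forwarding `trust_remote_code` to the underlying transformers model, so verify against your installed version):
+
+ ```
+ # Alternative sketch using sentence-transformers' CrossEncoder wrapper.
+ # Assumption: a recent sentence-transformers release that accepts trust_remote_code.
+ from sentence_transformers import CrossEncoder
+
+ model = CrossEncoder(
+     "Alibaba-NLP/gte-multilingual-reranker-base",
+     trust_remote_code=True,
+     max_length=512,
+ )
+ pairs = [["what is the capital of China?", "北京"], ["how to implement quick sort in python?", "Introduction of quick sort"]]
+ scores = model.predict(pairs)  # one relevance score per (query, document) pair
+ print(scores)
+ ```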
+
+
+ ## Evaluation
+
+ Results of reranking based on multiple text retrieval datasets:
+
+ ![image](./images/mgte-reranker.png)
+
+ **More detailed experimental results can be found in the [paper](https://arxiv.org/pdf/2407.19669)**.
+
+ ## Cloud API Services
+
+ In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469) series models, the GTE series models are also available as commercial API services on Alibaba Cloud.
+
+ - [Embedding Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-embedding/): Three versions of the text embedding models are available: text-embedding-v1/v2/v3, with v3 being the latest API service.
+ - [ReRank Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-sorting-model/): The gte-rerank model service is available.
+
+ Note that the models behind the commercial APIs are not entirely identical to the open-source models.
+
+
+ ## Citation
+
+ If you find our paper or models helpful, please consider citing:
+
+ ```
+ @misc{zhang2024mgtegeneralizedlongcontexttext,
+   title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
+   author={Xin Zhang and Yanzhao Zhang and Dingkun Long and Wen Xie and Ziqi Dai and Jialong Tang and Huan Lin and Baosong Yang and Pengjun Xie and Fei Huang and Meishan Zhang and Wenjie Li and Min Zhang},
+   year={2024},
+   eprint={2407.19669},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL},
+   url={https://arxiv.org/abs/2407.19669},
+ }
+ ```
gte-multilingual-reranker-base-FP16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8034ddccba0fb2690f5e7b4137482f5d183aa1055e16b45fdee10f016d8d4624
+ size 621536672
gte-multilingual-reranker-base-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76a1d2e07272d6920cc0fc1bf159713c6b74932bf3be1ba5aaff8d3296e0ef21
+ size 208267808
gte-multilingual-reranker-base-Q3_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:333cb0f13c5cae67a36eaf0bcbead4fa2ef2432a7e4acc0408e1be8cdff522cb
+ size 223755296
gte-multilingual-reranker-base-Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6dfb19751a64ff7babd7e22d655ad4dedda61bc3a8aad000b41b88ee0ba6ac13
+ size 231353888
gte-multilingual-reranker-base-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3cb2d613d39dbefabc4af5cd03fadf1502719b8a75d5acda27821dc1455f215d
+ size 237657632
gte-multilingual-reranker-base-Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d54b27aca8f48429145c7f7829f22ae2705bd13d75cc55eb81c1c261d728e578
+ size 245583392
gte-multilingual-reranker-base-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dceaf0ab2eba92fd3f7569731ae3a961edafd0d0e924979bcd0fb149f8a9cf1e
+ size 250283552
gte-multilingual-reranker-base-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d838a92f36072514b8ff3695cadc601ece963655eb50dc1a9fa8b4c5cf15d7a
+ size 260702240
gte-multilingual-reranker-base-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:51c1df6a8830b219e6fffb5ddac85368198dbfaf6dca1a251caf0d1da14557c3
+ size 334780832
images/mgte-reranker.png ADDED