[AUTO] CVST Tokenizer Badger
A scripted PR to update the status of the `transformers` tokenizer.
> [!CAUTION]
> ⚠️
> The `transformers` tokenizer might give incorrect results, as it has not been tested by the Mistral team. To make sure that your encoding and decoding are correct, please use `mistral_common` as shown below:
## Encode and Decode with `mistral_common`
```py
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Path to the downloaded model weights (used by mistral_inference below)
mistral_models_path = "MISTRAL_MODELS_PATH"

# v3 tokenizer: the reference tokenizer for Mixtral-8x22B-Instruct-v0.1
tokenizer = MistralTokenizer.v3()

# Build a chat request and encode it into token ids
completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])
tokens = tokenizer.encode_chat_completion(completion_request).tokens
```
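As a quick sanity check, the encoded ids can be decoded straight back with the same `tokenizer.decode` call used for model outputs below. A minimal sketch; exact special-token handling during decoding is an assumption here:

```py
# Sketch: inspect the encoded prompt and round-trip it back to text
print(tokens[:10])               # first few token ids of the encoded request
print(tokenizer.decode(tokens))  # decoded prompt text (special-token handling may vary)
```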
## Inference with `mistral_inference`
```py
from mistral_inference.model import Transformer
from mistral_inference.generate import generate

# Load the model weights downloaded to mistral_models_path
model = Transformer.from_folder(mistral_models_path)

# Greedy decoding (temperature=0.0) for up to 64 new tokens
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)

# Decode with the mistral_common tokenizer
result = tokenizer.decode(out_tokens[0])
print(result)
```
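For repeated prompts, the encode/generate/decode round trip above can be wrapped in a small helper. This is a sketch only; `chat` is a hypothetical convenience function, not part of `mistral_inference`:

```py
# Hypothetical helper wrapping the round trip above; assumes `tokenizer` and
# `model` are already loaded as shown
def chat(prompt: str, max_tokens: int = 64) -> str:
    request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
    tokens = tokenizer.encode_chat_completion(request).tokens
    out_tokens, _ = generate(
        [tokens],
        model,
        max_tokens=max_tokens,
        temperature=0.0,  # greedy decoding, as in the example above
        eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
    )
    return tokenizer.decode(out_tokens[0])

print(chat("Explain Machine Learning to me in a nutshell."))
```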
## Inference with Hugging Face `transformers`
```py
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")
model.to("cuda")

# generate expects a batched tensor of token ids, not a plain Python list
input_ids = torch.tensor([tokens]).to("cuda")
generated_ids = model.generate(input_ids, max_new_tokens=1000, do_sample=True)

# decode with mistral tokenizer
result = tokenizer.decode(generated_ids[0].tolist())
print(result)
```
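Mixtral-8x22B is far too large for a single `model.to("cuda")` on most hardware. A sketch of a lower-memory load, assuming a multi-GPU machine with `accelerate` installed; `torch_dtype` and `device_map` are standard `from_pretrained` arguments:

```py
import torch
from transformers import AutoModelForCausalLM

# Load in bfloat16 and shard across available GPUs instead of one .to("cuda")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x22B-Instruct-v0.1",
    torch_dtype=torch.bfloat16,  # half the memory of float32
    device_map="auto",           # requires accelerate; shards layers over GPUs
)
```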
> [!TIP]
> PRs to correct the `transformers` tokenizer so that it matches the `mistral_common` reference implementation 1-to-1 are very welcome!
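For reference, a sketch of the parity check such a PR would need to satisfy: both tokenizers should produce identical ids for the same chat request. This assumes the Hub checkpoint ships a chat template; the test itself is hypothetical:

```py
from transformers import AutoTokenizer
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

prompt = "Explain Machine Learning to me in a nutshell."

# Reference ids from mistral_common
ref_tokens = MistralTokenizer.v3().encode_chat_completion(
    ChatCompletionRequest(messages=[UserMessage(content=prompt)])
).tokens

# Candidate ids from the transformers chat template
hf_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")
hf_tokens = hf_tokenizer.apply_chat_template([{"role": "user", "content": prompt}])

assert hf_tokens == ref_tokens, "transformers tokenization diverges from mistral_common"
```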
```diff
@@ -9,6 +9,62 @@ language:
 ---
 
 # Model Card for Mixtral-8x22B-Instruct-v0.1
+
+[56 added lines: the caution and the `mistral_common`, `mistral_inference`, and `transformers` sections shown above]
+
 The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the [Mixtral-8x22B-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1).
 
 ## Run the model
```