Update to use speculators command #1
by RelaxingSnorlax - opened
README.md CHANGED

````diff
@@ -31,13 +31,8 @@ This model should be used with the [meta-llama/Llama-4-Maverick-12B-128E-Instruc
 ## Use with vLLM
 
 ```bash
-vllm serve
-
-    --speculative-config '{
-        "model": "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3",
-        "num_speculative_tokens": 3,
-        "method": "eagle3"
-    }'
+vllm serve "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3" \
+    --tensor_parallel_size 8
 ```
 
 ## Evaluations
````
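Once the server is up, vLLM exposes an OpenAI-compatible API (by default on `localhost:8000`). A minimal sketch of a chat-completions request against it, assuming the default host/port and the model name from the diff (the actual network call is left commented out and would only succeed with the server running):

```python
import json
import urllib.request

# Chat-completions payload for vLLM's OpenAI-compatible endpoint.
# Model name matches the diff above; host/port are vLLM defaults and
# may need adjusting if the server was started with other settings.
payload = {
    "model": "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)         # uncomment with the server running
# print(json.load(resp)["choices"][0]["message"]["content"])
```

With a speculators-format checkpoint like this one, no separate `--speculative-config` is needed at serve time; the draft-model configuration is read from the checkpoint itself.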