Update to use speculators command #1
by RelaxingSnorlax - opened
README.md CHANGED

````diff
@@ -31,13 +31,8 @@ This model should be used with the [meta-llama/Llama-4-Maverick-12B-128E-Instruc
 ## Use with vLLM
 
 ```bash
-vllm serve
-
-    --speculative-config '{
-        "model": "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3",
-        "num_speculative_tokens": 3,
-        "method": "eagle3"
-    }'
+vllm serve "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3" \
+    --tensor_parallel_size 8
 ```
 
 ## Evaluations
````
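Once the server is up, vLLM exposes an OpenAI-compatible API (by default on `localhost:8000`). A minimal sketch of a chat-completions request against it, assuming the default host/port and the model name from the diff (the actual network call is left commented out and would only succeed with the server running):

```python
import json
import urllib.request

# Chat-completions payload for vLLM's OpenAI-compatible endpoint.
# Model name matches the diff above; host/port are vLLM defaults and
# may need adjusting if the server was started with other settings.
payload = {
    "model": "RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculators.eagle3",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)         # uncomment with the server running
# print(json.load(resp)["choices"][0]["message"]["content"])
```

With a speculators-format checkpoint like this one, no separate `--speculative-config` is needed at serve time; the draft-model configuration is read from the checkpoint itself.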