vllm
Files changed (2) hide show
  1. README.md +57 -74
  2. chat_template.jinja +0 -132
README.md CHANGED
@@ -44,81 +44,50 @@ Leanstral offers these capabilities:
44
 
45
  ### Mistral-Vibe
46
 
47
- Use `Leanstral 119B A6B` with [Mistral Vibe](https://github.com/mistralai/mistral-vibe). Install the latest version (2.5.0):
48
 
49
  ```sh
50
  uv pip install mistral-vibe --upgrade
51
-
52
- # make sure it's >= 2.5.0
53
- ```
54
-
55
- Leanstral can be added by starting `vibe` and simply running:
56
-
57
- ```
58
- /leanstall
59
  ```
60
 
61
- This will add `leanstral` as an additional model, add a system prompt (see [LEAD.md](https://huggingface.co/mistralai/Leanstral-2603/blob/main/LEAN.md)) as well as
62
- ensure `leanstral` can be used as a subagent.
63
-
64
-
65
- ![Screenshot 2026-03-16 at 18.03.39](https://cdn-uploads.huggingface.co/production/uploads/5dfcb1aada6d0311fd3d5448/Sm_mBI7u4XTjlKGzdXQqe.png)
66
-
67
- Then just press "tab+shift" a couple times until you see the new "lean" mode and `leanstral` model.
68
-
69
- ![Screenshot 2026-03-16 at 18.17.04](https://cdn-uploads.huggingface.co/production/uploads/5dfcb1aada6d0311fd3d5448/DHwtKamfj2QfMv0TkJK6G.png)
70
-
71
- **Local server**
72
-
73
- If instead of pinging the Mistral API, you want to use your local vLLM server, you can do the following:
74
- - 1. Spin up a vllm server as explained in [`Usage - vllm`](#vllm-recommended)
75
- - 2. Create a new agent file called `lean.toml` in `~/.vibe/agents`:
76
 
77
- ```sh
78
- mkdir ~/.vibe/agents/ && touch ~/.vibe/agents/lean.toml
79
- ```
80
-
81
- And then copy-paste the following config into `~/.vibe/agents/lean.toml`
82
 
83
  ```toml
84
- display_name = "Lean (local vLLM)"
85
- description = "Lean 4 mode using local vLLM"
86
- safety = "neutral"
87
-
88
- system_prompt_id = "lean"
89
- active_model = "leanstral"
90
-
91
  [[providers]]
92
  name = "vllm"
93
  api_base = "http://<your-host-url>:8000/v1"
94
- api_key_env_var = ""
95
- backend = "generic"
96
  reasoning_field_name = "reasoning_content"
97
-
98
  [[models]]
99
- name = "mistralai/Leanstral-2603"
100
- provider = "vllm"
101
  alias = "leanstral"
102
  thinking = "high"
103
  temperature = 1.0
104
- auto_compact_threshold = 168000
105
-
106
- [tools.bash]
107
- default_timeout = 1200
108
  ```
109
 
110
- **Note**: Make sure to overwrite `<your-host-url>` with your server's url.
111
 
112
- Then restart `vibe` and "tab-shift" to "lean" mode.
 
 
 
 
 
 
 
 
 
113
 
114
- Give it a try on some "lean" code such as, *e.g.*: [PrimeNumberTheoremAnd](https://github.com/AlexKontorovich/PrimeNumberTheoremAnd)
115
 
116
  ### Local Deployment
117
 
118
  The model can also be deployed with the following libraries, we advise everyone to use the Mistral AI API if the model is subpar with local serving:
119
  - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended).
120
  - [`transformers`](https://github.com/huggingface/transformers): WIP ⏳ - follow updates on [this PR](https://github.com/huggingface/transformers/pull/44760).
121
- - [`SGLang`](https://github.com/sgl-project/sglang): WIP ⏳ - follow updates on [this PR](https://github.com/sgl-project/sglang/pull/20708/)
122
 
123
  #### vLLM (recommended)
124
 
@@ -126,32 +95,52 @@ We recommend using this model with the [vLLM library](https://github.com/vllm-pr
126
 
127
  **_Installation_**
128
 
129
- 1. Make sure to install **vllm nightly**:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
130
 
131
- ```
132
- uv pip install -U vllm \
133
- --torch-backend=auto \
134
- --extra-index-url https://wheels.vllm.ai/nightly
135
- ```
136
 
137
- Doing so should automatically install [`mistral_common >= 1.11.0`](https://github.com/mistralai/mistral-common/releases/tag/v1.11.0).
138
-
139
- To check:
140
- ```
141
- python -c "import mistral_common; print(mistral_common.__version__)"
142
- ```
143
-
144
- You can also make use of a ready-to-go [docker image](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile) or on the [docker hub](https://hub.docker.com/layers/vllm/vllm-openai/nightly).
145
 
146
- 2. Install `transformers` from main:
 
 
 
147
 
148
- ```bash
149
- uv pip install git+https://github.com/huggingface/transformers.git
150
- ```
 
 
151
 
152
  **_Launch server_**
153
 
154
- We recommend that you use Leanstral in a server/client setting.
155
 
156
  ```
157
  vllm serve mistralai/Leanstral-2603 \
@@ -328,9 +317,3 @@ _Example Tool Calls_:
328
  `Function(arguments='{"code": "inductive State where\\n | idle\\n | busy\\n | error\\n\\ndef transition : State → State → Bool\\n | .idle, .busy => true\\n | .busy, .idle => true\\n | .busy, .error => true\\n | _, _ => false\\n\\n#eval transition .idle .busy"}', name='lean_run_code')`
329
 
330
  </details>
331
-
332
- ## License
333
-
334
- This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
335
-
336
- *You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.*
 
44
 
45
  ### Mistral-Vibe
46
 
47
+ Use `Leanstral 119B A6B` with [Mistral Vibe](https://github.com/mistralai/mistral-vibe). Install the latest version:
48
 
49
  ```sh
50
  uv pip install mistral-vibe --upgrade
 
 
 
 
 
 
 
 
51
  ```
52
 
53
+ **Add as a provider** In vibe, use the `/leanstall` command.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
54
 
55
+ **Local server (via vLLM):**
 
 
 
 
56
 
57
  ```toml
 
 
 
 
 
 
 
58
  [[providers]]
59
  name = "vllm"
60
  api_base = "http://<your-host-url>:8000/v1"
61
+ reasoning_as_structured_content = true
 
62
  reasoning_field_name = "reasoning_content"
 
63
  [[models]]
64
+ name = "labs-leanstral-2603"
65
+ provider = "mistral-testing"
66
  alias = "leanstral"
67
  thinking = "high"
68
  temperature = 1.0
 
 
 
 
69
  ```
70
 
71
+ **System prompt & agent**:
72
 
73
+ Add `~/.vibe/prompts/lean.toml` as in [LEAN.md](https://huggingface.co/mistralai/Leanstral-2603/blob/main/LEAN.md) and create `~/.vibe/agents/lean.toml`:
74
+
75
+ ```toml
76
+ name = "lean"
77
+ display_name = "Lean"
78
+ description = "Specialized mode for Lean 4 code analysis, proof assistance, and theorem proving"
79
+ safety = "neutral"
80
+ agent_type = "agent"
81
+ system_prompt_id = "lean"
82
+ ```
83
 
84
+ Example repository: [PrimeNumberTheoremAnd](https://github.com/AlexKontorovich/PrimeNumberTheoremAnd)
85
 
86
  ### Local Deployment
87
 
88
  The model can also be deployed with the following libraries, we advise everyone to use the Mistral AI API if the model is subpar with local serving:
89
  - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended).
90
  - [`transformers`](https://github.com/huggingface/transformers): WIP ⏳ - follow updates on [this PR](https://github.com/huggingface/transformers/pull/44760).
 
91
 
92
  #### vLLM (recommended)
93
 
 
95
 
96
  **_Installation_**
97
 
98
+ > [!Tip]
99
+ > We recommend installing vLLM from our custom Docker image that has fixes for
100
+ > Tool Calling and Reasoning parsing in vLLM and uses the latest version of Transformers.
101
+ > We're working with the vLLM team to merge these fixes to main as soon as possible.
102
+
103
+ **_Custom Docker_**
104
+
105
+ Make sure to use the following docker image [mistralllm/vllm-ms4:latest](https://hub.docker.com/repository/docker/mistralllm/vllm-ms4/latest/):
106
+
107
+ ```
108
+ docker pull mistralllm/vllm-ms4:latest
109
+ docker run -it mistralllm/vllm-ms4:latest
110
+ ```
111
+
112
+ **_Manual Install_**
113
+
114
+ If you prefer, you can also manually install `vllm` from this PR: [Add Mistral Guidance](https://github.com/vllm-project/vllm/pull/37081).
115
+
116
+ **Note**:
117
+ It is likely that this PR will be split into smaller PRs and merged to `vllm` main in the coming 1-2 weeks (Stand: 16.03.2026).
118
+ Check latest developments directly on the [PR](https://github.com/vllm-project/vllm/pull/37081).
119
 
120
+ 1. Git clone vLLM:
121
+ ```
122
+ git clone --branch fix_mistral_parsing https://github.com/juliendenize/vllm.git
123
+ ```
 
124
 
125
+ 2. Install with pre-compiled kernels
126
+ ```
127
+ VLLM_USE_PRECOMPILED=1 pip install --editable .
128
+ ```
 
 
 
 
129
 
130
+ 3. Make sure, `transformers` is installed from "main":
131
+ ```
132
+ pip install git+https://github.com/huggingface/transformers.git
133
+ ```
134
 
135
+ Also make sure to have installed [`mistral_common >= 1.10.0`](https://github.com/mistralai/mistral-common/releases/tag/v1.10.0).
136
+ To check:
137
+ ```
138
+ python -c "import mistral_common; print(mistral_common.__version__)"
139
+ ```
140
 
141
  **_Launch server_**
142
 
143
+ We recommand that you use Leanstral in a server/client setting.
144
 
145
  ```
146
  vllm serve mistralai/Leanstral-2603 \
 
317
  `Function(arguments='{"code": "inductive State where\\n | idle\\n | busy\\n | error\\n\\ndef transition : State → State → Bool\\n | .idle, .busy => true\\n | .busy, .idle => true\\n | .busy, .error => true\\n | _, _ => false\\n\\n#eval transition .idle .busy"}', name='lean_run_code')`
318
 
319
  </details>
 
 
 
 
 
 
chat_template.jinja DELETED
@@ -1,132 +0,0 @@
1
- {#- Default system message if no system prompt is passed. #}
2
- {%- set default_system_message = '' %}
3
-
4
- {#- Begin of sequence token. #}
5
- {{- '<s>' }}
6
-
7
- {#- Handle system prompt if it exists. #}
8
- {#- System prompt supports text content or text chunks. #}
9
- {%- if messages[0]['role'] == 'system' %}
10
- {{- '[SYSTEM_PROMPT]' -}}
11
- {%- if messages[0]['content'] is string %}
12
- {{- messages[0]['content'] -}}
13
- {%- else %}
14
- {%- for block in messages[0]['content'] %}
15
- {%- if block['type'] == 'text' %}
16
- {{- block['text'] }}
17
- {%- else %}
18
- {{- raise_exception('Only text chunks are supported in system message contents.') }}
19
- {%- endif %}
20
- {%- endfor %}
21
- {%- endif %}
22
- {{- '[/SYSTEM_PROMPT]' -}}
23
- {%- set loop_messages = messages[1:] %}
24
- {%- else %}
25
- {%- set loop_messages = messages %}
26
- {%- if default_system_message != '' %}
27
- {{- '[SYSTEM_PROMPT]' + default_system_message + '[/SYSTEM_PROMPT]' }}
28
- {%- endif %}
29
- {%- endif %}
30
-
31
-
32
- {#- Tools definition #}
33
- {%- set tools_definition = '' %}
34
- {%- set has_tools = false %}
35
- {%- if tools is defined and tools is not none and tools|length > 0 %}
36
- {%- set has_tools = true %}
37
- {%- set tools_definition = '[AVAILABLE_TOOLS]' + (tools| tojson) + '[/AVAILABLE_TOOLS]' %}
38
- {{- tools_definition }}
39
- {%- endif %}
40
-
41
- {#- Model settings definition #}
42
- {%- set reasoning_effort = reasoning_effort if reasoning_effort is defined and reasoning_effort is not none else 'none' %}
43
- {%- if reasoning_effort not in ['none', 'high'] %}
44
- {{- raise_exception('reasoning_effort must be either "none" or "high"') }}
45
- {%- endif %}
46
- {%- set model_settings = '[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort + '"}[/MODEL_SETTINGS]' %}
47
- {{- model_settings }}
48
-
49
- {#- Checks for alternating user/assistant messages. #}
50
- {%- set ns = namespace(index=0) %}
51
- {%- for message in loop_messages %}
52
- {%- if message.role == 'user' or (message.role == 'assistant' and (message.tool_calls is not defined or message.tool_calls is none or message.tool_calls | length == 0)) %}
53
- {%- if (message['role'] == 'user') != (ns.index % 2 == 0) %}
54
- {{- raise_exception('After the optional system message, conversation roles must alternate user and assistant roles except for tool calls and results.') }}
55
- {%- endif %}
56
- {%- set ns.index = ns.index + 1 %}
57
- {%- endif %}
58
- {%- endfor %}
59
-
60
- {#- Handle conversation messages. #}
61
- {%- for message in loop_messages %}
62
-
63
- {#- User messages supports text content or text and image chunks. #}
64
- {%- if message['role'] == 'user' %}
65
- {%- if message['content'] is string %}
66
- {{- '[INST]' + message['content'] + '[/INST]' }}
67
- {%- elif message['content'] | length > 0 %}
68
- {{- '[INST]' }}
69
- {%- if message['content'] | length == 2 %}
70
- {%- set blocks = message['content'] | sort(attribute='type') %}
71
- {%- else %}
72
- {%- set blocks = message['content'] %}
73
- {%- endif %}
74
- {%- for block in blocks %}
75
- {%- if block['type'] == 'text' %}
76
- {{- block['text'] }}
77
- {%- elif block['type'] in ['image', 'image_url'] %}
78
- {{- '[IMG]' }}
79
- {%- else %}
80
- {{- raise_exception('Only text, image and image_url chunks are supported in user message content.') }}
81
- {%- endif %}
82
- {%- endfor %}
83
- {{- '[/INST]' }}
84
- {%- else %}
85
- {{- raise_exception('User message must have a string or a list of chunks in content') }}
86
- {%- endif %}
87
-
88
- {#- Assistant messages supports text content or text, image and thinking chunks. #}
89
- {%- elif message['role'] == 'assistant' %}
90
- {%- if (message['content'] is none or message['content'] == '' or message['content']|length == 0) and (message['tool_calls'] is not defined or message['tool_calls'] is none or message['tool_calls']|length == 0) %}
91
- {{- raise_exception('Assistant message must have a string or a list of chunks in content or a list of tool calls.') }}
92
- {%- endif %}
93
-
94
- {%- if message['content'] is string and message['content'] != '' %}
95
- {{- message['content'] }}
96
- {%- elif message['content'] | length > 0 %}
97
- {%- for block in message['content'] %}
98
- {%- if block['type'] == 'text' %}
99
- {{- block['text'] }}
100
- {%- elif block['type'] == 'thinking' %}
101
- {{- '[THINK]' + block['thinking'] + '[/THINK]' }}
102
- {%- else %}
103
- {{- raise_exception('Only text and thinking chunks are supported in assistant message contents.') }}
104
- {%- endif %}
105
- {%- endfor %}
106
- {%- endif %}
107
-
108
- {%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 %}
109
- {%- for tool in message['tool_calls'] %}
110
- {{- '[TOOL_CALLS]' }}
111
- {%- set name = tool['function']['name'] %}
112
- {%- set arguments = tool['function']['arguments'] %}
113
- {%- if arguments is not string %}
114
- {%- set arguments = arguments|tojson|safe %}
115
- {%- elif arguments == '' %}
116
- {%- set arguments = '{}' %}
117
- {%- endif %}
118
- {{- name + '[ARGS]' + arguments }}
119
- {%- endfor %}
120
- {%- endif %}
121
-
122
- {{- '</s>' }}
123
-
124
- {#- Tool messages only supports text content. #}
125
- {%- elif message['role'] == 'tool' %}
126
- {{- '[TOOL_RESULTS]' + message['content']|string + '[/TOOL_RESULTS]' }}
127
-
128
- {#- Raise exception for unsupported roles. #}
129
- {%- else %}
130
- {{- raise_exception('Only user, assistant and tool roles are supported, got ' + message['role'] + '.') }}
131
- {%- endif %}
132
- {%- endfor %}