mistralai/Leanstral-2603

Small edit

by jasonrute - opened Mar 16

base: refs/heads/main ← from: refs/pr/2

+57 -206

Files changed (2)

README.md +57 -74
chat_template.jinja +0 -132

README.md CHANGED

@@ -44,81 +44,50 @@ Leanstral offers these capabilities:
 ### Mistral-Vibe
-Use `Leanstral 119B A6B` with [Mistral Vibe](https://github.com/mistralai/mistral-vibe). Install the latest version (2.5.0):
 ```sh
 uv pip install mistral-vibe --upgrade
-# make sure it's >= 2.5.0
-```
-Leanstral can be added by starting `vibe` and simply running:
-```
-/leanstall
 ```
-This will add `leanstral` as an additional model, add a system prompt (see [LEAN.md](https://huggingface.co/mistralai/Leanstral-2603/blob/main/LEAN.md)) as well as
-ensure `leanstral` can be used as a subagent.
-![Screenshot 2026-03-16 at 18.03.39](https://cdn-uploads.huggingface.co/production/uploads/5dfcb1aada6d0311fd3d5448/Sm_mBI7u4XTjlKGzdXQqe.png)
-Then just press "tab+shift" a couple times until you see the new "lean" mode and `leanstral` model.
-![Screenshot 2026-03-16 at 18.17.04](https://cdn-uploads.huggingface.co/production/uploads/5dfcb1aada6d0311fd3d5448/DHwtKamfj2QfMv0TkJK6G.png)
-**Local server**
-If instead of pinging the Mistral API, you want to use your local vLLM server, you can do the following:
-- 1. Spin up a vllm server as explained in [`Usage - vllm`](#vllm-recommended)
-- 2. Create a new agent file called `lean.toml` in `~/.vibe/agents`:
-```sh
-mkdir ~/.vibe/agents/ && touch ~/.vibe/agents/lean.toml
-```
-And then copy-paste the following config into `~/.vibe/agents/lean.toml`
 ```toml
-display_name = "Lean (local vLLM)"
-description = "Lean 4 mode using local vLLM"
-safety = "neutral"
-system_prompt_id = "lean"
-active_model = "leanstral"
 [[providers]]
 name = "vllm"
 api_base = "http://<your-host-url>:8000/v1"
-api_key_env_var = ""
-backend = "generic"
 reasoning_field_name = "reasoning_content"
 [[models]]
-name = "mistralai/Leanstral-2603"
-provider = "vllm"
 alias = "leanstral"
 thinking = "high"
 temperature = 1.0
-auto_compact_threshold = 168000
-[tools.bash]
-default_timeout = 1200
 ```
-**Note**: Make sure to overwrite `<your-host-url>` with your server's url.
-Then restart `vibe` and "tab-shift" to "lean" mode.
-Give it a try on some "lean" code such as, *e.g.*: [PrimeNumberTheoremAnd](https://github.com/AlexKontorovich/PrimeNumberTheoremAnd)
 ### Local Deployment
 The model can also be deployed with the libraries below. If local serving gives subpar results, we advise using the Mistral AI API instead:
 - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended).
 - [`transformers`](https://github.com/huggingface/transformers): WIP ⏳ - follow updates on [this PR](https://github.com/huggingface/transformers/pull/44760).
-- [`SGLang`](https://github.com/sgl-project/sglang): WIP ⏳ - follow updates on [this PR](https://github.com/sgl-project/sglang/pull/20708/)
 #### vLLM (recommended)
@@ -126,32 +95,52 @@ We recommend using this model with the [vLLM library](https://github.com/vllm-pr
 **_Installation_**
-1. Make sure to install **vllm nightly**:
-   ```
-   uv pip install -U vllm \
-       --torch-backend=auto \
-       --extra-index-url https://wheels.vllm.ai/nightly
-   ```
-   Doing so should automatically install [`mistral_common >= 1.11.0`](https://github.com/mistralai/mistral-common/releases/tag/v1.11.0).
-   To check:
-   ```
-   python -c "import mistral_common; print(mistral_common.__version__)"
-   ```
-   You can also make use of a ready-to-go [docker image](https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile) or on the [docker hub](https://hub.docker.com/layers/vllm/vllm-openai/nightly).
-2. Install `transformers` from main:
-   ```bash
-   uv pip install git+https://github.com/huggingface/transformers.git
-   ```
 **_Launch server_**
-We recommend that you use Leanstral in a server/client setting.
 ```
 vllm serve mistralai/Leanstral-2603 \
@@ -328,9 +317,3 @@ _Example Tool Calls_:
 `Function(arguments='{"code": "inductive State where\\n  | idle\\n  | busy\\n  | error\\n\\ndef transition : State → State → Bool\\n  | .idle, .busy => true\\n  | .busy, .idle => true\\n  | .busy, .error => true\\n  | _, _ => false\\n\\n#eval transition .idle .busy"}', name='lean_run_code')`
 </details>
-## License
-This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
-*You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.*

 ### Mistral-Vibe
+Use `Leanstral 119B A6B` with [Mistral Vibe](https://github.com/mistralai/mistral-vibe). Install the latest version:
 ```sh
 uv pip install mistral-vibe --upgrade
 ```
+**Add as a provider:** In vibe, use the `/leanstall` command.
+**Local server (via vLLM):**
 ```toml
 [[providers]]
 name = "vllm"
 api_base = "http://<your-host-url>:8000/v1"
+reasoning_as_structured_content = true
 reasoning_field_name = "reasoning_content"
 [[models]]
+name = "labs-leanstral-2603"
+provider = "mistral-testing"
 alias = "leanstral"
 thinking = "high"
 temperature = 1.0
 ```
+**System prompt & agent**:
+Add `~/.vibe/prompts/lean.toml` as in [LEAN.md](https://huggingface.co/mistralai/Leanstral-2603/blob/main/LEAN.md) and create `~/.vibe/agents/lean.toml`:
+```toml
+name = "lean"
+display_name = "Lean"
+description = "Specialized mode for Lean 4 code analysis, proof assistance, and theorem proving"
+safety = "neutral"
+agent_type = "agent"
+system_prompt_id = "lean"
+```
+Example repository: [PrimeNumberTheoremAnd](https://github.com/AlexKontorovich/PrimeNumberTheoremAnd)
 ### Local Deployment
 The model can also be deployed with the libraries below. If local serving gives subpar results, we advise using the Mistral AI API instead:
 - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended).
 - [`transformers`](https://github.com/huggingface/transformers): WIP ⏳ - follow updates on [this PR](https://github.com/huggingface/transformers/pull/44760).
 #### vLLM (recommended)
 **_Installation_**
+> [!Tip]
+> We recommend installing vLLM from our custom Docker image that has fixes for
+> Tool Calling and Reasoning parsing in vLLM and uses the latest version of Transformers.
+> We're working with the vLLM team to merge these fixes to main as soon as possible.
+**_Custom Docker_**
+Make sure to use the following docker image [mistralllm/vllm-ms4:latest](https://hub.docker.com/repository/docker/mistralllm/vllm-ms4/latest/):
+```
+docker pull mistralllm/vllm-ms4:latest
+docker run -it mistralllm/vllm-ms4:latest
+```
+**_Manual Install_**
+If you prefer, you can also manually install `vllm` from this PR: [Add Mistral Guidance](https://github.com/vllm-project/vllm/pull/37081).
+**Note**:
+It is likely that this PR will be split into smaller PRs and merged to `vllm` main in the coming 1-2 weeks (as of 16.03.2026).
+Check latest developments directly on the [PR](https://github.com/vllm-project/vllm/pull/37081).
+1. Git clone vLLM:
+```
+git clone --branch fix_mistral_parsing https://github.com/juliendenize/vllm.git
+```
+2. Install with pre-compiled kernels
+```
+VLLM_USE_PRECOMPILED=1 pip install --editable .
+```
+3. Make sure `transformers` is installed from `main`:
+```
+pip install git+https://github.com/huggingface/transformers.git
+```
+Also make sure to have installed [`mistral_common >= 1.10.0`](https://github.com/mistralai/mistral-common/releases/tag/v1.10.0).
+To check:
+```
+python -c "import mistral_common; print(mistral_common.__version__)"
+```
 **_Launch server_**
+We recommend that you use Leanstral in a server/client setting.
 ```
 vllm serve mistralai/Leanstral-2603 \
 `Function(arguments='{"code": "inductive State where\\n  | idle\\n  | busy\\n  | error\\n\\ndef transition : State → State → Bool\\n  | .idle, .busy => true\\n  | .busy, .idle => true\\n  | .busy, .error => true\\n  | _, _ => false\\n\\n#eval transition .idle .busy"}', name='lean_run_code')`
 </details>
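
The new README asks for `mistral_common >= 1.10.0` and checks it by printing the version. A minimal sketch of a programmatic check follows, assuming a plain `major.minor.patch` version string; the helper names `parse_version`, `meets_minimum`, and `check_mistral_common` are hypothetical and not part of any library. Comparing integer tuples rather than strings matters here because `"1.9.2" > "1.10.0"` as strings.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0); a suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "1.10.0") -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_mistral_common(minimum: str = "1.10.0") -> str:
    """Report whether the installed mistral_common satisfies the minimum."""
    try:
        installed = metadata.version("mistral_common")
    except metadata.PackageNotFoundError:
        return f"mistral_common is not installed (need >= {minimum})"
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    return f"mistral_common {installed}: {status} (need >= {minimum})"


if __name__ == "__main__":
    print(check_mistral_common())
```

The default minimum mirrors the `1.10.0` stated on the added side of this diff; the removed side required `1.11.0`, so pass the value that matches the README revision you follow.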

chat_template.jinja DELETED

@@ -1,132 +0,0 @@
-{#- Default system message if no system prompt is passed. #}
-{%- set default_system_message = '' %}
-{#- Begin of sequence token. #}
-{{- '<s>' }}
-{#- Handle system prompt if it exists. #}
-{#- System prompt supports text content or text chunks. #}
-{%- if messages[0]['role'] == 'system' %}
-    {{- '[SYSTEM_PROMPT]' -}}
-    {%- if messages[0]['content'] is string %}
-        {{- messages[0]['content'] -}}
-    {%- else %}
-        {%- for block in messages[0]['content'] %}
-            {%- if block['type'] == 'text' %}
-                {{- block['text'] }}
-            {%- else %}
-                {{- raise_exception('Only text chunks are supported in system message contents.') }}
-            {%- endif %}
-        {%- endfor %}
-    {%- endif %}
-    {{- '[/SYSTEM_PROMPT]' -}}
-    {%- set loop_messages = messages[1:] %}
-{%- else %}
-    {%- set loop_messages = messages %}
-    {%- if default_system_message != '' %}
-        {{- '[SYSTEM_PROMPT]' + default_system_message + '[/SYSTEM_PROMPT]' }}
-    {%- endif %}
-{%- endif %}
-{#- Tools definition #}
-{%- set tools_definition = '' %}
-{%- set has_tools = false %}
-{%- if tools is defined and tools is not none and tools|length > 0 %}
-    {%- set has_tools = true %}
-    {%- set tools_definition = '[AVAILABLE_TOOLS]' + (tools| tojson) + '[/AVAILABLE_TOOLS]' %}
-    {{- tools_definition }}
-{%- endif %}
-{#- Model settings definition #}
-{%- set reasoning_effort = reasoning_effort if reasoning_effort is defined and reasoning_effort is not none else 'none' %}
-{%- if reasoning_effort not in ['none', 'high'] %}
-    {{- raise_exception('reasoning_effort must be either "none" or "high"') }}
-{%- endif %}
-{%- set model_settings = '[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort + '"}[/MODEL_SETTINGS]' %}
-{{- model_settings }}
-{#- Checks for alternating user/assistant messages. #}
-{%- set ns = namespace(index=0) %}
-{%- for message in loop_messages %}
-    {%- if message.role == 'user' or (message.role == 'assistant' and (message.tool_calls is not defined or message.tool_calls is none or message.tool_calls | length == 0)) %}
-        {%- if (message['role'] == 'user') != (ns.index % 2 == 0) %}
-            {{- raise_exception('After the optional system message, conversation roles must alternate user and assistant roles except for tool calls and results.') }}
-        {%- endif %}
-        {%- set ns.index = ns.index + 1 %}
-    {%- endif %}
-{%- endfor %}
-{#- Handle conversation messages. #}
-{%- for message in loop_messages %}
-    {#- User messages supports text content or text and image chunks. #}
-    {%- if message['role'] == 'user' %}
-        {%- if message['content'] is string %}
-            {{- '[INST]' + message['content'] + '[/INST]' }}
-        {%- elif message['content'] | length > 0 %}
-            {{- '[INST]' }}
-            {%- if message['content'] | length == 2 %}
-                {%- set blocks = message['content'] | sort(attribute='type') %}
-            {%- else %}
-                {%- set blocks = message['content'] %}
-            {%- endif %}
-            {%- for block in blocks %}
-                {%- if block['type'] == 'text' %}
-                    {{- block['text'] }}
-                {%- elif block['type'] in ['image', 'image_url'] %}
-                    {{- '[IMG]' }}
-                {%- else %}
-                    {{- raise_exception('Only text, image and image_url chunks are supported in user message content.') }}
-                {%- endif %}
-            {%- endfor %}
-            {{- '[/INST]' }}
-        {%- else %}
-            {{- raise_exception('User message must have a string or a list of chunks in content') }}
-        {%- endif %}
-    {#- Assistant messages supports text content or text, image and thinking chunks. #}
-    {%- elif message['role'] == 'assistant' %}
-        {%- if (message['content'] is none or message['content'] == '' or message['content']|length == 0) and (message['tool_calls'] is not defined or message['tool_calls'] is none or message['tool_calls']|length == 0) %}
-            {{- raise_exception('Assistant message must have a string or a list of chunks in content or a list of tool calls.') }}
-        {%- endif %}
-        {%- if message['content'] is string and message['content'] != '' %}
-            {{- message['content'] }}
-        {%- elif message['content'] | length > 0 %}
-            {%- for block in message['content'] %}
-                {%- if block['type'] == 'text' %}
-                    {{- block['text'] }}
-                {%- elif block['type'] == 'thinking' %}
-                    {{- '[THINK]' + block['thinking'] + '[/THINK]' }}
-                {%- else %}
-                    {{- raise_exception('Only text and thinking chunks are supported in assistant message contents.') }}
-                {%- endif %}
-            {%- endfor %}
-        {%- endif %}
-        {%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 %}
-            {%- for tool in message['tool_calls'] %}
-                {{- '[TOOL_CALLS]' }}
-                {%- set name = tool['function']['name'] %}
-                {%- set arguments = tool['function']['arguments'] %}
-                {%- if arguments is not string %}
-                    {%- set arguments = arguments|tojson|safe %}
-                {%- elif arguments == '' %}
-                    {%- set arguments = '{}' %}
-                {%- endif %}
-                {{- name + '[ARGS]' + arguments }}
-            {%- endfor %}
-        {%- endif %}
-        {{- '</s>' }}
-    {#- Tool messages only supports text content. #}
-    {%- elif message['role'] == 'tool' %}
-        {{- '[TOOL_RESULTS]' + message['content']|string + '[/TOOL_RESULTS]' }}
-    {#- Raise exception for unsupported roles. #}
-    {%- else %}
-        {{- raise_exception('Only user, assistant and tool roles are supported, got ' + message['role'] + '.') }}
-    {%- endif %}
-{%- endfor %}
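
For reviewers who want to see what the deleted template produced, here is a plain-Python sketch of its logic. It is an approximation under stated assumptions, not the tokenizer's implementation: it handles string message content only (no text, image, or thinking chunks), skips the unused default system message, and uses `json.dumps`, whose output may differ in detail from Jinja's `tojson`. The function name `render_prompt` is hypothetical.

```python
import json


def render_prompt(messages, tools=None, reasoning_effort="none"):
    """Approximate the deleted chat_template.jinja for string-only content."""
    if reasoning_effort not in ("none", "high"):
        raise ValueError('reasoning_effort must be either "none" or "high"')

    out = ["<s>"]
    # Optional system prompt comes first and is removed from the loop.
    if messages and messages[0]["role"] == "system":
        out.append("[SYSTEM_PROMPT]" + messages[0]["content"] + "[/SYSTEM_PROMPT]")
        messages = messages[1:]
    if tools:
        out.append("[AVAILABLE_TOOLS]" + json.dumps(tools) + "[/AVAILABLE_TOOLS]")
    out.append('[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort
               + '"}[/MODEL_SETTINGS]')

    # Roles must alternate user/assistant; tool calls and results are exempt.
    index = 0
    for m in messages:
        plain_assistant = m["role"] == "assistant" and not m.get("tool_calls")
        if m["role"] == "user" or plain_assistant:
            if (m["role"] == "user") != (index % 2 == 0):
                raise ValueError("conversation roles must alternate user and assistant")
            index += 1

    for m in messages:
        role = m["role"]
        if role == "user":
            out.append("[INST]" + m["content"] + "[/INST]")
        elif role == "assistant":
            content = m.get("content") or ""
            calls = m.get("tool_calls") or []
            if not content and not calls:
                raise ValueError("assistant message needs content or tool calls")
            out.append(content)
            for call in calls:
                args = call["function"]["arguments"]
                if not isinstance(args, str):
                    args = json.dumps(args)
                elif args == "":
                    args = "{}"
                out.append("[TOOL_CALLS]" + call["function"]["name"] + "[ARGS]" + args)
            out.append("</s>")
        elif role == "tool":
            out.append("[TOOL_RESULTS]" + str(m["content"]) + "[/TOOL_RESULTS]")
        else:
            raise ValueError("unsupported role: " + role)
    return "".join(out)


print(render_prompt([{"role": "user", "content": "hi"}], reasoning_effort="high"))
# prints: <s>[MODEL_SETTINGS]{"reasoning_effort": "high"}[/MODEL_SETTINGS][INST]hi[/INST]
```

Note that the model settings block is always emitted, even when no reasoning effort is passed, in which case it defaults to `"none"`.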