opendatalab/MinerU2.5-2509-1.2B

hotelll committed on Sep 26

Commit 3d99352 · verified · 1 parent: dd98864

Update README.md

Files changed (1)

README.md +41 -1

@@ -56,6 +56,8 @@ MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Docum
 # Quick Start
 For convenience, we provide `mineru-vl-utils`, a Python package that simplifies the process of sending requests and handling responses from MinerU2.5 Vision-Language Model. Here we give some examples to use MinerU2.5. For more information and usages, please refer to [mineru-vl-utils](https://github.com/opendatalab/mineru-vl-utils/tree/main).
 ## Install packages
 ```bash
 # For `transformers` backend
@@ -94,7 +96,7 @@ extracted_blocks = client.two_step_extract(image)
 print(extracted_blocks)
 ```
-## `vllm-engine` Example
 ```python
 from vllm import LLM
@@ -117,6 +119,44 @@ extracted_blocks = client.two_step_extract(image)
 print(extracted_blocks)
 ```
 # Model Architecture
 <p align="center">
     <img alt="Image" src="https://hotelll.github.io/MinerU2.5/Mineru25_framework.jpeg"/>

 # Quick Start
 For convenience, we provide `mineru-vl-utils`, a Python package that simplifies the process of sending requests and handling responses from MinerU2.5 Vision-Language Model. Here we give some examples to use MinerU2.5. For more information and usages, please refer to [mineru-vl-utils](https://github.com/opendatalab/mineru-vl-utils/tree/main).
+📌 We strongly recommend using vllm for inference, as the `vllm-async-engine` can achieve a concurrent inference speed of **2.12 fps** on one A100.
 ## Install packages
 ```bash
 # For `transformers` backend
 print(extracted_blocks)
 ```
+## `vllm-engine` Example (Recommended!)
 ```python
 from vllm import LLM
 print(extracted_blocks)
 ```
+## `vllm-async-engine` Example (Recommended!)
+```python
+import io
+import asyncio
+import aiofiles
+from vllm.v1.engine.async_llm import AsyncLLM
+from vllm.engine.arg_utils import AsyncEngineArgs
+from PIL import Image
+from mineru_vl_utils import MinerUClient
+from mineru_vl_utils import MinerULogitsProcessor  # if vllm>=0.10.1
+async_llm = AsyncLLM.from_engine_args(
+    AsyncEngineArgs(
+        model="opendatalab/MinerU2.5-2509-1.2B",
+        logits_processors=[MinerULogitsProcessor]  # if vllm>=0.10.1
+    )
+)
+client = MinerUClient(
+  backend="vllm-async-engine",
+  vllm_async_llm=async_llm,
+)
+async def main():
+    image_path = "/path/to/the/test/image.png"
+    async with aiofiles.open(image_path, "rb") as f:
+        image_data = await f.read()
+    image = Image.open(io.BytesIO(image_data))
+    extracted_blocks = await client.aio_two_step_extract(image)
+    print(extracted_blocks)
+asyncio.run(main())
+async_llm.shutdown()
+```
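
The 2.12 fps figure quoted in this change is a concurrent throughput, so the async client only pays off when several pages are in flight at once. The following is a minimal sketch, not part of this commit, of one way to fan out requests over the `aio_two_step_extract` coroutine shown in the example. The helper name `extract_many` and its `max_concurrency` parameter are hypothetical, introduced here for illustration; the only thing assumed about the client is that it exposes `aio_two_step_extract(image)` as in the README.

```python
import asyncio


async def extract_many(client, images, max_concurrency=8):
    """Run client.aio_two_step_extract over many images with bounded concurrency.

    `extract_many` and `max_concurrency` are hypothetical names for this sketch;
    only `aio_two_step_extract` comes from the README example.
    """
    # Cap how many requests are awaiting the engine at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def extract_one(image):
        async with semaphore:
            return await client.aio_two_step_extract(image)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(extract_one(image) for image in images))
```

Inside the README's `main()` this would be called as `results = await extract_many(client, images)`, with `images` being a list of `PIL.Image` objects. The limit of 8 is an arbitrary starting point rather than a tuned value; the right number likely depends on GPU memory and page size.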
 # Model Architecture
 <p align="center">
     <img alt="Image" src="https://hotelll.github.io/MinerU2.5/Mineru25_framework.jpeg"/>