---
license: apache-2.0
language:
- en
base_model:
- ibm-granite/granite-docling-258M
pipeline_tag: image-text-to-text
tags:
- gguf
---
|
|
## ibm-granite-docling-258M-GGUF |
|
|
|
|
|
This is the GGUF version of the [ibm-granite/granite-docling-258M](https://huggingface.co/ibm-granite/granite-docling-258M) model, converted to GGUF format for efficient local inference with [llama.cpp](https://github.com/ggml-org/llama.cpp) and compatible runtimes.
|
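The GGUF weights can also be fetched ahead of time instead of letting the runtime download them on first use. A minimal sketch using the `huggingface_hub` Python package (the local directory name is just an example):

```python
# Download all files from the GGUF repo to a local directory.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="danchev/ibm-granite-docling-258M-GGUF",
    local_dir="granite-docling-gguf",  # example path; any writable directory works
)
print(local_dir)
```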
|
|
|
|
### Model Information |
|
|
--- |
|
|
- **Model Name**: granite-docling-258M |
|
|
- **Base Model**: ibm-granite/granite-docling-258M |
|
|
- **License**: Apache-2.0 |
|
|
- **Pipeline Tag**: image-text-to-text |
|
|
- **Language**: English |
|
|
- **Model Size**: 258M parameters
|
|
- **Model Format**: GGUF |
|
|
|
|
|
### Description |
|
|
--- |
|
|
Granite Docling is a compact vision-language model for document conversion. Given a document page image, it emits DocTags, a structured markup that captures the page's content and layout, including tables, lists, formulas, and code, and that loads directly into Docling documents for export to formats such as Markdown or HTML. The model is optimized for document-centric tasks and handles a variety of document formats and layouts.
|
|
|
|
|
### Usage |
|
|
--- |
|
|
|
|
|
> **Important:** Make sure to include the `--special` flag when serving the model so that its DocTags special tokens are returned in the output.
>
> Example:
>
> ```bash
> llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special
> ```
|
|
|
|
|
<details> |
|
|
<summary>Usage with Docker</summary>
|
|
|
|
|
```bash
# Serve the model with Docker
docker run --rm -p 8080:8080 ghcr.io/danchev/llama.cpp:docling \
  --server \
  -hf danchev/ibm-granite-docling-258M-GGUF \
  --host 0.0.0.0 \
  --port 8080 \
  --special  # required for docling to work
```
|
|
</details> |
|
|
|
|
|
<details> |
|
|
<summary> Usage with llama.cpp </summary> |
|
|
|
|
|
```bash
# Build llama.cpp from source
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release -j $(nproc)

# Serve the model
./build/bin/llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special
```
|
|
|
|
|
</details> |
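
Whichever way the server is started, a quick reachability check before running the scripts below can look like this sketch; `llama-server` exposes a `/health` endpoint on its HTTP API:

```python
# Minimal smoke test for the local llama-server instance.
import requests

resp = requests.get("http://127.0.0.1:8080/health", timeout=5)
print(resp.status_code, resp.text)  # expect HTTP 200 once the model has loaded
```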
|
|
|
|
|
|
|
|
#### Docling Examples

---

#### 📄 PDF Conversion Using `docling`
|
|
|
|
|
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling>=2.58.0", "requests>=2.32.5"]
# ///

import tempfile

import requests
from pydantic import AnyUrl

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.pipeline_options_vlm_model import ApiVlmOptions, ResponseFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline

pdf_url = "https://arxiv.org/pdf/1706.03762.pdf"

# Download the PDF into a temporary file; everything that uses f.name stays
# inside the with-block so the file is not deleted before conversion.
with tempfile.NamedTemporaryFile(suffix=".pdf") as f:
    f.write(requests.get(pdf_url).content)
    f.flush()

    # Route the VLM pipeline through the local llama-server endpoint.
    pipeline_options = VlmPipelineOptions(
        enable_remote_services=True,
        vlm_options=ApiVlmOptions(
            url=AnyUrl("http://127.0.0.1:8080/v1/chat/completions"),
            params={"model": "danchev/ibm-granite-docling-258M-GGUF"},
            prompt="Convert this page to docling.",
            temperature=0.0,
            response_format=ResponseFormat.DOCTAGS,
        ),
    )

    doc_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options, pipeline_cls=VlmPipeline
            )
        }
    )

    print(doc_converter.convert(f.name).document.export_to_markdown())
```
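
Both this script and the image-conversion script below carry inline script metadata (the `# /// script` block), so they can be executed directly with `uv run <script>.py`, or made executable and run as-is thanks to the `uv` shebang line.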
|
|
|
|
|
#### 🖼️ Image Conversion Using `docling-core`
|
|
|
|
|
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling-core>=2.49.0", "Pillow>=11.3.0", "requests>=2.32.5"]
# ///

import base64
from io import BytesIO
from pathlib import Path

import requests
from docling_core.types.doc.base import ImageRefMode
from docling_core.types.doc.document import DoclingDocument, DocTagsDocument
from PIL import Image

# Fetch the sample page image and base64-encode it for the chat completions API.
img_url = "https://ibm.biz/docling-page-with-list"
img_bytes = requests.get(img_url).content
img_b64 = base64.b64encode(img_bytes).decode()

# Ask the local llama-server to convert the page image into DocTags.
doctags = requests.post(
    url="http://localhost:8080/v1/chat/completions",
    json={
        "model": "danchev/ibm-granite-docling-258M-GGUF",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
                    {"type": "text", "text": "Convert this page to docling."},
                ],
            }
        ],
    },
).json()["choices"][0]["message"]["content"]

# Pair the DocTags output with the source image to build a DoclingDocument.
doc = DoclingDocument.load_from_doctags(
    doctag_document=DocTagsDocument.from_doctags_and_image_pairs(
        doctags=[doctags], images=[Image.open(BytesIO(img_bytes))]
    )
)

print(doc.export_to_markdown())

# Also save an HTML version with the page image embedded inline.
doc.save_as_html(Path("output.html"), image_mode=ImageRefMode.EMBEDDED)
```
|
|
|