---
license: apache-2.0
language:
- en
base_model:
- ibm-granite/granite-docling-258M
pipeline_tag: image-text-to-text
tags:
- gguf
---
|
|
## ibm-granite-docling-258M-GGUF |
|
|
|
|
|
This is the GGUF version of the [ibm-granite/granite-docling-258M](https://huggingface.co/ibm-granite/granite-docling-258M) model, converted to GGUF format for efficient local inference with [llama.cpp](https://github.com/ggml-org/llama.cpp) and compatible runtimes.
|
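The GGUF weights can also be fetched ahead of time instead of letting the runtime download them on first use. A minimal sketch using the `huggingface_hub` Python package (the local directory name is just an example):

```python
# Download all files from the GGUF repo to a local directory.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="danchev/ibm-granite-docling-258M-GGUF",
    local_dir="granite-docling-gguf",  # example path; any writable directory works
)
print(local_dir)
```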
|
|
|
|
### Model Information |
|
|
--- |
|
|
- **Model Name**: granite-docling-258M |
|
|
- **Base Model**: ibm-granite/granite-docling-258M |
|
|
- **License**: Apache-2.0 |
|
|
- **Pipeline Tag**: image-text-to-text |
|
|
- **Language**: English |
|
|
- **Model Size**: 258M parameters
|
|
- **Model Format**: GGUF |
|
|
|
|
|
### Description |
|
|
--- |
|
|
Granite Docling is a compact vision-language model for document conversion. Given a document page image, it emits DocTags, a structured markup that captures the page's content and layout, including tables, lists, formulas, and code, and that loads directly into Docling documents for export to formats such as Markdown or HTML. The model is optimized for document-centric tasks and handles a variety of document formats and layouts.
|
|
|
|
|
### Usage |
|
|
--- |
|
|
|
|
|
> **Important:** Make sure to include the `--special` flag when serving the model so that its DocTags special tokens are returned in the output.
>
> Example:
>
> ```bash
> llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special
> ```
|
|
|
|
|
<details> |
|
|
<summary>Usage with Docker</summary>
|
|
|
|
|
```bash
# Serve the model with Docker
docker run --rm -p 8080:8080 ghcr.io/danchev/llama.cpp:docling \
  --server \
  -hf danchev/ibm-granite-docling-258M-GGUF \
  --host 0.0.0.0 \
  --port 8080 \
  --special  # required for docling to work
```
|
|
</details> |
|
|
|
|
|
<details> |
|
|
<summary> Usage with llama.cpp </summary> |
|
|
|
|
|
```bash
# Build llama.cpp from source
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release -j $(nproc)

# Serve the model
./build/bin/llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special
```
|
|
|
|
|
</details> |
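
Whichever way the server is started, a quick reachability check before running the scripts below can look like this sketch; `llama-server` exposes a `/health` endpoint on its HTTP API:

```python
# Minimal smoke test for the local llama-server instance.
import requests

resp = requests.get("http://127.0.0.1:8080/health", timeout=5)
print(resp.status_code, resp.text)  # expect HTTP 200 once the model has loaded
```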
|
|
|
|
|
|
|
|
#### Docling Examples

---

#### 📄 PDF Conversion Using `docling`
|
|
|
|
|
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling>=2.58.0", "requests>=2.32.5"]
# ///

import tempfile

import requests
from pydantic import AnyUrl

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.pipeline_options_vlm_model import ApiVlmOptions, ResponseFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline

pdf_url = "https://arxiv.org/pdf/1706.03762.pdf"

# Download the PDF into a temporary file; everything that uses f.name stays
# inside the with-block so the file is not deleted before conversion.
with tempfile.NamedTemporaryFile(suffix=".pdf") as f:
    f.write(requests.get(pdf_url).content)
    f.flush()

    # Route the VLM pipeline through the local llama-server endpoint.
    pipeline_options = VlmPipelineOptions(
        enable_remote_services=True,
        vlm_options=ApiVlmOptions(
            url=AnyUrl("http://127.0.0.1:8080/v1/chat/completions"),
            params={"model": "danchev/ibm-granite-docling-258M-GGUF"},
            prompt="Convert this page to docling.",
            temperature=0.0,
            response_format=ResponseFormat.DOCTAGS,
        ),
    )

    doc_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options, pipeline_cls=VlmPipeline
            )
        }
    )

    print(doc_converter.convert(f.name).document.export_to_markdown())
```
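
Both this script and the image-conversion script below carry inline script metadata (the `# /// script` block), so they can be executed directly with `uv run <script>.py`, or made executable and run as-is thanks to the `uv` shebang line.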
|
|
|
|
|
#### 🖼️ Image Conversion Using `docling-core`
|
|
|
|
|
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling-core>=2.49.0", "Pillow>=11.3.0", "requests>=2.32.5"]
# ///

import base64
from io import BytesIO
from pathlib import Path

import requests
from docling_core.types.doc.base import ImageRefMode
from docling_core.types.doc.document import DoclingDocument, DocTagsDocument
from PIL import Image

# Fetch the sample page image and base64-encode it for the chat completions API.
img_url = "https://ibm.biz/docling-page-with-list"
img_bytes = requests.get(img_url).content
img_b64 = base64.b64encode(img_bytes).decode()

# Ask the local llama-server to convert the page image into DocTags.
doctags = requests.post(
    url="http://localhost:8080/v1/chat/completions",
    json={
        "model": "danchev/ibm-granite-docling-258M-GGUF",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
                    {"type": "text", "text": "Convert this page to docling."},
                ],
            }
        ],
    },
).json()["choices"][0]["message"]["content"]

# Pair the DocTags output with the source image to build a DoclingDocument.
doc = DoclingDocument.load_from_doctags(
    doctag_document=DocTagsDocument.from_doctags_and_image_pairs(
        doctags=[doctags], images=[Image.open(BytesIO(img_bytes))]
    )
)

print(doc.export_to_markdown())

# Also save an HTML version with the page image embedded inline.
doc.save_as_html(Path("output.html"), image_mode=ImageRefMode.EMBEDDED)
```
|
|
|