ibm-granite-docling-258M-GGUF

This is the GGUF version of the ibm-granite/granite-docling-258M model, converted for efficient local inference with llama.cpp and compatible runtimes.

Model Information


  • Model Name: granite-docling-258M
  • Base Model: ibm-granite/granite-docling-258M
  • License: Apache-2.0
  • Pipeline Tag: image-text-to-text
  • Language: English
  • Model Size: 258M
  • Model Format: GGUF

Description


Granite Docling is a compact vision-language model for end-to-end document conversion. Given a page image, it emits DocTags, a structured markup that captures layout elements such as text blocks, tables, code, and formulas; the docling toolchain can then render that markup as Markdown, HTML, or other formats. The model is optimized for document-centric tasks and handles a wide variety of document formats and layouts.

Usage


Important: include the --special flag so that special tokens (the DocTags markup) are returned in the output. Example:

llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special
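Once the server is up, it exposes an OpenAI-compatible /v1/chat/completions endpoint. As a minimal sketch (the helper name build_doctags_request and the PNG content type are illustrative assumptions, not part of this model card), the request payload can be built with the standard library alone, inlining the page image as a base64 data URL:

```python
import base64


def build_doctags_request(image_bytes: bytes,
                          prompt: str = "Convert this page to docling.") -> dict:
    """Build an OpenAI-style chat payload with the image inlined as a data URL."""
    img_b64 = base64.b64encode(image_bytes).decode()
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    # The image travels inline as a base64 data URL.
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        # Deterministic decoding suits document conversion.
        "temperature": 0.0,
    }
```

POST the resulting dict as JSON to http://127.0.0.1:8080/v1/chat/completions; the full request/response flow is shown in the docling-core example below.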
Usage with Docker
# Serve the model with Docker
docker run --rm -p 8080:8080 ghcr.io/danchev/llama.cpp:docling \
  --server \
  -hf danchev/ibm-granite-docling-258M-GGUF \
  --host 0.0.0.0 \
  --port 8080 \
  --special  # required for docling to work
Usage with llama.cpp
# Build llama.cpp from source
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release -j $(nproc)

# Serve the model
./build/bin/llama-server -hf danchev/ibm-granite-docling-258M-GGUF --special

Docling Examples


📄 PDF Conversion Using `docling`
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling>=2.58.0", "requests>=2.32.5"]
# ///

import tempfile
import requests
from pydantic import AnyUrl
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.datamodel.pipeline_options_vlm_model import ApiVlmOptions, ResponseFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline

pdf_url = "https://arxiv.org/pdf/1706.03762.pdf"
with tempfile.NamedTemporaryFile(suffix=".pdf") as f:
    f.write(requests.get(pdf_url).content)
    f.flush()

    pipeline_options = VlmPipelineOptions(
        enable_remote_services=True,
        vlm_options=ApiVlmOptions(
            url=AnyUrl("http://127.0.0.1:8080/v1/chat/completions"),
            params={"model": "danchev/ibm-granite-docling-258M-GGUF"},
            prompt="Convert this page to docling.",
            temperature=0.0,
            response_format=ResponseFormat.DOCTAGS,
        ),
    )

    doc_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options, pipeline_cls=VlmPipeline
            )
        }
    )

    print(doc_converter.convert(f.name).document.export_to_markdown())
🖼️ Image Conversion Using `docling-core`
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["docling-core>=2.49.0", "Pillow>=11.3.0", "requests>=2.32.5"]
# ///

import base64
from io import BytesIO
from pathlib import Path

import requests
from docling_core.types.doc.base import ImageRefMode
from docling_core.types.doc.document import DoclingDocument, DocTagsDocument
from PIL import Image

img_url = "https://ibm.biz/docling-page-with-list"
img_bytes = requests.get(img_url).content
img_b64 = base64.b64encode(img_bytes).decode()

doctags = requests.post(
    url="http://localhost:8080/v1/chat/completions",
    json={
        "model": "danchev/ibm-granite-docling-258M-GGUF",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
                    {"type": "text", "text": "Convert this page to docling."},
                ],
            }
        ],
    }
).json()["choices"][0]["message"]["content"]

doc = DoclingDocument.load_from_doctags(
    doctag_document=DocTagsDocument.from_doctags_and_image_pairs(
        doctags=[doctags], images=[Image.open(BytesIO(img_bytes))]),
)

print(doc.export_to_markdown())

doc.save_as_html(Path("output.html"), image_mode=ImageRefMode.EMBEDDED)
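The chat-completions response nests the generated DocTags under choices[0].message.content, as the requests.post call above extracts inline. A small defensive sketch (the helper name extract_content is mine, and the sample response is a placeholder, not real model output):

```python
def extract_content(response: dict) -> str:
    """Pull the generated text out of an OpenAI-style chat-completions response."""
    if "error" in response:
        # OpenAI-compatible servers report failures under an "error" key.
        raise RuntimeError(f"server error: {response['error']}")
    return response["choices"][0]["message"]["content"]


# Illustrative response shape; the content is a placeholder, not real output.
sample = {"choices": [{"message": {"role": "assistant",
                                   "content": "<doctag>...</doctag>"}}]}
print(extract_content(sample))
```

Using the helper instead of chained indexing surfaces a readable error when the server rejects a request (for example, an oversized image) rather than a bare KeyError.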