granite-vision-3.3-2b-chart2csv-preview

Model Summary:

Chart2CSV is a specialized vision-language model fine-tuned for the accurate extraction of tabular data from charts and visualizations. Built on top of ibm-granite/granite-vision-3.3-2b, it produces machine-readable CSV outputs with improved numeric fidelity compared to general-purpose VLMs. The model is trained using code-guided synthetic chart data following the ChartGen methodology, which strengthens factual grounding and reduces hallucination in the Chart-to-CSV task.

Intended Use:

The model is intended for workflows that require precise extraction of chart data into structured tables, including but not limited to:

  • Integration within Docling-based document parsing pipelines for structured data enrichment (Chart data extraction with Docling)
  • Enabling multimodal document understanding systems that jointly reason over charts, text, and tables
  • Direct extraction of structured data from charts embedded in reports, presentations, and PDFs
  • Providing structured inputs for downstream workflow automation and analytics systems
  • Supporting large-scale ingestion of financial or industry documents where raw tabular data is unavailable

The model's outputs are designed to be CSV-ready when used with the recommended Chart2CSV extraction prompt, enabling seamless downstream analysis with data tools like pandas, SQL, or spreadsheet software.
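As a minimal sketch of that downstream step, the snippet below parses a response into a header and row records using only Python's standard library. `parse_chart_csv` is a hypothetical helper, and `sample_output` is a made-up string for illustration, not real model output. The fence-stripping step assumes the response might arrive wrapped in a Markdown code fence, which may not happen in practice.

```python
import csv
import io

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the snippet fence-safe


def parse_chart_csv(text):
    """Parse CSV text into (header, records); records is a list of dicts."""
    # Assumption: drop Markdown fence lines if the response happens to contain them.
    lines = [ln for ln in text.strip().splitlines()
             if not ln.strip().startswith(FENCE)]
    reader = csv.reader(io.StringIO("\n".join(lines)), skipinitialspace=True)
    rows = [row for row in reader if row]  # skip empty lines
    if not rows:
        return [], []
    header, body = rows[0], rows[1:]
    records = [dict(zip(header, row)) for row in body]
    return header, records


# Made-up example text, NOT real model output.
sample_output = """Year,Revenue,Profit
2021,10.5,2.1
2022,12.0,3.4
"""

header, records = parse_chart_csv(sample_output)
print(header)
print(records[0])
```

Values are kept as strings here. With pandas installed, `pandas.read_csv(io.StringIO(text))` is a reasonable alternative that also infers numeric column types.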

Generation:

The following example shows how to run granite-vision-3.3-2b-chart2csv-preview with the recommended Chart-to-CSV prompt.

from transformers import AutoProcessor, AutoModelForVision2Seq
from huggingface_hub import hf_hub_download
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_path = "ibm-granite/granite-vision-3.3-2b-chart2csv-preview"
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)

# prepare image and text prompt, using the appropriate prompt template

img_path = hf_hub_download(repo_id=model_path, filename='example.jpg')

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": img_path},
            {"type": "text", "text": "Convert the information in this chart into a data table in CSV format."},
        ],
    },
]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)


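# autoregressively complete the prompt
output = model.generate(**inputs, max_new_tokens=500)

# generate() returns the prompt tokens followed by the new tokens;
# decode only the newly generated part so the printed text omits the prompt
generated = output[0][inputs["input_ids"].shape[1]:]
print(processor.decode(generated, skip_special_tokens=True))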

Evaluations:

We compare the performance of granite-vision-3.3-2b-chart2csv-preview with other vision-language models (VLMs) on an internal Chart-to-CSV benchmark. In this task, models generate a CSV table from a chart image, and outputs are compared to ground-truth data using an LLM-based judge that measures similarity while ignoring minor formatting differences.

Model                                      Chart2CSV
chartgemma                                 37.1
granite-vision-3.3-2b                      53.8
Qwen3-VL-4B-Instruct                       58.1
InternVL3-8B                               56.1
Pixtral-12B-2409                           49.1
Mistral-Small-3.1-24B-Instruct-2503        53.2
Qwen2-VL-72B-Instruct                      50.3
GPT-4o                                     46.7
granite-vision-3.3-2b-chart2csv-preview    70.3

Model Architecture:

The granite-vision-3.3-2b-chart2csv-preview model uses the same architecture as granite-vision-3.3-2b.

Infrastructure:

We train granite-vision-3.3-2b-chart2csv-preview using IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over thousands of GPUs.

Responsible Use and Limitations:

Some use cases for Chart-to-CSV extraction systems can trigger certain risks and operational considerations, including but not limited to: numeric inaccuracies, propagation of errors into downstream analytics pipelines, and misinterpretation of ambiguous chart styles. Although Chart2CSV is optimized to reduce hallucination and improve numeric fidelity, it may still produce inaccurate or incomplete tables in cases of low-resolution images, complex layouts, overlapping visual elements, or unconventional chart designs. Since Chart2CSV extracts structured numeric data, incorrect outputs may impact financial, scientific, or industry analysis workflows if not validated. We recommend human verification or automated consistency checks in high-stakes applications. Chart2CSV is optimized specifically for chart-to-CSV extraction and may not perform reliably on general vision-language tasks outside this scope.
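One possible form of an automated consistency check is sketched below. It flags structural problems only (ragged rows, duplicate column names, non-numeric cells in columns expected to be numeric) and cannot tell whether the extracted values match the chart. `check_csv_consistency` is a hypothetical helper, the sample strings are made up, and treating every column except the first as numeric is an assumption that will not hold for all charts.

```python
import csv
import io


def check_csv_consistency(text, numeric_columns=None):
    """Return a list of human-readable problems found in CSV text."""
    rows = [r for r in csv.reader(io.StringIO(text.strip())) if r]
    if len(rows) < 2:
        return ["expected a header row and at least one data row"]
    header, body = rows[0], rows[1:]
    problems = []
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")
    # Assumption: the first column holds labels, all others hold numbers.
    if numeric_columns is None:
        numeric_columns = header[1:]
    for i, row in enumerate(body, start=1):
        if len(row) != len(header):
            problems.append(
                f"row {i}: expected {len(header)} cells, got {len(row)}")
            continue
        for name, cell in zip(header, row):
            if name in numeric_columns:
                try:
                    # Tolerate thousands separators and percent signs.
                    float(cell.replace(",", "").replace("%", ""))
                except ValueError:
                    problems.append(
                        f"row {i}: non-numeric value {cell!r} in column {name!r}")
    return problems


# Made-up examples, NOT real model output.
good = "Year,Revenue\n2021,10.5\n2022,12.0\n"
bad = "Year,Revenue\n2021,10.5\n2022\n2023,n/a\n"

print(check_csv_consistency(good))
for problem in check_csv_consistency(bad):
    print(problem)
```

An empty result means only that the table is well formed. Comparing the values against the chart still needs human review or an independent source.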

