donut-base-finetuned-sroie-v2

This model is a fine-tuned version of naver-clova-ix/donut-base on the sam749/SROIE-donut dataset.

Use

import re

import torch
from transformers import DonutProcessor, VisionEncoderDecoderModel

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if torch.cuda.is_available() else torch.float32

processor = DonutProcessor.from_pretrained("sam749/donut-base-finetuned-sroie-v2")
# `torch_dtype` is the keyword in the Transformers 4.48 line listed under Framework versions
model = VisionEncoderDecoderModel.from_pretrained("sam749/donut-base-finetuned-sroie-v2", torch_dtype=dtype)
model.to(device)

def generate(image):
    # prepare encoder inputs
    pixel_values = processor(image, return_tensors="pt").pixel_values
    
    # generate answer
    outputs = model.generate(
        pixel_values.to(device, dtype=dtype),  # match the model's dtype (float16 on GPU)
        use_cache=True,
        num_beams=1,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        return_dict_in_generate=True,
    )
    
    # postprocess
    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(processor.tokenizer.pad_token, "")
    sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()  # remove first task start token
    
    return processor.token2json(sequence)
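
The postprocessing step above can be illustrated without loading the model. The following is a minimal sketch, not the library's `token2json` implementation: `simple_token2json` is a hypothetical stand-in that handles only flat `<s_key>value</s_key>` pairs (no nesting or lists). The task start token `<s_task>`, the field names, and the `</s>` / `<pad>` special tokens are assumptions made for illustration.

```python
import re


def clean_sequence(raw: str, eos_token: str = "</s>", pad_token: str = "<pad>") -> str:
    """Strip special tokens, then drop only the first tag (the task start token)."""
    sequence = raw.replace(eos_token, "").replace(pad_token, "")
    # count=1 matters: without it every field tag would be removed as well
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def simple_token2json(sequence: str) -> dict:
    """Parse flat <s_key>value</s_key> pairs into a dict (simplified stand-in)."""
    pattern = r"<s_(?P<key>[^>]+)>(?P<value>.*?)</s_(?P=key)>"
    return {
        match.group("key"): match.group("value").strip()
        for match in re.finditer(pattern, sequence, flags=re.DOTALL)
    }


# Hypothetical decoder output; the real task token and fields may differ.
raw_output = (
    "<s_task><s_company>ACME SDN BHD</s_company>"
    "<s_total>12.50</s_total></s><pad><pad>"
)
cleaned = clean_sequence(raw_output)
parsed = simple_token2json(cleaned)
print(parsed)
```

In practice `processor.token2json` should be preferred, since it also handles nested fields and repeated items.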

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
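
To make the `linear` scheduler setting concrete, here is a rough sketch of how the learning rate would decay from 2e-05 to zero over training. It assumes no warmup steps and no gradient accumulation, neither of which is stated above, and the number of training examples is a hypothetical placeholder. `linear_lr` is an illustrative helper, not a Transformers API.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Learning rate after `step` optimizer updates under linear decay, no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Hypothetical run size; the actual dataset size is not given in this card.
num_examples = 600       # assumption for illustration only
train_batch_size = 1     # from the hyperparameters above
num_epochs = 3           # from the hyperparameters above
total_steps = (num_examples // train_batch_size) * num_epochs

# Print the rate at the start, midpoint, and end of training.
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```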

Training results

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

0.2B params, stored as Safetensors (tensor types: I64, F16)