benchmark with DeepSeek-OCR included

#29
by jzhang533 - opened
PaddlePaddle org

OCR has been a hot topic in the community recently, and it has heated up even more since DeepSeek-OCR joined the field.
The PaddleOCR-VL team has evaluated DeepSeek-OCR (Gundam-M setting) and included it in the OmniDocBench v1.5 benchmark. We hope this helps the community better understand diverse OCR approaches and contributes to advancing the field.

[Image: benchmark with DeepSeek-OCR included]


Here is the HTML-format version of the benchmark image above, as recognized by PaddleOCR-VL:

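| Model Type | Methods | Parameters | Overall↑ | Text Edit↓ | Formula CDM↑ | Table TEDS↑ | Table TEDS-S↑ | Reading Order Edit↓ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pipeline Tools | Marker-1.8.2 [45] | - | 71.30 | 0.206 | 76.66 | 57.88 | 71.17 | 0.250 |
| | Mineru2-pipeline [14] | - | 75.51 | 0.209 | 76.55 | 70.90 | 79.11 | 0.225 |
| | PP-StructureV3 [10] | - | 86.73 | 0.073 | 85.79 | 81.68 | 89.48 | 0.073 |
| General VLMs | GPT-4o [7] | - | 75.02 | 0.217 | 79.70 | 67.07 | 76.09 | 0.148 |
| | InternVL3-76B [46] | 76B | 80.33 | 0.131 | 83.42 | 70.64 | 77.74 | 0.113 |
| | InternVL3.5-241B [47] | 241B | 82.67 | 0.142 | 87.23 | 75.00 | 81.28 | 0.125 |
| | Qwen2.5-VL-72B [24] | 72B | 87.02 | 0.094 | 88.27 | 82.15 | 86.22 | 0.102 |
| | Gemini-2.5 Pro [48] | - | 88.03 | 0.075 | 85.82 | 85.71 | 90.29 | 0.097 |
| Specialized VLMs | Dolphin [3] | 322M | 74.67 | 0.125 | 67.85 | 68.70 | 77.77 | 0.124 |
| | OCRFlux-3B [49] | 3B | 74.82 | 0.193 | 68.03 | 75.75 | 80.23 | 0.202 |
| | Mistral OCR [50] | - | 78.83 | 0.164 | 82.84 | 70.03 | 78.04 | 0.144 |
| | POINTS-Reader [4] | 3B | 80.98 | 0.134 | 79.20 | 77.13 | 81.66 | 0.145 |
| | olmOCR-7B [12] | 7B | 81.79 | 0.096 | 86.04 | 68.92 | 74.77 | 0.121 |
| | MinerU2-VLM [14] | 0.9B | 85.56 | 0.078 | 80.95 | 83.54 | 87.66 | 0.086 |
| | Nanonets-OCR-s [51] | 3B | 85.59 | 0.093 | 85.90 | 80.14 | 85.57 | 0.108 |
| | DeepSeek-OCR-Gundam-M | 3B | 86.46 | 0.081 | 89.45 | 78.02 | 81.55 | 0.093 |
| | MonkeyOCR-pro-1.2B [1] | 1.9B | 86.96 | 0.084 | 85.02 | 84.24 | 89.02 | 0.130 |
| | MonkeyOCR-3B [1] | 3.7B | 87.13 | 0.075 | 87.45 | 81.39 | 85.92 | 0.129 |
| | dots.ocr [52] | 3B | 88.41 | 0.048 | 83.22 | 86.78 | 90.62 | 0.053 |
| | MonkeyOCR-pro-3B [1] | 3.7B | 88.85 | 0.075 | 87.25 | 86.78 | 90.63 | 0.128 |
| | MinerU2.5 [2] | 1.2B | 90.67 | 0.047 | 88.46 | 88.22 | 92.38 | 0.044 |
| | PaddleOCR-VL | 0.9B | 92.56 | 0.035 | 91.43 | 89.76 | 93.52 | 0.043 |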
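
For readers wondering how the Overall column relates to the others: the thread does not state the formula, so the sketch below assumes the usual OmniDocBench v1.5 definition, i.e. the mean of (1 - Text Edit) x 100, Formula CDM, and Table TEDS. The function name `overall_score` is made up for illustration, and the sample rows are copied from the benchmark above.

```python
# Assumption (not stated in this thread): Overall is the mean of three
# 0-100 scores, with the text edit distance flipped so that higher is better.
#   Overall = ((1 - TextEdit) * 100 + FormulaCDM + TableTEDS) / 3

def overall_score(text_edit: float, formula_cdm: float, table_teds: float) -> float:
    """Recompute an Overall score from the three component columns."""
    return ((1.0 - text_edit) * 100.0 + formula_cdm + table_teds) / 3.0

# (Text Edit, Formula CDM, Table TEDS, reported Overall), copied from the benchmark.
rows = {
    "PaddleOCR-VL": (0.035, 91.43, 89.76, 92.56),
    "DeepSeek-OCR-Gundam-M": (0.081, 89.45, 78.02, 86.46),
    "MinerU2.5": (0.047, 88.46, 88.22, 90.67),
}

for name, (edit, cdm, teds, reported) in rows.items():
    recomputed = overall_score(edit, cdm, teds)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

Under this assumption the recomputed values land within a few hundredths of the reported ones. The small gaps would be expected because the published component columns are already rounded.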
