Why is Tesseract unable to detect the single digit in that image?

marco

I have this image, and I'm trying to read it with Tesseract:

[Image: two lines of text, a single digit 1 above the line LOW: 56]

My code looks like this:

pytesseract.image_to_string(im)

But all I get is LOW: 56, so Tesseract fails to read the 1 in the first line. I've also tried to specify a whitelist of digits only, like

pytesseract.image_to_string(im, config="tessedit_char_whitelist=0123456789.")

and to process the image with an erosion, but nothing works. Any suggestions?

HansHirse

The Tesseract documentation page "Improving the quality of the output" is your "holy scripture" when working with Tesseract. In particular, the page segmentation mode should always be set explicitly. Here (as in most cases), I'd opt for --psm 6:

Assume a single uniform block of text.
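If you leave the mode unset, Tesseract falls back to mode 3 (fully automatic page segmentation), which is what the plain image_to_string(im) call in the question used. As a rough sketch for comparing modes on your own image: PSM_MODES and compare_psm are names I made up for illustration, and the descriptions are the ones printed by tesseract --help-psm.

```python
# A few page segmentation modes, with the descriptions from `tesseract --help-psm`.
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD. (Default)",
    6: "Assume a single uniform block of text.",
    7: "Treat the image as a single text line.",
    10: "Treat the image as a single character.",
    11: "Sparse text. Find as much text as possible in no particular order.",
}


def compare_psm(image, modes=PSM_MODES):
    """Run OCR once per mode and return {psm: recognized text}.

    Hypothetical helper: `image` is anything pytesseract accepts,
    e.g. the array returned by cv2.imread.
    """
    # Imported here so the table above is usable without Tesseract installed.
    import pytesseract

    results = {}
    for psm in modes:
        text = pytesseract.image_to_string(image, config=f"--psm {psm}")
        results[psm] = text.replace("\f", "").strip()
    return results


# Usage (assumes OpenCV is installed and the image file exists):
#   import cv2
#   for psm, text in compare_psm(cv2.imread("gBrcd.png")).items():
#       print(psm, repr(text))
```

Which mode works best depends on the layout of the image, so it is worth checking a handful rather than trusting one blindly.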

Even without further preprocessing of your image, you already get the desired result:

import cv2
import pytesseract

image = cv2.imread('gBrcd.png')
text = pytesseract.image_to_string(image, config='--psm 6')
print(text.replace('\f', ''))
# 1
# LOW: 56
----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19041-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.2
Tesseract:     5.0.0-alpha.20201127
----------------------------------------
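A side note on the whitelist attempt from the question: Tesseract config variables are passed on the command line as -c NAME=VALUE, so a bare tessedit_char_whitelist=... in the config string is not applied as a variable. Below is a minimal sketch of building the config string; build_config and read_text are hypothetical helper names, not part of pytesseract, and I have not checked whether a whitelist changes anything for this particular image.

```python
def build_config(psm=6, whitelist=None):
    """Build the string passed to pytesseract's `config` argument."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # Config variables need the -c prefix. Keep the whitelist free of
        # spaces, since the config string is split on whitespace.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_text(image_path, psm=6, whitelist=None):
    """Hypothetical wrapper: load an image with OpenCV and run Tesseract on it."""
    # Imported here so build_config works without OpenCV/Tesseract installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    text = pytesseract.image_to_string(image, config=build_config(psm, whitelist))
    return text.replace("\f", "").strip()


print(build_config())
print(build_config(whitelist="0123456789"))
```

Keep in mind that a digits-only whitelist would also stop Tesseract from returning the letters in LOW:, so for this image setting the page segmentation mode alone is probably the better choice.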
