How to detect digits in an image using Tesseract 5?

Static

I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to detect numbers in an image as shown below, but Tesseract returned wrong results. How can I get the correct result?

My environment:

  • Windows 11 22H2
  • WSL2 Ubuntu 22.04.1 LTS
  • tesseract 5.3.1-20-g58b7

I ran Tesseract like this:

tesseract hoge.jpg output -l eng

and output.txt contains:

Fb¥
&/0

Here is hoge.jpg.

[image: hoge.jpg]

Thank you in advance for your help. I'm a Japanese student, so my English may not be very good. If anything is unclear, please edit this post to make it more readable.

Hermann12

You will never get good results from a bad picture. I experimented a bit and came up with this:

import subprocess
import cv2
import pytesseract

# Image manipulation
# Commands https://imagemagick.org/script/convert.php
mag_img = r'D:\Programme\ImageMagic\magick.exe'
con_bw = r"D:\Programme\ImageMagic\convert.exe" 

in_file = r'ZZ_Numbers.jpg'
out_file = r'ZZ_Numbers_bw.png'

# Play with black and white and rotate for better results
process = subprocess.run([con_bw, in_file, "-resize", "70%", "-threshold", "60%", "-rotate", "-17", "-brightness-contrast", "-15x30", out_file], check=True)

# Text processing
pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread(out_file)

# Parameters: see the Tesseract documentation; config variables must be passed with -c
custom_config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789'

tex = pytesseract.image_to_string(img, config=custom_config)
print(tex)

with open("cartootn.txt", 'w') as f:
    f.write(tex)

cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
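
For the WSL setup in the question, where Tesseract is run from the shell rather than through pytesseract, the same options can be passed on the command line. The sketch below only assembles the argument list. `build_tesseract_cmd` is a hypothetical helper name, and the `--psm` value and the whitelist are taken from the answer above, not tuned for hoge.jpg.

```python
import shlex


def build_tesseract_cmd(image, out_base, lang="eng", psm=11, whitelist="0123456789"):
    """Return the tesseract CLI arguments as a list suitable for subprocess.run.

    Hypothetical helper for illustration; the default values are assumptions.
    """
    return [
        "tesseract", image, out_base,
        "-l", lang,
        "--psm", str(psm),
        # Config variables are introduced with -c on the command line.
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


cmd = build_tesseract_cmd("hoge.jpg", "output")
# Print the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the command. As in the question, Tesseract would then write its result to output.txt.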

Output: (image not included)
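
Even with a whitelist, the recognised string can contain blank lines and stray whitespace. A minimal post-processing sketch, assuming the OCR text is already available as a Python string (`extract_digit_groups` is a hypothetical name, not part of pytesseract):

```python
import re


def extract_digit_groups(ocr_text):
    """Return every run of consecutive ASCII digits found in the OCR text."""
    return re.findall(r"[0-9]+", ocr_text)


# Made-up sample string for illustration, not real Tesseract output.
sample = "12 34\n\n5 6\n"
print(extract_digit_groups(sample))
```

This only cleans up the text. It cannot recover digits that Tesseract misread, so the image preprocessing still matters most.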
