Improving the reliability of Pytesseract when reading text

Posted by JonathanW in Dev

Jonathan

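I am trying to read relatively clear digits from a screenshot, but I am having trouble getting pytesseract to read the text correctly. I have the following screenshot: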

I know that the score (2-0) and the clock (1:42) will always be in exactly the same position.

Here is the code I currently use to read the clock time and the orange score:

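import cv2
import imutils
import numpy as np
import pytesseract

# BGR bounds for the orange score digits
lower_orange = np.array([0, 90, 200], dtype = "uint8")
upper_orange = np.array([70, 160, 255], dtype = "uint8")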

    #Isolate scoreboard location on a 1080p pic
    clock = input[70:120, 920:1000]
    scoreboard = input[70:150, 800:1120]

    #greyscale
    roi_gray = cv2.cvtColor(clock, cv2.COLOR_BGR2GRAY)

    config = ("-l eng -c tessedit_char_whitelist=0123456789: --oem 1 --psm 8")
    time = pytesseract.image_to_string(roi_gray, config=config)
    print("time is " + time)

    # find the colors within the specified boundaries and apply
    # the mask
    mask_orange = cv2.inRange(scoreboard, lower_orange, upper_orange)

    # find contours in the thresholded image, then initialize the
    # list of digit locations
    cnts = cv2.findContours(mask_orange.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    locs = []

    for (i, c) in enumerate(cnts):
        # compute the bounding box of the contour, then use the
        # bounding box coordinates to derive the aspect ratio
        (x, y, w, h) = cv2.boundingRect(c)
        ar = w / float(h)

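        # the score digit is a fixed size of about 25 x 35 px, so a minimum
        # bounding-box area of 300 filters out noise with room to spare
        if w*h > 300:
            # pad the box by 5 px, clamping at 0 so a negative start index
            # cannot wrap around and produce an empty crop
            orange_score_img = mask_orange[max(y-5, 0):y+h+5, max(x-5, 0):x+w+5]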
            orange_score_img = cv2.GaussianBlur(orange_score_img, (5, 5), 0)

            config = ("-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10")
            orange_score = pytesseract.image_to_string(orange_score_img, config=config)
            print("orange_score is " + orange_score)

Here is the output:

time is 1:42
orange_score is

This is orange_score_img, after masking everything within my lower and upper orange bounds and applying a Gaussian blur.
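The crop behind this image pads the contour's bounding box by 5 pixels on each side. A minimal sketch of a clamped version follows, since a negative start index in a NumPy slice wraps around and can silently yield an empty array. The helper name `padded_crop` and the synthetic mask are assumptions for illustration, not part of the original code.

```python
import numpy as np

def padded_crop(image, x, y, w, h, pad=5):
    """Crop the box (x, y, w, h) plus `pad` pixels per side, clamped to the image."""
    height, width = image.shape[:2]
    top = max(y - pad, 0)
    left = max(x - pad, 0)
    bottom = min(y + h + pad, height)
    right = min(x + w + pad, width)
    return image[top:bottom, left:right]

# Synthetic 80 x 320 mask with a 25 x 35 blob touching the top-left corner.
mask = np.zeros((80, 320), dtype=np.uint8)
mask[2:37, 1:26] = 255

crop = padded_crop(mask, x=1, y=2, w=25, h=35)
naive = mask[2 - 5:2 + 35 + 5, 1 - 5:1 + 25 + 5]  # negative starts wrap around
```

Here `crop` keeps the whole blob, while `naive` comes back empty because both slices start from negative indices.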

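At this point, however, even though I have configured pytesseract to look for a single character and restricted the whitelist, I still cannot get it to read the digit correctly. Am I missing some additional post-processing that would help pytesseract read this digit as a 2?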
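For reference, the two config strings in the question differ only in the whitelist and the page segmentation mode: `--psm 8` treats the image as a single word and `--psm 10` as a single character. A small sketch of building them from one place is shown below; the helper `build_tesseract_config` is a hypothetical name, not a pytesseract function.

```python
def build_tesseract_config(whitelist, psm, oem=1, lang="eng"):
    """Assemble a config string for pytesseract.image_to_string (hypothetical helper)."""
    return f"-l {lang} --oem {oem} --psm {psm} -c tessedit_char_whitelist={whitelist}"

# Clock: digits and a colon, read as a single word.
clock_config = build_tesseract_config("0123456789:", psm=8)
# Score: one digit from 0 to 5, read as a single character.
score_config = build_tesseract_config("012345", psm=10)
```

Either string can be passed as the `config` argument exactly as in the question's code.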

Jonathan

Following @fmw42's suggestion, I tried some morphological transformations. Thickening the digit seems to do the trick!

kernel = np.ones((5,5),np.uint8)
orange_score_img = cv2.dilate(orange_score_img,kernel,iterations=1)

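Edit: I realized that the real answer is that pytesseract does much better with black text on a white background than with white text on a black background! Once I inverted the colors, it read perfectly: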

orange_score_img = cv2.bitwise_not(orange_score_img)

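I hope this helps people who are just starting out with pytesseract! Trying to tweak the image to fit all of my cases was very frustrating, and knowing that black text on a white background works better would have saved me hours...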

Edited on 2021-08-05