This article walks through an example of improving OCR results with image processing, with detailed steps and source code.
**Background**
In many cases text recognition runs into difficulties, for example a non-uniform background, noise, or partially missing text.
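We want to recognize the black text in the image (12-14), but the background is fairly complex and contains other interference. Running Tesseract on the image directly (code below) returns an empty result.

# -*- coding:utf-8 -*-
import pytesseract
from PIL import Image

# Open the image
image = Image.open('0.png')
# Run OCR; lang defaults to English
text = pytesseract.image_to_string(image)
# Print the recognized text
print(text)

Direct recognition fails easily in complex cases like this. Could image processing be used to segment or highlight the part we need before running recognition? This article uses this case as a worked example.

**Detailed implementation steps**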
【1】 Otsu binarization

image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)

【2】 Distance transform + normalization
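To see why the distance transform helps, consider a toy image with one thick blob and one isolated pixel. The sketch below is illustrative only: `distance_to_background` is a hypothetical brute-force helper written in NumPy, not OpenCV's implementation (`cv2.distanceTransform` with a 5x5 mask computes an approximation of the same Euclidean distance). Each foreground pixel receives its distance to the nearest background pixel, so thick strokes get bright centers while thin noise stays dark.

```python
import numpy as np

def distance_to_background(binary):
    """Brute-force Euclidean distance from each foreground pixel
    to the nearest zero pixel (hypothetical helper, small images only)."""
    fg = np.argwhere(binary > 0)   # (row, col) of foreground pixels
    bg = np.argwhere(binary == 0)  # (row, col) of background pixels
    dist = np.zeros(binary.shape, dtype=np.float32)
    if len(fg) == 0 or len(bg) == 0:
        return dist
    # Pairwise coordinate differences, shape (n_fg, n_bg, 2)
    diff = fg[:, None, :] - bg[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    dist[fg[:, 0], fg[:, 1]] = nearest
    return dist

# Toy image: a thick 5x5 blob (a "stroke") and a one-pixel speck (noise)
img = np.zeros((9, 12), dtype=np.uint8)
img[2:7, 1:6] = 255
img[4, 9] = 255

dist = distance_to_background(img)
# Min-max style normalization to 0..255, as in the article's step
norm = (dist / dist.max() * 255).astype("uint8")

print("blob center:", norm[4, 3])  # brightest value in the map
print("speck:", norm[4, 9])        # stays dark, about a third of the peak
```

On the real image, the same idea is applied with OpenCV: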
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)

【3】 Otsu binarization of the distance transform result
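Steps 1 and 3 both rely on `cv2.THRESH_OTSU`, which picks the threshold automatically instead of using the `0` passed as the threshold argument. As a rough illustration of the idea, the sketch below searches for the gray level that maximizes the between-class variance. `otsu_threshold` is a hypothetical helper written for this example, not OpenCV's code, and the value it returns may differ from OpenCV's when several thresholds tie.

```python
import numpy as np

def otsu_threshold(gray):
    """Return a threshold t that maximizes between-class variance
    for a uint8 image; pixels > t form the bright class."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    total_sum = (hist * np.arange(256)).sum()
    best_t, best_var = 0, -1.0
    cum_count, cum_sum = 0.0, 0.0
    for t in range(256):
        cum_count += hist[t]
        cum_sum += hist[t] * t
        if cum_count == 0 or cum_count == total:
            continue  # one class is empty, variance undefined
        w0 = cum_count / total                 # weight of the dark class
        w1 = 1.0 - w0                          # weight of the bright class
        mu0 = cum_sum / cum_count              # mean of the dark class
        mu1 = (total_sum - cum_sum) / (total - cum_count)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy image: 80 dark pixels (30) and 20 bright pixels (200)
gray = np.full((10, 10), 30, dtype=np.uint8)
gray[:2, :] = 200

t = otsu_threshold(gray)
binary = (gray > t).astype(np.uint8) * 255
print("threshold:", t, "bright pixels:", int((binary == 255).sum()))
```

On the real distance map, the article lets OpenCV do this: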
_, dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)

【4】 Morphological opening to filter out noise
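An opening is an erosion followed by a dilation with the same kernel: the erosion deletes anything smaller than the kernel, and the dilation restores the surviving shapes to roughly their original size. The NumPy sketch below shows this on a toy image. The `erode`, `dilate`, and `opening` helpers are hypothetical, use a square kernel, and treat pixels outside the image as background, so they are a simplification and not a replacement for `cv2.morphologyEx`.

```python
import numpy as np

def erode(binary, k):
    """A pixel survives only if its whole k x k neighborhood is set.
    Pixels outside the image are treated as background (zero padding)."""
    h, w = binary.shape
    padded = np.pad(binary, k // 2, mode="constant", constant_values=0)
    out = np.ones((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + h, dx:dx + w] > 0
    return out

def dilate(binary, k):
    """A pixel is set if any pixel in its k x k neighborhood is set."""
    h, w = binary.shape
    padded = np.pad(binary, k // 2, mode="constant", constant_values=0)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w] > 0
    return out

def opening(binary, k=3):
    """Erosion followed by dilation with the same square kernel."""
    return dilate(erode(binary, k), k)

# Toy image: a 5x5 blob plus a one-pixel speck of noise
img = np.zeros((9, 12), dtype=np.uint8)
img[2:7, 1:6] = 255
img[4, 9] = 255

opened = opening(img, k=3)
print("speck kept:", bool(opened[4, 9]))
print("blob pixels:", int(opened.sum()))
```

On the real image, the article uses a 7x7 elliptical kernel: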
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)

【5】 Contour filtering to find the text regions
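The filter in this step is a plain size test on each contour's bounding box. The sketch below isolates that test so it can be tried without an image. `bounding_rect` and `keep_char_sized` are hypothetical helpers, the contours are hand-written corner points, and the default limits simply reuse the 35 and 100 pixel values from the article, which would need tuning for other images.

```python
import numpy as np

def bounding_rect(points):
    """Axis-aligned bounding box (x, y, w, h) of integer (x, y) points,
    with width and height counted inclusively in pixels."""
    pts = np.asarray(points)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return (int(x_min), int(y_min),
            int(x_max - x_min + 1), int(y_max - y_min + 1))

def keep_char_sized(contours, min_w=35, min_h=100):
    """Keep contours whose bounding box is at least min_w x min_h."""
    kept = []
    for c in contours:
        _, _, w, h = bounding_rect(c)
        if w >= min_w and h >= min_h:
            kept.append(c)
    return kept

# Hand-made example contours, given as corner points (x, y)
tall_char = [(10, 10), (59, 10), (59, 139), (10, 139)]   # 50 x 130
small_speck = [(5, 5), (8, 5), (8, 9), (5, 9)]           # 4 x 5
wide_line = [(0, 0), (199, 0), (199, 19), (0, 19)]       # 200 x 20

kept = keep_char_sized([tall_char, small_speck, wide_line])
print(len(kept), bounding_rect(kept[0]))
```

The article's code applies the same size test to the contours found by OpenCV: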
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# Loop over the contours
for c in cnts:
    # Compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    # Keep only contours large enough to be characters
    if w >= 35 and h >= 100:
        chars.append(c)
cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("Chars", black_img)

【6】 Compute the convex hull of the contours to get a mask of the text region
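# Stack the kept contours into one point set and compute its convex hull
chars = np.vstack(chars)
hull = cv2.convexHull(chars)
# Allocate the mask, draw the filled convex hull on it, then enlarge it by dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
# Bitwise AND of the opening image with the mask reveals just the characters
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("Final", final)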
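The masked AND at the end of this step can be pictured with plain NumPy: pixels of the opening image are kept where the mask is non-zero and set to zero everywhere else. The arrays below are made-up toy data, the rectangular mask is only a stand-in for the filled and dilated hull, and `np.where` is used here as an assumed equivalent of `cv2.bitwise_and(opening, opening, mask=mask)` for single-channel images.

```python
import numpy as np

# Toy "opening" result: one character blob and one leftover noise pixel
opening = np.zeros((6, 8), dtype=np.uint8)
opening[1:5, 1:3] = 255   # the character
opening[2, 6] = 255       # noise outside the text region

# Toy mask standing in for the filled, dilated convex hull
mask = np.zeros_like(opening)
mask[0:6, 0:4] = 255

# Keep opening pixels inside the mask, zero everything else
final = np.where(mask > 0, opening, 0).astype(np.uint8)

print("noise pixel after masking:", final[2, 6])
print("character pixels kept:", int((final == 255).sum()))
```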
【7】 Tesseract text recognition

text = pytesseract.image_to_string(final)
# Print the recognized text
print(text)

【8】 Full code:
# WeChat official account: OpenCV与AI深度学习
import cv2
import numpy as np
import imutils
import pytesseract

# Load the image, convert to grayscale, and apply inverted Otsu thresholding
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)

# Distance transform, normalized and scaled to 0..255
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)

# Threshold the distance transform using Otsu's method
_, dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)

# Morphological opening to remove small noise
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)

# Find external contours in the opening image
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# Loop over the contours
for c in cnts:
    # Compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    # Keep only contours large enough to be characters
    if w >= 35 and h >= 100:
        chars.append(c)
cv2.drawContours(black_img, chars, -1, (0, 255, 0), 2)
cv2.imshow("Chars", black_img)

# Stack the kept contours into one point set and compute its convex hull
chars = np.vstack(chars)
hull = cv2.convexHull(chars)
# Allocate the mask, draw the filled convex hull on it, then enlarge it by dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
# Bitwise AND of the opening image with the mask reveals just the characters
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("Final", final)

# Run OCR on the cleaned image and print the recognized text
text = pytesseract.image_to_string(final)
print(text)

cv2.waitKey()
cv2.destroyAllWindows()
**References**
(1) https://pyimagesearch.com/2021/11/22/improving-ocr-results-with-basic-image-processing/
(2) https://stackoverflow.com/questions/33881175/remove-background-noise-from-image-to-make-text-more-clear-for-ocr