'How to remove noise artifacts from an image for OCR with Python OpenCV?
I have subsets of images that contains digits. Each subset is read by Tesseract for OCR. Unfortunately for some images the cropping from the original image isn't optimal.
Hence some artifacts/remains at the top and bottom of the image and hamper Tesseract to recognize characters on the image. Then I would like to get rid of these artifacts and get to a similar result:
First I considered a simple approach: I set the first row of pixels as the reference: if an artifact was found along the x-axis (i.e., a white pixel if the image is binarized), I removed it along the y-axis until the next black pixel. Code for this approach is the one below:
import cv2
inp = cv2.imread("testing_file.tif")
inp = cv2.cvtColor(inp, cv2.COLOR_BGR2GRAY)
_,inp = cv2.threshold(inp, 150, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
ax = inp.shape[1]
ay = inp.shape[0]
out = inp.copy()
for i in range(ax):
j = 0
while j in range(ay):
if out[j,i] == 255:
out[j,i] = 0
else:
break
j+=1
out = cv2.bitwise_not(out)
cv2.imwrite('output.png',out)
But the result isn't good at all:
Then I stumbled across the flood_fill function from scipy (here) but found out it was too much time consuming and still not efficient. A similar question was asked on SO here but didn't help so much. Maybe a k-nearest neighbor approach could be considered? I also found out that methods that consist in merging neighbors pixels under some criteria were called growing methods, among which the single linkage is the most common (here).
What would you recommend to remove the upper and lower artifacts?
Solution 1:[1]
Here's a simple approach:
- Convert image to grayscale
- Otsu's threshold to obtain binary image
- Cerate special horizontal kernel and dilate
- Detect horizontal lines, sort for largest contour, and draw onto mask
- Bitwise-and
After converting to grayscale, we Otsu's threshold to get a binary image
# Read in image, convert to grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Next we create a long horizontal kernel and dilate to connect the numbers together
# Create special horizontal kernel and dilate
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (70,1))
dilate = cv2.dilate(thresh, horizontal_kernel, iterations=1)
From here we detect horizontal lines and sort for the largest contour. The idea is that the largest contour will be the middle section of the numbers where the numbers are all "complete". Any smaller contours will be partial or cut off numbers so we filter them out here. We draw this largest contour onto a mask
# Detect horizontal lines, sort for largest contour, and draw on mask
mask = np.zeros(image.shape, dtype=np.uint8)
detected_lines = cv2.morphologyEx(dilate, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
break
Now that we have the outline of the desired numbers, we simply bitwise-and with our original image and color the background white to get our result
# Bitwise-and to get result and color background white
mask = cv2.cvtColor(mask,cv2.COLOR_BGR2GRAY)
result = cv2.bitwise_and(image,image,mask=mask)
result[mask==0] = (255,255,255)
Full code for completeness
import cv2
import numpy as np
# Read in image, convert to grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Create special horizontal kernel and dilate
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (70,1))
dilate = cv2.dilate(thresh, horizontal_kernel, iterations=1)
# Detect horizontal lines, sort for largest contour, and draw on mask
mask = np.zeros(image.shape, dtype=np.uint8)
detected_lines = cv2.morphologyEx(dilate, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
break
# Bitwise-and to get result and color background white
mask = cv2.cvtColor(mask,cv2.COLOR_BGR2GRAY)
result = cv2.bitwise_and(image,image,mask=mask)
result[mask==0] = (255,255,255)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('result', result)
cv2.waitKey()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | nathancy |