How to find and crop words into individual images with Python OpenCV?
I have a binary image of words, and I want to crop it so that each character ends up in its own image. The output should be separate images of k, 7, 2, f, 5 and m. I tried using OpenCV in Python, but for some reason I'm not able to extract them. Drawing a bounding box around each character would also be good enough.
Solution 1:[1]
Here's a simple approach:
- Convert to grayscale
- Otsu's threshold
- Find contours, sort contours from left-to-right, and filter using contour area
- Extract ROI
After Otsu's thresholding produces a binary image, we sort the contours from left to right using imutils.contours.sort_contours(). This ensures that when we iterate through the contours, the characters come out in reading order. In addition, we filter with a minimum contour area to remove small noise. Each detected character is then outlined with a bounding box.
We can extract each character using NumPy slicing and save each ROI as its own image.
To swap foreground and background in the saved crops, invert the ROI:
ROI = 255 - image[y:y+h, x:x+w]
import cv2
from imutils import contours

# Load the image, convert it to grayscale, and binarize with Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]

# Find external contours; findContours returns 2 or 3 values depending on the OpenCV version
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts, _ = contours.sort_contours(cnts, method="left-to-right")

# Save each sufficiently large contour as an inverted ROI and draw its bounding box
ROI_number = 0
for c in cnts:
    area = cv2.contourArea(c)
    if area > 10:
        x, y, w, h = cv2.boundingRect(c)
        ROI = 255 - image[y:y+h, x:x+w]
        cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 1)
        ROI_number += 1

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
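If imutils is not available, the left-to-right ordering can probably be approximated in plain Python by sorting on the x-coordinate of each bounding box. The following is only a minimal sketch: the helper name `order_boxes` and the sample boxes are hypothetical, the boxes are assumed to be (x, y, w, h) tuples in the format cv2.boundingRect returns, and bounding-box area is used as a stand-in for cv2.contourArea.

```python
def order_boxes(boxes, min_area=10):
    """Drop small boxes, then sort the rest left to right.

    boxes: iterable of (x, y, w, h) tuples, the format cv2.boundingRect returns.
    min_area: boxes whose w * h is not greater than this are treated as noise
              (an assumption; the answer above filters on contour area instead).
    """
    kept = [b for b in boxes if b[2] * b[3] > min_area]
    return sorted(kept, key=lambda b: b[0])


# Hypothetical boxes in arbitrary order, including one small speck of noise
sample = [(120, 8, 30, 40), (5, 10, 28, 38), (70, 40, 2, 2), (60, 9, 31, 41)]
ordered = order_boxes(sample)
print(ordered)
```

With real contours, the input could be built as `[cv2.boundingRect(c) for c in cnts]`, and each returned box sliced out of the image as in the loop above.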
Solution 2:[2]
If your image is mostly black, and the characters are nicely spaced as in your example, you can simply look for vertical areas that consist of only black space. You can make this a little more robust by doing an edge detection first. However, if the kerning is such that the characters overlap, or there is noise, then the problem is incredibly difficult, and you'll need more help than a simple StackOverflow answer will afford you.
Here is one way to do edge detection, and then find the vertical spacings:
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread("ZrnKr.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)

# A column is "occupied" if it contains at least one edge pixel
vertical_sum = np.sum(edges, axis=0)
vertical_sum = vertical_sum != 0

# Columns where occupancy flips mark the left and right boundaries of characters
changes = np.logical_xor(vertical_sum[1:], vertical_sum[:-1])
change_pts = np.nonzero(changes)[0]

plt.imshow(img)
for change in change_pts:
    plt.axvline(change + 1)
plt.show()
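The code above only plots the boundaries. To get from there to actual crops, one option is to pair the boundaries into (start, end) column ranges and slice the image with them. The sketch below uses NumPy only and a small synthetic array instead of the real image; the helper name `column_spans` is hypothetical, and it assumes the input is a 2D array in which background pixels are exactly 0 (for example the `edges` array, or a thresholded image).

```python
import numpy as np


def column_spans(binary):
    """Return (start, end) column ranges of consecutive non-empty columns.

    `end` is exclusive, so binary[:, start:end] is one character-wide strip.
    """
    occupied = np.any(binary != 0, axis=0)
    # Pad with False on both sides so spans touching the image border are closed
    padded = np.concatenate(([False], occupied, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.nonzero(diff == 1)[0]   # empty -> occupied transitions
    ends = np.nonzero(diff == -1)[0]    # occupied -> empty transitions
    return list(zip(starts.tolist(), ends.tolist()))


# Synthetic stand-in for a binary image: two white blobs on a black background
img = np.zeros((5, 12), dtype=np.uint8)
img[1:4, 2:4] = 255
img[0:5, 7:10] = 255

spans = column_spans(img)
crops = [img[:, start:end] for start, end in spans]
print(spans)  # [(2, 4), (7, 10)]
```

Padding the occupancy vector before differencing is a design choice that keeps starts and ends paired even when a character touches the left or right edge of the image. As with the plotting version, this only works when characters are separated by at least one fully empty column.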
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Him |