How can I retain the background after dilating text in an image?

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations=4)


cv2.imshow('dilate', dilate)

cv2.waitKey()

I am trying to mask the text elements in an image and return an image with just the remaining portions. I have applied thresholding and dilation, but how can I retain the background?

Image after thresholding and dilating


Original image:




Solution 1:[1]

Here is a simple approach:

Using the inverted dilated image cv2.bitwise_not(dilate), create a mask over the original image.

res = cv2.bitwise_and(image, image, mask=cv2.bitwise_not(dilate))


In the resulting image (res), all text regions and their boundaries are masked out, so they appear black.
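To make the effect of that masked bitwise_and call concrete, here is a minimal NumPy-only sketch on a tiny synthetic array. The 2x3 "image" and the hand-written dilate mask are stand-ins invented for illustration, not the data from the question, and the NumPy expression is only meant to mirror what the OpenCV call produces (kept pixels where the mask is non-zero, zeros elsewhere).

```python
import numpy as np

# Tiny stand-in for a BGR image: 2x3 pixels, all light gray (200, 200, 200).
image = np.full((2, 3, 3), 200, dtype=np.uint8)

# Stand-in for the dilated threshold image: 255 marks text, 0 marks background.
dilate = np.array([[0, 255, 0],
                   [0, 255, 255]], dtype=np.uint8)

# Invert the mask (for uint8 this flips 0 <-> 255, like cv2.bitwise_not).
inverted = np.bitwise_not(dilate)

# Keep pixels where the inverted mask is non-zero and zero out the rest,
# mirroring cv2.bitwise_and(image, image, mask=inverted).
keep = inverted != 0
res = np.where(keep[..., None], image, 0).astype(np.uint8)

print(res[0, 0], res[0, 1])
```

Background pixels keep their original color, while every pixel under the dilated text mask becomes black.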

Now replace those masked-out regions with the background color of your original image. To do that, first record where the text regions are in the boolean array mask_ind, then set the pixels at those positions to the background color, taken here from the top-left pixel image[0,0]. This assumes the top-left pixel belongs to the background and that the background is a single uniform color.

mask_ind = (dilate == 255)
res[mask_ind] = image[0,0]
cv2.imshow('res', res)
cv2.waitKey()
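The fill step relies on image[0,0] being a background pixel. As a hedged illustration of that assumption, the NumPy-only sketch below builds a small synthetic image whose corner pixel is deliberately not background and compares the corner-pixel fill with one possible alternative: the per-channel median of the non-text pixels. The synthetic image, the names res_corner, res_median and background, and the median idea itself are assumptions added for illustration and are not part of the original answer.

```python
import numpy as np

# 4x4 stand-in image with a light background, (B, G, R) = (230, 230, 230).
image = np.full((4, 4, 3), 230, dtype=np.uint8)
image[1:3, 1:3] = (20, 20, 20)   # dark "text" block in the middle
image[0, 0] = (0, 0, 255)        # hypothetical stray red mark in the corner

# Stand-in for the dilated threshold image: 255 where the text is.
dilate = np.zeros((4, 4), dtype=np.uint8)
dilate[1:3, 1:3] = 255

# Boolean index of the text pixels, as in the answer.
mask_ind = (dilate == 255)

# Fill as in the answer: use the top-left pixel as the background color.
# (A copy of the image is filled here; the text pixels end up the same
# as when filling the bitwise_and result.)
res_corner = image.copy()
res_corner[mask_ind] = image[0, 0]

# Assumed alternative: estimate the background as the per-channel median
# of all pixels outside the text mask.
background = np.median(image[~mask_ind], axis=0).astype(np.uint8)
res_median = image.copy()
res_median[mask_ind] = background

print(res_corner[1, 1], res_median[1, 1])
```

With a clean, uniform background both fills give the same result; the median only matters when the corner pixel happens not to be representative.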

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jeru Luke