Pytesseract - OCR on image with colored text

I'm trying to use Pytesseract to read some text in an image. However, the text is orange and the background contains both black and white regions. I have tried several options, but so far I'm unable to read the text using Pytesseract. Below is a sample of the image:

Here is the code I have arrived at:

import pytesseract
from PIL import Image,ImageOps
import numpy as np

img = Image.open("OCR.png").convert("L")
img = ImageOps.invert(img)
# img.show()
threshold = 240
table = []
pixelArray = img.load()
for y in range(img.size[1]):  # binarize the image row by row
    List = []
    for x in range(img.size[0]):
        if pixelArray[x,y] < threshold:
            List.append(0)
        else:
            List.append(255)
    table.append(List)

img = Image.fromarray(np.array(table, dtype="uint8")) # load the image from array.
# img.show()

print(pytesseract.image_to_string(img))

The code above results in an all-black image. The text becomes black too, so nothing can be read.
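
One likely explanation, assuming the text is a fairly saturated orange: convert("L") maps orange to a mid-gray, so after inversion both the text and the white background fall below the 240 threshold and are set to 0. A possible way around this is to skip the grayscale step and select pixels by color instead. The sketch below is only an illustration of that idea, not a tested fix for the original image: the helper names isolate_orange_text and ocr_orange_text are made up here, and the channel limits (min_red, green_range, max_blue) are guesses at what "orange" means that would need tuning against the real image.

```python
import numpy as np
from PIL import Image


def isolate_orange_text(img, min_red=180, green_range=(60, 200), max_blue=120):
    """Return a black-on-white "L" image that keeps only orange-ish pixels.

    The default channel limits are illustrative assumptions, not values
    measured from the image in the question.
    """
    rgb = np.asarray(img.convert("RGB"))
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    # A pixel counts as "orange" if red is high, green is moderate, blue is low.
    mask = (
        (r >= min_red)
        & (g >= green_range[0])
        & (g <= green_range[1])
        & (b <= max_blue)
    )

    # Orange pixels become black text (0); everything else becomes white (255).
    binary = np.where(mask, 0, 255).astype(np.uint8)
    return Image.fromarray(binary)


def ocr_orange_text(path):
    """Open the image at `path`, isolate the orange text, and run Tesseract."""
    import pytesseract  # imported here so the masking helper works without it

    return pytesseract.image_to_string(isolate_orange_text(Image.open(path)))
```

With this in place, print(ocr_orange_text("OCR.png")) would replace the thresholding loop in the question. Black text on a white background is used because that is the layout Tesseract generally handles best. If the text has anti-aliased edges, widening the limits slightly may recover more of each glyph.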

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
