Pytesseract OCR: config options for these license plate images, and image-quality questions

I am trying to use PyTesseract to extract the text of license plates that I have detected with another library. Below I'll paste the raw cropped images I'm trying to read, then show the code and the (limited) results so far.

Code:

import cv2
import pytesseract

# convert the cropped plate region (RGB) to grayscale
gray = cv2.cvtColor(box, cv2.COLOR_RGB2GRAY)
# upscale 3x so the characters are large enough for Tesseract
gray = cv2.resize(gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

# digits-only whitelist; --psm 8 treats the image as a single word
text2 = pytesseract.image_to_string(gray, config='-c tessedit_char_whitelist=0123456789 --psm 8 --oem 3')
print(gray.shape)
print("text2 :: " + str(text2))

cv2.imshow("gray", gray)
cv2.waitKey(0)

Example results:

[cropped plate image 1]

(102, 270)
text2 ::      # empty, no results

[cropped plate image 2]

(90, 261)
text2 :: 82701   # clearly wrong

[cropped plate image 3]

(135, 246)
text2 ::      # empty, no results

[cropped plate image 4]

(129, 207)
text2 :: 07         # clearly wrong

[cropped plate image 5]

(96, 288)
text2 :: 1369034    # at least it tried?

You get the idea. It mostly recognizes nothing, and when it does output something, it's never correct. I'm wondering whether PyTesseract has inherent limitations given the image quality I'm feeding in; understanding that would help me scope my project better.

For reference, here is a typical full frame. The plate-detection step is doing a pretty decent job; its output is cropped and passed to PyTesseract. Unfortunately, the source appears to be a fairly low-quality stream:

[full frame with detected plate]

How can I better use PyTesseract for this task? And where can I learn about the limitations?
