'7-Segment Display OCR

I'm building an iOS application (take a picture and run OCR on it) using Tesseract (an OCR library) and it is working very well with well written numbers and characters (using usual fonts).

The problem I am having is that if I try it on a 7-Segment Display, it gives very very bad results.

So my question is: Does anyone know how I can approach this problem? Is there a way for Tesseract to recognize these characters?



Solution 1:[1]

I haven't tried OCRing 7-Segment Display, but I suspect that the problem might be caused by the characters not being connected components. Tesseract does not handle disconnected fonts well from my experience.

Simple erosion (image preprocessing) might help by connecting segments, but you would have to test it and play with kernel size to prevent too much distortion.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tomasz Niedabylski