Extracting handwritten text values from an image using the Tesseract-OCR library in Python

I am looking to extract handwritten text from an uploaded image. I have tried the Tesseract OCR library, both through the Java API and from Python with pytesseract. The data path and the language data (eng.traineddata) are already set.

ITesseract _tesseract = new Tesseract();
String result = _tesseract.doOCR(file);

The input is a black-and-white form filled in by a person, so the image contains both printed and handwritten content. The output string contains only part of the content of the input image, file.jpg. Most of the data is missing from the result string.

I would like to get all values, printed as well as handwritten, from the input form.

P.S. I have tried this with Python as well:

import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C://Program Files/Tesseract-OCR/tesseract.exe'
print(pytesseract.image_to_string(r'sample1.jpg'))

Both approaches return results that are either misspelled or missing data.
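To narrow down which parts of the form are being misread, one option is to look at Tesseract's per-word confidence instead of only the final string. The sketch below is a rough illustration, not a fix. It assumes pytesseract's image_to_data with Output.DICT, which returns parallel lists under keys such as 'text' and 'conf'. The helper names low_confidence_words and inspect_image and the threshold of 60 are made up for this example.

```python
def low_confidence_words(ocr_data, min_conf=60.0):
    """Return (word, confidence) pairs whose confidence is below min_conf.

    ocr_data is assumed to be a dict shaped like the result of
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT),
    with parallel lists under the keys 'text' and 'conf'.
    """
    flagged = []
    for word, conf in zip(ocr_data["text"], ocr_data["conf"]):
        conf = float(conf)  # some versions return strings, others numbers
        # A confidence of -1 marks layout entries (blocks, lines) with no word.
        if word.strip() and 0 <= conf < min_conf:
            flagged.append((word, conf))
    return flagged


def inspect_image(path):
    """Run OCR on the image at path and list the words Tesseract is unsure of."""
    # Imported here so the helper above stays usable without Tesseract installed.
    import pytesseract

    data = pytesseract.image_to_data(path, output_type=pytesseract.Output.DICT)
    return low_confidence_words(data)
```

Calling inspect_image(r'sample1.jpg') on the form would list the words Tesseract itself is unsure about. If those are mostly the handwritten fields, that suggests the handwriting is the weak spot rather than the setup.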



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
