Category "ocr"

Pytesseract OCR - config options to help for these license plate images, and quality questions

I am trying to use PyTesseract to extract the text of license plates that I have recognized using another library. I will paste below the raw images extracted t

Tesseract returns nothing for Arabic words/letters

I have installed Pytesseract and it's working perfectly on French/English text and also on numbers. But when I try to read any Arabic text/letter it doesn't ret

How to install language in tesseract OCR

I have installed Tesseract OCR and it has only 'eng' and 'osd' in the language list. I need the German language. I tried the following command: brew install tesseract-

Google Cloud Platform - Vertex AI training with custom data format

I need to train a custom OCR model in Vertex AI. My data will have a folder of cropped images, where each image is a line, and a CSV file with 2 columns: image name and text i

Java TESSERACT create byte[] instead of pdf file - tessInstance.createDocuments()

Is it possible to generate with Tess4j the byte[] of a PDF with OCR instead of a physical file? I need to make PDF files searchable via OCR, it works but I woul

Use pytesseract OCR to recognize text from an image

I need to use Pytesseract to extract text from this picture: and the code: from PIL import Image, ImageEnhance, ImageFilter import pytesseract path = 'pic.gif'

Python OpenCV skew correction for OCR

Currently, I am working on an OCR project where I need to read the text off of a label (see example images below). I am running into issues with the image skew

AWS textract-trp package issue - cannot extract key-value pair

I'm using AWS Lambda running on Python 3.8 to run this code example below: import boto3 from trp import Document # Document documentName = "employmentapp.png"

Remove header and footer from pdftotext module in Python

I am using the pdftotext Python package to extract text from a PDF; however, I need to remove headers and footers from the text file to extract only the content. There

Algorithm for straightening tilted document

I'm on a project involving OCR. After detecting each character, I need to combine close characters to create words. To do that I tried to create a priority queu

How to clean images before OCR with Python OpenCV?

I've been trying to clear images for OCR: (the lines) I need to remove these lines to sometimes further process the image and I'm getting pretty close but a

Filtering OCR Result [closed]

I'm working on OCR, which I have working, but now I'm stuck on how to filter the OCR result to move each string into a set of text fields. F

How to remove noise artifacts from an image for OCR with Python OpenCV?

I have subsets of images that contain digits. Each subset is read by Tesseract for OCR. Unfortunately for some images the cropping from the original image isn'