The Palos Publishing Company

Create a receipt scanner with OCR

Creating a receipt scanner with OCR combines image processing and optical character recognition to extract text from receipt images. Below is a step-by-step guide to building a basic receipt scanner in Python using two popular libraries: OpenCV for image processing and Tesseract for OCR.


Step 1: Install Required Libraries

bash
pip install opencv-python pytesseract Pillow numpy

Additionally, you need to install the Tesseract-OCR engine itself on your machine: on Debian/Ubuntu, run sudo apt install tesseract-ocr; on macOS, brew install tesseract; on Windows, download the installer from the tesseract-ocr project's releases page. If the executable is not on your PATH, point pytesseract at it as shown in the code below.


Step 2: Basic Workflow Overview

  1. Image Acquisition: Capture or load a receipt image.

  2. Preprocessing: Improve image quality (grayscale, thresholding, noise removal).

  3. Text Extraction: Use Tesseract OCR to convert image to text.

  4. Postprocessing: Clean and structure the extracted data.


Step 3: Sample Code

python
import cv2
import numpy as np
import pytesseract
from PIL import Image

# Path to the Tesseract executable (update this if needed)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # Windows example

def preprocess_image(image_path):
    # Load image in grayscale
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Resize to improve OCR accuracy (optional)
    scale_percent = 150
    width = int(image.shape[1] * scale_percent / 100)
    height = int(image.shape[0] * scale_percent / 100)
    dim = (width, height)
    resized = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)

    # Apply Gaussian blur to reduce noise
    blurred = cv2.GaussianBlur(resized, (5, 5), 0)

    # Apply adaptive thresholding to get a binary image (white text on black)
    thresh = cv2.adaptiveThreshold(blurred, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 11, 2)

    # Morphological opening to remove small noise specks
    kernel = np.ones((2, 2), np.uint8)
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
    return opening

def extract_text_from_image(preprocessed_image):
    # Invert so the text is dark on a light background, which Tesseract handles best
    inverted = cv2.bitwise_not(preprocessed_image)

    # Convert back to a PIL Image for pytesseract
    pil_img = Image.fromarray(inverted)

    # --oem 3: default (LSTM) engine; --psm 6: assume a single uniform block of text
    custom_config = r'--oem 3 --psm 6'
    text = pytesseract.image_to_string(pil_img, config=custom_config)
    return text

def main():
    image_path = 'receipt.jpg'  # Path to your receipt image
    processed_image = preprocess_image(image_path)
    text = extract_text_from_image(processed_image)
    print("Extracted Text:")
    print(text)

if __name__ == "__main__":
    main()

Explanation:

  • Preprocessing improves the clarity and contrast of the receipt, helping Tesseract perform better.

  • Adaptive thresholding turns the image into a high-contrast black-and-white image.

  • Morphological opening removes small noise pixels.

  • --oem 3 selects the default engine mode, which uses the LSTM-based engine when available.

  • --psm 6 treats the image as a block of text.


Step 4: Improving Accuracy

  • Crop the receipt area from a larger photo using contour detection or manual cropping.

  • Use Tesseract’s --psm modes tailored to the receipt layout (e.g., --psm 4 for column-based text).

  • Train Tesseract with custom data for specific receipt fonts.

  • Use alternative OCR tools such as EasyOCR or the Google Cloud Vision API for better results on difficult images.


Step 5: Structuring Extracted Data

After getting raw text, parse the text lines to extract key details such as:

  • Store name

  • Date and time

  • Items purchased and prices

  • Total amount

This can be done with regular expressions and string matching.
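A minimal sketch of such a parser, using only the standard library. The line patterns, the assumption that the store name is the first line, and the sample text are all illustrative; real receipts vary enough that the regexes typically need adapting per vendor:

```python
import re

def parse_receipt(text):
    """Parse raw OCR text into a small structured dict (illustrative patterns)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    result = {"store": lines[0] if lines else None,  # assume store name is first line
              "date": None, "items": [], "total": None}

    date_re = re.compile(r'\b(\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4})\b')
    item_re = re.compile(r'^(.*?)\s+(\d+\.\d{2})$')  # "NAME  PRICE" lines

    for line in lines[1:]:
        m = date_re.search(line)
        if m and result["date"] is None:
            result["date"] = m.group(1)
            continue
        m = item_re.match(line)
        if m:
            name, price = m.group(1), float(m.group(2))
            if name.upper().startswith("TOTAL"):
                result["total"] = price
            else:
                result["items"].append((name, price))
    return result

sample = """ACME MART
2024-03-07 13:42
MILK 3.49
BREAD 2.19
TOTAL 5.68"""

parsed = parse_receipt(sample)
print(parsed["store"], parsed["total"])  # ACME MART 5.68
```

Since OCR output is noisy, a production parser would also want fuzzy matching for keywords like TOTAL (e.g. tolerating "T0TAL") and a sanity check that item prices sum to the reported total.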


