The Palos Publishing Company

Build a script to convert invoices to Excel

Here’s a Python script that converts PDF invoices into Excel workbooks. It assumes the invoices are digital PDFs that contain tabular data (item descriptions, prices, totals). It uses pdfplumber to extract the tables from each PDF and openpyxl to write the rows to an .xlsx file, one workbook per invoice.

Make sure to install the required libraries:

bash
pip install pdfplumber openpyxl

Python Script to Convert Invoices to Excel

python
import os

import pdfplumber
from openpyxl import Workbook


def extract_invoice_data(pdf_path):
    """Return every table row found in the PDF as a list of cleaned cell strings."""
    extracted_data = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            tables = page.extract_tables()
            for table in tables:
                for row in table:
                    # Clean the data: strip whitespace, turn empty (None) cells into ""
                    cleaned_row = [cell.strip() if cell else "" for cell in row]
                    extracted_data.append(cleaned_row)
    return extracted_data


def save_to_excel(data, output_path):
    """Write the rows to the active sheet of a new workbook."""
    wb = Workbook()
    ws = wb.active
    for row in data:
        ws.append(row)
    wb.save(output_path)


def convert_invoices_to_excel(pdf_folder, output_folder):
    """Convert every PDF in pdf_folder to an .xlsx file in output_folder."""
    # Create the output folder if it is missing so that saving does not fail
    os.makedirs(output_folder, exist_ok=True)
    for filename in os.listdir(pdf_folder):
        if filename.lower().endswith(".pdf"):
            pdf_path = os.path.join(pdf_folder, filename)
            print(f"Processing: {pdf_path}")
            invoice_data = extract_invoice_data(pdf_path)
            if invoice_data:
                excel_filename = os.path.splitext(filename)[0] + ".xlsx"
                output_path = os.path.join(output_folder, excel_filename)
                save_to_excel(invoice_data, output_path)
                print(f"Saved to: {output_path}")
            else:
                print(f"No table data found in {filename}")


# Set your input/output directories
pdf_folder = "path/to/your/pdf_invoices"
output_folder = "path/to/save/excel_files"

# Run the conversion
convert_invoices_to_excel(pdf_folder, output_folder)
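The script writes every cell as text, so quantities and prices arrive in Excel as strings that cannot be summed directly. The following is a minimal sketch of one way to convert numeric-looking cells before they are written. It is not part of the original script: `to_number` and `convert_row` are hypothetical helper names, and the sketch assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re

# A plain number with an optional sign and decimals. Leading zeros are
# rejected so that identifiers such as "00123" stay as text.
_NUMERIC = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?")


def to_number(cell):
    """Return an int or float for numeric-looking text, else the cell unchanged."""
    # Drop thousands separators, common currency symbols, and whitespace.
    cleaned = re.sub(r"[,$€£\s]", "", cell)
    if _NUMERIC.fullmatch(cleaned):
        return float(cleaned) if "." in cleaned else int(cleaned)
    return cell


def convert_row(row):
    """Apply to_number to every cell of one extracted table row."""
    return [to_number(cell) for cell in row]
```

One way to use it would be to call `ws.append(convert_row(row))` inside `save_to_excel`, since openpyxl stores Python ints and floats as numeric cells. Invoices that use a comma as the decimal separator would need a different cleaning rule.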

Notes:

  • If invoice layouts vary between files, the extract_invoice_data function may need tuning, for example by passing table_settings to page.extract_tables().

  • Scanned PDFs are images with no text layer, so they need OCR first, for example with pytesseract and pdf2image.

  • This works best with digital PDFs that contain selectable text.
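The three notes above can be combined into one fallback chain. The following is a minimal sketch, not part of the original script: `extract_tables_with_fallback`, `ocr_pdf_text`, and the status labels are hypothetical names, and it assumes that pdfplumber's text-alignment strategy is an acceptable second attempt for invoices without ruling lines. The OCR helper also assumes that the Tesseract and Poppler binaries are installed, and it returns plain text rather than table rows.

```python
# Table settings for pdfplumber's text-alignment strategy: row and column
# boundaries are inferred from how words line up, not from drawn lines.
TEXT_STRATEGY = {"vertical_strategy": "text", "horizontal_strategy": "text"}


def extract_tables_with_fallback(page):
    """Return (tables, status) for one pdfplumber page.

    status is a hypothetical label: "lines", "text", "no-table" or "needs-ocr".
    """
    # First attempt: default settings, which follow ruling lines on the page.
    tables = page.extract_tables()
    if tables:
        return tables, "lines"

    # Second attempt: infer the grid from text alignment.
    tables = page.extract_tables(table_settings=TEXT_STRATEGY)
    if tables:
        return tables, "text"

    # No tables. A page with no extractable text at all is probably a scan.
    text = page.extract_text() or ""
    if text.strip():
        return [], "no-table"
    return [], "needs-ocr"


def ocr_pdf_text(pdf_path, dpi=300):
    """OCR every page of a scanned PDF and return one text string per page."""
    # Imported here so the table helper works without the OCR dependencies.
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_string(image) for image in images]
```

The text strategy may report tables on pages that are mostly prose, so rows tagged "text" deserve a spot check. Text returned by `ocr_pdf_text` still has to be split into columns before it can be written to a sheet.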

