The Palos Publishing Company


Build a file format detector

Here is a Python-based file format detector that identifies common file types by inspecting their file signatures (also known as magic numbers). The script reads the first 16 bytes of a file and checks whether they start with any known signature.

```python
import os

# Known file signatures (magic numbers), matched against the start of the file
FILE_SIGNATURES = {
    b'\xFF\xD8\xFF': 'JPEG image',
    b'\x89PNG\r\n\x1a\n': 'PNG image',
    b'GIF87a': 'GIF image',
    b'GIF89a': 'GIF image',
    b'%PDF-': 'PDF document',
    b'\x25\x21': 'PostScript document',  # '%!'
    b'PK\x03\x04': 'ZIP archive',
    b'\x1F\x8B\x08': 'GZIP archive',
    b'Rar!\x1A\x07\x00': 'RAR archive',
    b'\x42\x4D': 'BMP image',  # 'BM'
    b'\x49\x44\x33': 'MP3 audio',  # 'ID3' tag; MP3 files without one are not matched
    b'OggS': 'OGG audio/video',
    b'fLaC': 'FLAC audio',
    b'\x00\x00\x00\x18ftyp': 'MP4 video',
    b'\x00\x00\x00\x14ftyp': 'MP4 video',
    b'\x00\x00\x01\xBA': 'MPEG video',
    b'\x00\x00\x01\xB3': 'MPEG video',
    b'\x52\x49\x46\x46': 'AVI or WAV (needs more context)',  # 'RIFF'
    b'\x4D\x5A': 'Windows executable (EXE)',  # 'MZ'
}


def detect_file_format(file_path):
    """Return a description of the file type based on its leading bytes."""
    try:
        with open(file_path, 'rb') as f:
            header = f.read(16)
    except FileNotFoundError:
        return "File not found"
    except OSError as e:
        return f"I/O error: {e}"

    # Report the first signature that the header starts with
    for signature, filetype in FILE_SIGNATURES.items():
        if header.startswith(signature):
            return filetype
    return "Unknown file type"


# Example usage
if __name__ == "__main__":
    path = input("Enter file path: ").strip()
    if os.path.isfile(path):
        result = detect_file_format(path)
        print(f"Detected file type: {result}")
    else:
        print("Invalid file path.")
```
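The RIFF entry above is labeled "needs more context", and the two MP4 entries only match files whose first box happens to be 0x18 or 0x14 bytes long. One possible refinement, sketched here rather than taken from the original script, is to look at fixed offsets inside the same 16-byte header. The helper name `refine_container` and its return labels are hypothetical. It assumes the usual layouts: a RIFF file carries a four-byte form type at offset 8, and an ISO base media file (MP4 and related formats) carries `ftyp` at offset 4.

```python
from typing import Optional


def refine_container(header: bytes) -> Optional[str]:
    """Return a more specific label for RIFF and ftyp headers, else None.

    Hypothetical helper: the name and labels are illustrative only.
    """
    # RIFF layout: b'RIFF', 4-byte chunk size, then a 4-byte form type.
    if len(header) >= 12 and header[:4] == b'RIFF':
        form_type = header[8:12]
        known_forms = {
            b'WAVE': 'WAV audio',
            b'AVI ': 'AVI video',
            b'WEBP': 'WebP image',
        }
        return known_forms.get(form_type, 'RIFF container (unrecognized form type)')

    # ISO base media files start with a box: 4-byte size, then b'ftyp'.
    # The size varies between files, so only bytes 4..8 are compared.
    if len(header) >= 8 and header[4:8] == b'ftyp':
        return 'ISO base media file (MP4 family)'

    return None
```

A caller could try `refine_container(header)` first and fall back to the prefix loop when it returns `None`. Other formats such as QuickTime and HEIF also use an `ftyp` box, which is why the label stays generic.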

Features:

  • Detects common formats: JPEG, PNG, PDF, MP3, ZIP, MP4, EXE, and more.

  • Uses file headers (magic numbers) for identification, not file extensions.

  • Lightweight and easily extendable with more signatures.
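To illustrate the second point, the short sketch below writes PNG header bytes into a file with a misleading `.txt` extension and checks the content instead of the name. The helper `looks_like_png` and the file name `notes.txt` are made up for this demonstration, and the file is only a header followed by padding, not a valid image.

```python
import os
import tempfile

PNG_MAGIC = b'\x89PNG\r\n\x1a\n'


def looks_like_png(file_path):
    """Compare the leading bytes of the file with the PNG signature."""
    with open(file_path, 'rb') as f:
        return f.read(len(PNG_MAGIC)) == PNG_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    # A file whose name suggests text but whose content starts like a PNG
    misnamed = os.path.join(tmp, 'notes.txt')
    with open(misnamed, 'wb') as f:
        f.write(PNG_MAGIC + b'\x00' * 8)

    ext = os.path.splitext(misnamed)[1]
    is_png = looks_like_png(misnamed)

print(ext, is_png)  # prints: .txt True
```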

To Extend:

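You can add more file signatures by expanding the FILE_SIGNATURES dictionary: each key is the byte pattern a file starts with, and each value is the description to report. Because the script reads only 16 bytes, a signature longer than that requires a larger read size.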
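As a rough sketch of such an extension, the block below defines a few additional well-known signatures and a matcher that tries longer signatures first, so that a short pattern cannot shadow a more specific one that shares its prefix. The names `EXTRA_SIGNATURES` and `match_signature` are hypothetical and not part of the script above.

```python
# Hypothetical extra entries; each key is the byte pattern at offset 0
EXTRA_SIGNATURES = {
    b'7z\xBC\xAF\x27\x1C': '7-Zip archive',
    b'\x7FELF': 'ELF executable',
    b'SQLite format 3\x00': 'SQLite database',  # exactly 16 bytes long
}


def match_signature(header, signatures):
    """Return the description of the longest signature the header starts with."""
    # Sorting by length (longest first) makes the result independent of
    # the order in which entries were added to the dictionary.
    for signature in sorted(signatures, key=len, reverse=True):
        if header.startswith(signature):
            return signatures[signature]
    return 'Unknown file type'


print(match_signature(b'\x7FELF\x02\x01\x01\x00', EXTRA_SIGNATURES))
```

Calling `FILE_SIGNATURES.update(EXTRA_SIGNATURES)` would merge these entries into the detector's dictionary.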

