Handling file compression and decompression is a common task in programming, and Python makes it straightforward with built-in modules to zip and unzip files efficiently. Whether you want to archive multiple files into a single compressed package or extract files from an existing zip archive, Python’s tools provide flexible solutions to get the job done.
Understanding ZIP Files
A ZIP file is an archive format that compresses one or more files into a single container, reducing their overall size and making file transfers easier. It supports lossless compression, preserving the original file data. Python’s zipfile module allows both creating and extracting ZIP archives, with support for reading, writing, and appending ZIP files.
Creating ZIP Files in Python
To zip files, you can use the zipfile.ZipFile class. You open a new ZIP archive in write mode and add files one by one.
Example: Zipping Multiple Files
-
The
'w'mode opens the archive for writing, creating a new ZIP file or overwriting an existing one. -
zipf.write(file, arcname)adds each file.arcnamedefines the name stored inside the ZIP archive. -
Using
os.path.basenamestrips directory paths, storing only the file name.
Adding Files to an Existing ZIP Archive
To append files without overwriting:
Compressing an Entire Directory
To zip an entire folder with subdirectories:
-
os.walk()traverses all subdirectories. -
os.path.relpath()stores files with relative paths inside the archive. -
ZIP_DEFLATEDenables compression.
Extracting Files from a ZIP Archive
Unzipping files is just as simple. Open the ZIP file in read mode and extract all or specific files.
Extract all files:
This extracts every file into the specified folder, creating it if necessary.
Extracting Specific Files
To extract one file at a time:
Listing Contents of a ZIP Archive
Before extraction, you may want to see what’s inside:
namelist() returns a list of all files inside the ZIP archive.
Handling Password-Protected ZIP Files
Python’s zipfile module supports extraction from password-protected ZIPs (since Python 3.7), but not creating encrypted ZIP files.
The password must be passed as bytes.
Performance and Compression Types
The default compression is ZIP_STORED (no compression). To enable compression, use ZIP_DEFLATED. Other methods like ZIP_BZIP2 and ZIP_LZMA provide higher compression but require Python 3.3+.
Example:
Using shutil for Simple Zip Operations
For quick zipping without detailed control, shutil offers a high-level API:
This creates backup.zip containing everything from the folder_to_zip directory.
To unzip:
Summary
-
Use
zipfile.ZipFilefor detailed control over creating and extracting ZIP archives. -
Use modes
'w','a', and'r'for writing, appending, and reading respectively. -
Compress entire directories by walking folder structures.
-
Extract files either fully or selectively.
-
For quick tasks,
shutil.make_archiveandshutil.unpack_archivesimplify zipping/unzipping. -
Python supports extracting password-protected ZIP files but not creating them.
-
Always handle paths carefully to maintain archive structure or simplify extraction.
Mastering these techniques allows efficient file storage, transfer, and management within your Python projects.