Handling file compression and decompression is a common task in programming, and Python makes it straightforward with built-in modules to zip and unzip files efficiently. Whether you want to archive multiple files into a single compressed package or extract files from an existing zip archive, Python’s tools provide flexible solutions to get the job done.
Understanding ZIP Files
A ZIP file is an archive format that compresses one or more files into a single container, reducing their overall size and making file transfers easier. It supports lossless compression, preserving the original file data. Python’s zipfile module allows both creating and extracting ZIP archives, with support for reading, writing, and appending ZIP files.
Creating ZIP Files in Python
To zip files, you can use the zipfile.ZipFile class. You open a new ZIP archive in write mode and add files one by one.
Example: Zipping Multiple Files
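A minimal sketch of such an example might look like the following. The function name zip_files and its parameters are hypothetical, chosen only for illustration.

```python
import os
import zipfile


def zip_files(file_paths, archive_path):
    """Write each file in file_paths into a new ZIP archive at archive_path."""
    # 'w' creates the archive, replacing any existing file of the same name.
    with zipfile.ZipFile(archive_path, "w") as zipf:
        for file in file_paths:
            # arcname is the name stored in the archive; basename drops
            # the directory part so only the file name is kept.
            zipf.write(file, arcname=os.path.basename(file))
```

It could be called as zip_files(["file1.txt", "docs/file2.txt"], "files.zip"), where the file names are placeholders for files that exist on disk.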
- The 'w' mode opens the archive for writing, creating a new ZIP file or overwriting an existing one.
- zipf.write(file, arcname) adds each file. arcname defines the name stored inside the ZIP archive.
- Using os.path.basename strips directory paths, storing only the file name.
Adding Files to an Existing ZIP Archive
To append files without overwriting:
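One way to sketch this is to open the archive in 'a' mode. The helper name append_to_zip is an assumption made for this example.

```python
import os
import zipfile


def append_to_zip(archive_path, file_path):
    """Add one more file to a ZIP archive, keeping its existing members."""
    # 'a' keeps the members already in the archive and appends new ones;
    # if archive_path does not exist yet, a new archive is created.
    with zipfile.ZipFile(archive_path, "a") as zipf:
        zipf.write(file_path, arcname=os.path.basename(file_path))
```

Appending a file whose name is already in the archive does not replace the old entry: zipfile emits a warning and stores a second member with the same name, so it is worth checking namelist() first if that matters.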
Compressing an Entire Directory
To zip an entire folder with subdirectories:
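A possible sketch, assuming the archive is written somewhere outside the folder being compressed (otherwise the walk could pick up the archive itself). The name zip_directory is hypothetical.

```python
import os
import zipfile


def zip_directory(folder_path, archive_path):
    """Compress every file under folder_path into archive_path."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, _dirs, files in os.walk(folder_path):
            for name in files:
                full_path = os.path.join(root, name)
                # Store the path relative to folder_path so the archive
                # does not embed the folder's location on disk.
                arcname = os.path.relpath(full_path, start=folder_path)
                zipf.write(full_path, arcname=arcname)
```

Because only files are written, empty subdirectories are not recorded in the archive with this approach.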
- os.walk() traverses all subdirectories.
- os.path.relpath() stores files with relative paths inside the archive.
- ZIP_DEFLATED enables compression.
Extracting Files from a ZIP Archive
Unzipping files is just as simple. Open the ZIP file in read mode and extract all or specific files.
Extract all files:
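As a rough sketch, with extract_all being a hypothetical wrapper name:

```python
import zipfile


def extract_all(archive_path, dest_dir):
    """Extract every member of archive_path into dest_dir."""
    # 'r' opens the archive read-only.
    with zipfile.ZipFile(archive_path, "r") as zipf:
        zipf.extractall(dest_dir)
```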
This extracts every file into the specified folder, creating it if necessary.
Extracting Specific Files
To extract one file at a time:
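A minimal sketch using ZipFile.extract. The helper name extract_one is an assumption for illustration.

```python
import zipfile


def extract_one(archive_path, member_name, dest_dir):
    """Extract a single member and return the path it was written to."""
    with zipfile.ZipFile(archive_path, "r") as zipf:
        # extract() raises KeyError if member_name is not in the archive.
        return zipf.extract(member_name, path=dest_dir)
```

The member name must match the name stored in the archive, including any directory prefix such as "sub/b.txt".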
Listing Contents of a ZIP Archive
Before extraction, you may want to see what’s inside:
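For instance, a small helper (the name list_contents is hypothetical) could return the member names without extracting anything:

```python
import zipfile


def list_contents(archive_path):
    """Return the names of all members stored in archive_path."""
    with zipfile.ZipFile(archive_path, "r") as zipf:
        # Names come back in the order the members were added.
        return zipf.namelist()
```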
namelist() returns a list of the names of all members (files and any stored directory entries) inside the ZIP archive.
Handling Password-Protected ZIP Files
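Python’s zipfile module can extract files from password-protected ZIPs that use the traditional ZIP encryption scheme, but it cannot create encrypted ZIP files. AES-encrypted archives produced by some third-party tools are not supported.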
The password must be passed as bytes.
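A sketch of how the password might be supplied, assuming it starts out as a text string. The function name extract_encrypted is hypothetical.

```python
import zipfile


def extract_encrypted(archive_path, dest_dir, password):
    """Extract an archive, supplying password (a str) for encrypted members."""
    with zipfile.ZipFile(archive_path, "r") as zipf:
        # pwd must be bytes, so encode the text password first.
        zipf.extractall(dest_dir, pwd=password.encode("utf-8"))
```

If the password is wrong for an encrypted member, zipfile raises RuntimeError, so callers may want to catch that and report it.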
Performance and Compression Types
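The default compression is ZIP_STORED (no compression). To enable compression, use ZIP_DEFLATED, which relies on the zlib module. Other methods like ZIP_BZIP2 and ZIP_LZMA often produce smaller archives at the cost of speed; they require Python 3.3+ and the bz2 and lzma modules respectively, and not every unzip tool can read them.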
Example:
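One way to compare the methods is to write the same data with each and look at the resulting archive size. This is only a sketch: compressed_size is a hypothetical helper, and the sample data is made up and highly repetitive, so real files will compress differently.

```python
import os
import tempfile
import zipfile


def compressed_size(data, method):
    """Return the size in bytes of a ZIP archive holding data under method."""
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = os.path.join(tmp, "sample.zip")
        with zipfile.ZipFile(archive_path, "w", compression=method) as zipf:
            zipf.writestr("sample.txt", data)
        return os.path.getsize(archive_path)


if __name__ == "__main__":
    sample = b"repetitive text " * 10_000
    for name in ("ZIP_STORED", "ZIP_DEFLATED", "ZIP_BZIP2", "ZIP_LZMA"):
        print(name, compressed_size(sample, getattr(zipfile, name)))
```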
Using shutil for Simple Zip Operations
For quick zipping without detailed control, shutil offers a high-level API:
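A short sketch built on shutil.make_archive. The wrapper name make_backup and its default argument are assumptions for this example.

```python
import shutil


def make_backup(folder, base_name="backup"):
    """Zip the contents of folder into <base_name>.zip and return its path."""
    # Pass the name without an extension; make_archive appends ".zip".
    return shutil.make_archive(base_name, "zip", root_dir=folder)
```

Calling make_backup("folder_to_zip") writes the archive to the current working directory.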
This creates backup.zip containing everything from the folder_to_zip directory.
To unzip:
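The reverse operation can be sketched with shutil.unpack_archive. The wrapper name unpack_backup is hypothetical.

```python
import shutil


def unpack_backup(archive_path, dest_dir):
    """Unpack archive_path into dest_dir, inferring the format from its extension."""
    shutil.unpack_archive(archive_path, dest_dir)
```

For example, unpack_backup("backup.zip", "restored") would place the archive's contents under a restored directory (a placeholder name).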
Summary
- Use zipfile.ZipFile for detailed control over creating and extracting ZIP archives.
- Use modes 'w', 'a', and 'r' for writing, appending, and reading respectively.
- Compress entire directories by walking folder structures.
- Extract files either fully or selectively.
- For quick tasks, shutil.make_archive and shutil.unpack_archive simplify zipping/unzipping.
- Python supports extracting password-protected ZIP files but not creating them.
- Always handle paths carefully to maintain archive structure or simplify extraction.
Mastering these techniques allows efficient file storage, transfer, and management within your Python projects.