The Palos Publishing Company

Design a File Compression System for Interviews

File Compression System Design

A file compression system reduces file size while preserving content: exactly in lossless mode, approximately in lossy mode. The system should compress both individual files and collections of files (directories), choosing lossless or lossy compression according to file type and use case. This design follows object-oriented principles to keep the system scalable, extensible, and maintainable.


1. Requirements

Functional Requirements

  • Compression Algorithms: The system should support common compression algorithms, such as:

    • Lossless: Huffman Encoding, LZW (Lempel-Ziv-Welch), DEFLATE, and Brotli.

    • Lossy: JPEG (for images), MP3 (for audio), and H.264 (for video).

  • Decompression: The system must provide functionality to decompress files and restore them to their original form.

  • Multiple File Support: Users should be able to compress and decompress multiple files or entire directories at once.

  • File Metadata: The system should retain file metadata (e.g., file name, size, creation/modification date).

  • Error Handling: The system should handle cases like file corruption during compression or decompression.
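The lossless and error-handling requirements above can be illustrated with a short sketch. It assumes DEFLATE via Python's standard `zlib` module and stores a CRC-32 of the original bytes next to the payload; the helper names `pack` and `unpack` are hypothetical and not part of the design.

```python
import zlib
from typing import Tuple


def pack(data: bytes) -> Tuple[bytes, int]:
    """Compress data and return (payload, CRC-32 of the original bytes)."""
    return zlib.compress(data, level=6), zlib.crc32(data)


def unpack(payload: bytes, expected_crc: int) -> bytes:
    """Decompress payload and verify it against the stored checksum."""
    try:
        data = zlib.decompress(payload)
    except zlib.error as exc:
        # Truncated or damaged streams surface here
        raise ValueError(f"corrupt compressed data: {exc}") from exc
    if zlib.crc32(data) != expected_crc:
        raise ValueError("checksum mismatch after decompression")
    return data


original = b"to be or not to be, that is the question. " * 50
payload, crc = pack(original)
restored = unpack(payload, crc)  # identical to the original bytes
```

Calling `unpack` on a truncated payload raises `ValueError` instead of returning partial data, which is one way to surface corruption to the caller.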

Non-Functional Requirements

  • Performance: The system should efficiently handle large files and multiple file compressions without significant delays.

  • Scalability: It should scale to handle directories with large numbers of files or files with substantial size.

  • Extensibility: New algorithms or features (such as cloud storage integration or multi-threaded compression) should be easily addable in the future.
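One common way to meet the performance and scalability goals for large files is to compress in fixed-size chunks rather than loading a whole file into memory. The sketch below assumes DEFLATE via the standard `zlib` module; the function names and the 64 KiB chunk size are illustrative choices, not part of the design.

```python
import io
import zlib
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # assumed buffer size; tune for the workload


def compress_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> None:
    """Compress src into dst without holding the whole input in memory."""
    compressor = zlib.compressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(compressor.compress(chunk))
    dst.write(compressor.flush())  # emit any buffered output and the trailer


def decompress_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> None:
    """Reverse compress_stream, again one chunk at a time."""
    decompressor = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decompressor.decompress(chunk))
    dst.write(decompressor.flush())


# Usage with in-memory streams; real code would pass open file objects
source = io.BytesIO(b"large file contents " * 100_000)
packed = io.BytesIO()
compress_stream(source, packed)

packed.seek(0)
unpacked = io.BytesIO()
decompress_stream(packed, unpacked)
```

Peak memory on the compression side stays close to the chunk size regardless of input size.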


2. Key Components & Class Design

CompressionContext (Facade)

  • This class will act as the entry point to the compression system, providing users with simple APIs to compress and decompress files or directories.

CompressionStrategy (Strategy Pattern)

  • An abstract class or interface that defines the compress() and decompress() methods. This can be extended by concrete compression strategies for different algorithms.

  • Concrete Implementations:

    • HuffmanCompression

    • LZWCompression

    • DEFLATECompression

    • JPEGCompression

    • MP3Compression

FileHandler (Utility Class)

  • Manages reading and writing files, including handling both the original and compressed formats.

  • Handles file metadata storage (e.g., file size, timestamps).

CompressionManager (Invoker)

  • Coordinates the compression process. It selects the appropriate algorithm based on the file type and triggers the compression using the corresponding strategy.

CompressionFactory (Factory Pattern)

  • Creates instances of different compression strategies based on the file type or user preferences.


3. Class Diagram Overview

```text
+---------------------------------+
|       CompressionContext        |
|---------------------------------|
| - compressionManager:           |
| - strategyFactory:              |
|---------------------------------|
| + compressFiles(): void         |
| + decompressFiles(): void       |
+---------------------------------+
                 |
                 v
+---------------------------------+      +------------------------+
|       CompressionManager        |<---->|  CompressionStrategy   |
|---------------------------------|      |------------------------|
| - strategy: CompressionStrategy |      | + compress(): void     |
|---------------------------------|      | + decompress(): void   |
| + compressFiles(): void         |      +------------------------+
| + decompressFiles(): void       |                  |
+---------------------------------+                  v
                 |                     +-----------------------------+
                 |                     |    Concrete Compression     |
                 |                     |       Implementations       |
                 |                     |-----------------------------|
                 v                     | - HuffmanCompression        |
+---------------------------------+    | - LZWCompression            |
|           FileHandler           |    | - DEFLATECompression        |
|---------------------------------|    | - JPEGCompression           |
| + readFile(): File              |    | - MP3Compression            |
| + writeFile(): void             |    +-----------------------------+
| + getMetadata(): FileMetadata   |
+---------------------------------+
                 |
                 v
+---------------------+
| CompressionFactory  |
|---------------------|
| + createStrategy()  |
+---------------------+
```

4. Sequence Diagram

Compressing Files:

  1. The user calls CompressionContext.compressFiles().

  2. CompressionContext delegates the compression task to the CompressionManager.

  3. CompressionManager queries CompressionFactory to get the appropriate compression strategy based on file types.

  4. FileHandler reads each input file.

  5. The selected strategy (e.g., HuffmanCompression) compresses the contents of each file.

  6. The compressed file is written to the output location.
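The compression steps above can be sketched end to end as a single function. This is a simplified illustration, not the class design itself: it assumes DEFLATE via the standard `zlib` module, selects the compressor by file extension, and uses a hypothetical `.z` output suffix.

```python
import os
import tempfile
import zlib
from typing import Callable, Dict, List

# Hypothetical extension-to-compressor table standing in for CompressionFactory
COMPRESSORS: Dict[str, Callable[[bytes], bytes]] = {
    ".txt": lambda data: zlib.compress(data, 9),  # favour ratio for text
}


def default_compressor(data: bytes) -> bytes:
    return zlib.compress(data)


def compress_files(paths: List[str], out_dir: str) -> List[str]:
    """Compress each input file into out_dir and return the output paths."""
    outputs = []
    for path in paths:
        # Step 3: pick a compressor based on the file type
        ext = os.path.splitext(path)[1].lower()
        compress = COMPRESSORS.get(ext, default_compressor)
        # Steps 4 and 5: read the input and compress its contents
        with open(path, "rb") as src:
            payload = compress(src.read())
        # Step 6: write the compressed file to the output location
        out_path = os.path.join(out_dir, os.path.basename(path) + ".z")
        with open(out_path, "wb") as dst:
            dst.write(payload)
        outputs.append(out_path)
    return outputs


# Usage: compress one temporary text file and read the result back
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "notes.txt")
    with open(source, "wb") as f:
        f.write(b"hello compression " * 100)
    (output,) = compress_files([source], tmp)
    with open(output, "rb") as f:
        restored = zlib.decompress(f.read())
```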

Decompressing Files:

  1. The user calls CompressionContext.decompressFiles().

  2. CompressionContext delegates the decompression task to the CompressionManager.

  3. CompressionManager determines the appropriate decompression strategy (based on file signature or metadata).

  4. FileHandler reads the compressed file, and the selected strategy decompresses it.
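Choosing a decompression strategy from a file signature usually means inspecting the first few bytes of the file. The sketch below checks a handful of well-known magic numbers; the `detect_format` helper and the format labels it returns are hypothetical, and a real system would map each label to a strategy.

```python
import bz2
import gzip
import lzma
from typing import Optional

# (magic bytes, format label) pairs for a few well-known formats
SIGNATURES = [
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "zip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\xff\xd8\xff", "jpeg"),
]


def detect_format(header: bytes) -> Optional[str]:
    """Return a format label for the leading bytes, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


# Usage: only the first bytes of each payload are needed
gzip_kind = detect_format(gzip.compress(b"hello")[:8])
bzip2_kind = detect_format(bz2.compress(b"hello")[:8])
xz_kind = detect_format(lzma.compress(b"hello")[:8])
unknown_kind = detect_format(b"plain text")
```

Signature checks are more reliable than file extensions, which users can rename freely.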


5. Class Details

CompressionContext

```python
from typing import List


class CompressionContext:
    """Facade: the single entry point for compressing and decompressing."""

    def __init__(self, compression_manager, strategy_factory):
        self.compression_manager = compression_manager
        self.strategy_factory = strategy_factory

    def compressFiles(self, files: List[str]):
        # Delegate the work to the manager
        self.compression_manager.compressFiles(files)

    def decompressFiles(self, compressed_files: List[str]):
        self.compression_manager.decompressFiles(compressed_files)
```

CompressionManager

```python
from typing import List


class CompressionManager:
    """Invoker: runs the configured strategy over each file."""

    def __init__(self, strategy: CompressionStrategy):
        self.strategy = strategy

    def compressFiles(self, files: List[str]):
        for file in files:
            compressed_file = self.strategy.compress(file)
            FileHandler.writeFile(compressed_file)

    def decompressFiles(self, compressed_files: List[str]):
        for compressed_file in compressed_files:
            decompressed_file = self.strategy.decompress(compressed_file)
            FileHandler.writeFile(decompressed_file)
```

CompressionStrategy

```python
from abc import ABC, abstractmethod


class CompressionStrategy(ABC):
    @abstractmethod
    def compress(self, file: str) -> str:
        pass

    @abstractmethod
    def decompress(self, compressed_file: str) -> str:
        pass
```

HuffmanCompression (Concrete Strategy)

```python
class HuffmanCompression(CompressionStrategy):
    def compress(self, file: str) -> str:
        # Huffman compression algorithm logic
        return "compressed_huffman_file"

    def decompress(self, compressed_file: str) -> str:
        # Huffman decompression algorithm logic
        return "decompressed_file"
```
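To make the placeholder logic more concrete, here is a minimal sketch of Huffman coding itself. It works on in-memory bytes and represents the encoded output as a string of "0" and "1" characters for readability; a real implementation would pack bits into bytes and store the code table alongside the data. The function names are illustrative and not part of the class above.

```python
import heapq
from collections import Counter
from typing import Dict


def build_codes(data: bytes) -> Dict[int, str]:
    """Return a prefix-free bit string for every byte value in data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: bytes, codes: Dict[int, str]) -> str:
    return "".join(codes[b] for b in data)


def decode(bits: str, codes: Dict[int, str]) -> bytes:
    reverse = {code: sym for sym, code in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix-free codes make this unambiguous
            out.append(reverse[current])
            current = ""
    return bytes(out)


message = b"abracadabra"
codes = build_codes(message)
bits = encode(message, codes)
decoded = decode(bits, codes)
```

Frequent symbols such as `a` receive shorter codes, so the encoded bit string is shorter than the 8 bits per byte of the input.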

FileHandler

```python
class FileHandler:
    @staticmethod
    def readFile(file: str) -> bytes:
        # Read the raw file content; binary mode works for any file type
        with open(file, "rb") as f:
            return f.read()

    @staticmethod
    def writeFile(file: str):
        # Write compressed or decompressed file (placeholder)
        pass

    @staticmethod
    def getMetadata(file: str) -> dict:
        # Return file metadata (size, creation date); placeholder values
        return {"size": 1024, "created_at": "2025-07-16"}
```
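The hard-coded metadata above could be replaced with values read from the file system. The sketch below uses `os.stat` to capture metadata and `os.utime` to restore the modification time after decompression. File creation time is platform-dependent, so this sketch records only the modification time; the function names and dictionary keys are assumptions.

```python
import os
import tempfile
from datetime import datetime, timezone


def get_metadata(path: str) -> dict:
    """Collect the metadata needed to restore a file later."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size": st.st_size,
        "mtime": st.st_mtime,  # seconds since the epoch
        "modified_at": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
    }


def restore_mtime(path: str, metadata: dict) -> None:
    """Reapply the recorded modification time to a restored file."""
    os.utime(path, (metadata["mtime"], metadata["mtime"]))


# Usage: record metadata for one file, then apply its timestamp to another
with tempfile.TemporaryDirectory() as tmp:
    original_path = os.path.join(tmp, "report.txt")
    with open(original_path, "wb") as f:
        f.write(b"12345")
    os.utime(original_path, (1_000_000_000, 1_000_000_000))
    meta = get_metadata(original_path)

    restored_path = os.path.join(tmp, "restored.txt")
    with open(restored_path, "wb") as f:
        f.write(b"12345")
    restore_mtime(restored_path, meta)
    restored_mtime = int(os.stat(restored_path).st_mtime)
```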

CompressionFactory

```python
class CompressionFactory:
    @staticmethod
    def createStrategy(file_type: str) -> CompressionStrategy:
        # JPEGCompression and DEFLATECompression are assumed to be defined
        # in the same way as HuffmanCompression
        if file_type == "text":
            return HuffmanCompression()
        elif file_type == "image":
            return JPEGCompression()
        else:
            return DEFLATECompression()
```
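An if/elif chain has to be edited every time an algorithm is added. One possible alternative, sketched here under stated assumptions, is a registry that strategies join through a decorator. The `Strategy` base class, the `register` decorator, and both concrete classes are hypothetical stand-ins that operate on bytes rather than file paths, and both use DEFLATE via `zlib` for simplicity.

```python
import zlib
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class Strategy(ABC):
    """Simplified stand-in for CompressionStrategy, working on bytes."""

    @abstractmethod
    def compress(self, data: bytes) -> bytes: ...

    @abstractmethod
    def decompress(self, payload: bytes) -> bytes: ...


_REGISTRY: Dict[str, Type[Strategy]] = {}


def register(file_type: str) -> Callable[[Type[Strategy]], Type[Strategy]]:
    """Class decorator that maps a file type to a strategy class."""
    def decorator(cls: Type[Strategy]) -> Type[Strategy]:
        _REGISTRY[file_type] = cls
        return cls
    return decorator


@register("default")
class DeflateStrategy(Strategy):
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def decompress(self, payload: bytes) -> bytes:
        return zlib.decompress(payload)


@register("text")
class MaxDeflateStrategy(DeflateStrategy):
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data, 9)  # highest compression level


def create_strategy(file_type: str) -> Strategy:
    """Look up the strategy for file_type, falling back to the default."""
    return _REGISTRY.get(file_type, _REGISTRY["default"])()


text_strategy = create_strategy("text")
fallback_strategy = create_strategy("video")  # unregistered type
round_trip = text_strategy.decompress(text_strategy.compress(b"hello " * 20))
```

Adding an algorithm then means defining one decorated class, with no change to `create_strategy`.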

6. Conclusion

The design applies three object-oriented patterns: the Strategy Pattern for interchangeable algorithms, the Factory Pattern for creating and extending compression strategies, and the Facade Pattern to simplify user interaction. New algorithms or features can therefore be added without changing the overall architecture, which keeps the system extensible as requirements grow.
