The Palos Publishing Company

Follow Us On The X Platform @PalosPublishing
Categories We Write About

Create a content archive with Python

Creating a content archive with Python typically means collecting, organizing, and storing content—such as files, articles, or other digital assets—into a structured format for later access or retrieval. Below is a basic yet flexible example of how to build a content archive using Python.


Step-by-Step Guide: Building a Content Archive with Python

1. Define the Archive Structure

We’ll create a directory-based archive where each content item is stored in its own folder with metadata and content files.

python
import os import json from datetime import datetime ARCHIVE_ROOT = 'content_archive' def init_archive(): if not os.path.exists(ARCHIVE_ROOT): os.makedirs(ARCHIVE_ROOT) print(f"Archive initialized at '{ARCHIVE_ROOT}'")

2. Create and Save Content

Each content item will include:

  • A metadata.json file (title, tags, date)

  • A content.txt file for the main body

python
def add_content(title, tags, content): timestamp = datetime.now().strftime('%Y%m%d_%H%M%S') slug = title.lower().replace(' ', '_')[:30] folder_name = f"{timestamp}_{slug}" content_path = os.path.join(ARCHIVE_ROOT, folder_name) os.makedirs(content_path, exist_ok=True) metadata = { 'title': title, 'tags': tags, 'created_at': timestamp } with open(os.path.join(content_path, 'metadata.json'), 'w') as meta_file: json.dump(metadata, meta_file, indent=4) with open(os.path.join(content_path, 'content.txt'), 'w') as content_file: content_file.write(content) print(f"Content '{title}' saved in '{content_path}'")

3. List Archived Content

This function retrieves all archived items with their metadata.

python
def list_contents(): for folder in os.listdir(ARCHIVE_ROOT): folder_path = os.path.join(ARCHIVE_ROOT, folder) meta_file = os.path.join(folder_path, 'metadata.json') if os.path.isfile(meta_file): with open(meta_file, 'r') as f: meta = json.load(f) print(f"Title: {meta['title']}, Tags: {meta['tags']}, Date: {meta['created_at']}")

4. Search Content by Tag or Title

Allow searching based on metadata fields.

python
def search_contents(keyword): results = [] for folder in os.listdir(ARCHIVE_ROOT): folder_path = os.path.join(ARCHIVE_ROOT, folder) meta_file = os.path.join(folder_path, 'metadata.json') if os.path.isfile(meta_file): with open(meta_file, 'r') as f: meta = json.load(f) if keyword.lower() in meta['title'].lower() or keyword.lower() in [t.lower() for t in meta['tags']]: results.append(meta) return results

5. View Specific Content

Display content and metadata of a given item.

python
def view_content(title_keyword): for folder in os.listdir(ARCHIVE_ROOT): folder_path = os.path.join(ARCHIVE_ROOT, folder) meta_file = os.path.join(folder_path, 'metadata.json') content_file = os.path.join(folder_path, 'content.txt') if os.path.isfile(meta_file): with open(meta_file, 'r') as f: meta = json.load(f) if title_keyword.lower() in meta['title'].lower(): with open(content_file, 'r') as cf: body = cf.read() print(f"Title: {meta['title']}") print(f"Tags: {meta['tags']}") print(f"Date: {meta['created_at']}") print("Content:n", body) break

Example Usage

python
init_archive() add_content( title="Understanding Python Decorators", tags=["python", "programming", "decorators"], content="Decorators in Python are functions that modify the behavior of another function..." ) add_content( title="SEO Tips for Content Creators", tags=["seo", "marketing", "content"], content="Search Engine Optimization (SEO) is essential for visibility in search results..." ) list_contents() results = search_contents("seo") print("Search Results:", results) view_content("SEO Tips")

Additional Enhancements

  • Full-text search: Use a database or indexing tool like SQLite or Whoosh.

  • GUI/CLI interface: Add a command-line menu or Tkinter interface.

  • Export/Backup: Zip folders for backup or export.

  • Web interface: Build a Flask or FastAPI app for web-based archive access.

This basic structure gives you a scalable and editable archive system for managing content effectively using only Python and local files.

Share this Page your favorite way: Click any app below to share.

Enter your email below to join The Palos Publishing Company Email List

We respect your email privacy

Categories We Write About