The Palos Publishing Company

Extract and log hashtags from Instagram exports

To extract and log hashtags from an Instagram export, you process the data Instagram gives you: typically a .zip archive containing multiple .json files (or text logs) of posts, captions, comments, and so on.

Here’s a clear step-by-step breakdown to extract and log hashtags using a Python script:


📦 1. Download Your Instagram Export

From Instagram:

  • Go to Settings > Privacy and Security > Data Download or Instagram’s Data Download page.

  • Request your data. Once downloaded, extract the .zip file.

  • Locate the relevant files:

    • media.json

    • comments.json

    • posts.json

    • Or content.json depending on the archive
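The unpack-and-locate step can also be scripted. The following is a minimal sketch, assuming the export is a single .zip file; the function name `unpack_export` and the paths in the usage comment are hypothetical. It searches subfolders too, since some archives place their .json files in nested directories.

```python
import zipfile
from pathlib import Path

def unpack_export(zip_path, dest_dir):
    """Extract the export archive and return every .json file found under it."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # rglob descends into subfolders, so nested JSON files are found as well
    return sorted(dest.rglob("*.json"))

# Hypothetical usage (paths are placeholders):
# for path in unpack_export("instagram-export.zip", "instagram_data"):
#     print(path)
```

The file names it prints show which of the files listed above your archive actually contains.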


🧾 2. Python Script to Extract Hashtags

python
import json
import os
import re

# Directory where the extracted JSON files are stored
data_folder = "path/to/your/instagram/data"

# A hashtag is "#" followed by one or more word characters
hashtag_pattern = re.compile(r"#\w+")

# Every hashtag found, duplicates included
all_hashtags = []

def extract_hashtags_from_text(text):
    return hashtag_pattern.findall(text)

# Loop through the JSON files in the folder (subfolders are not visited)
for file_name in os.listdir(data_folder):
    if not file_name.endswith(".json"):
        continue
    file_path = os.path.join(data_folder, file_name)
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError) as e:
        # ValueError covers both invalid JSON and undecodable bytes
        print(f"Error reading {file_name}: {e}")
        continue

    if isinstance(data, list):
        # A list of entries: check the fields that usually hold free text
        for entry in data:
            if not isinstance(entry, dict):
                continue
            for key in ["caption", "text", "title", "comment"]:
                if isinstance(entry.get(key), str):
                    all_hashtags.extend(extract_hashtags_from_text(entry[key]))
    elif isinstance(data, dict):
        # A single object: check its top-level string values
        for value in data.values():
            if isinstance(value, str):
                all_hashtags.extend(extract_hashtags_from_text(value))

# Normalize to lowercase and deduplicate
unique_hashtags = sorted(set(tag.lower() for tag in all_hashtags))

# Log the results
print("\nExtracted Hashtags:")
for tag in unique_hashtags:
    print(tag)

🧠 How It Works

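  • Scans every .json file directly inside the folder (subfolders are not visited).

  • In list-style files, checks each entry's caption, text, title, and comment fields; in object-style files, checks the top-level string values.

  • Uses a regular expression to find hashtags like #example.

  • Lowercases, deduplicates, and sorts the hashtags, then prints the final list.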
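If the text fields in your archive sit deeper inside nested objects, a recursive walk is one way to reach them. This is a sketch of a variation on the approach above, not part of the original script: it scans every string in the parsed JSON rather than only the four named keys, and the helper names `iter_strings` and `hashtags_in` are hypothetical.

```python
import re

# Assumption: a hashtag is "#" followed by word characters (letters, digits, underscore)
HASHTAG_RE = re.compile(r"#\w+")

def iter_strings(node):
    """Yield every string found anywhere inside a parsed JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)

def hashtags_in(data):
    """Return all hashtags from any string in the structure, in encounter order."""
    tags = []
    for text in iter_strings(data):
        tags.extend(HASHTAG_RE.findall(text))
    return tags

sample = [{"media": [{"title": "Sunset #Travel #nature"}]}, {"caption": "No tags here"}]
print(hashtags_in(sample))  # ['#Travel', '#nature']
```

Scanning every string is broader than checking named keys, so it may also pick up "#" fragments from values that are not captions, such as URLs.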


📝 Output Example

yaml
Extracted Hashtags:
#coding
#instagood
#nature
#photooftheday
#travel

🔧 Optional Enhancements

  • Save to a .txt or .csv file

  • Track hashtag frequency:

python
from collections import Counter

counter = Counter(all_hashtags)
print(counter.most_common(10))
  • Export to Excel with pandas
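The first two enhancements can be combined. Below is a minimal sketch using only the standard library, assuming you already have a list of extracted hashtags; the function name `write_hashtag_counts` and the output file name are hypothetical.

```python
import csv
from collections import Counter

def write_hashtag_counts(hashtags, csv_path):
    """Count hashtags case-insensitively and write them to a CSV, most frequent first."""
    counts = Counter(tag.lower() for tag in hashtags)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["hashtag", "count"])
        writer.writerows(counts.most_common())
    return counts

# Hypothetical usage with the list built by the script above:
# write_hashtag_counts(all_hashtags, "hashtags.csv")
```

The resulting file opens directly in a spreadsheet application, which may be enough if you do not want to install pandas.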

