The Palos Publishing Company

Parse GitHub contribution graphs

Parsing a GitHub contribution graph means extracting and interpreting the data behind the activity calendar on a user’s GitHub profile. The graph shows the number of contributions per day (commits, plus activity such as pull requests, issues, and reviews), typically over the past year, as a grid of colored squares in which darker colors indicate higher activity.

Here’s a breakdown of how to parse GitHub contribution graphs programmatically:


1. Understanding the Graph Structure

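GitHub’s contribution graph has been rendered using SVG (Scalable Vector Graphics), with each day drawn as a <rect> element. GitHub changes this markup from time to time (more recent versions of the page may use table cells instead and may omit data-count), so inspect the current HTML before relying on any attribute. In the SVG form, each day carries attributes that include: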

  • data-date: the date (e.g., "2025-05-18")

  • data-count: number of contributions (e.g., "3")

  • data-level: intensity bucket from 0 to 4 (e.g., "2")

  • fill: color value representing intensity (e.g., "#216e39")

Example:

html
<rect class="ContributionCalendar-day" data-date="2025-05-18" data-level="2" x="14" y="0" width="10" height="10" fill="#40c463" data-count="3"/>
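
To make the attribute list concrete, here is a minimal sketch that reads the sample element above with Python's standard xml.etree.ElementTree module. The helper name parse_day is hypothetical, and treating a missing data-count or data-level as zero is an assumption for illustration, not documented GitHub behavior.

```python
import xml.etree.ElementTree as ET

# The sample element from above; a self-closing <rect> is well-formed XML.
SAMPLE_RECT = (
    '<rect class="ContributionCalendar-day" data-date="2025-05-18" '
    'data-level="2" x="14" y="0" width="10" height="10" '
    'fill="#40c463" data-count="3"/>'
)


def parse_day(rect_markup):
    """Return the date, count, level, and fill color of one day cell."""
    rect = ET.fromstring(rect_markup)
    return {
        "date": rect.get("data-date"),
        # Assumption: a missing attribute is treated as zero.
        "count": int(rect.get("data-count", 0)),
        "level": int(rect.get("data-level", 0)),
        "fill": rect.get("fill"),
    }


day = parse_day(SAMPLE_RECT)
print(day)
```

Attribute values always arrive as strings, so the sketch converts the count and level to integers before any arithmetic is done on them.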

2. Fetching the Graph Data

You can scrape the contribution graph using Python with requests and BeautifulSoup.

python
import requests
from bs4 import BeautifulSoup

username = "octocat"
url = f"https://github.com/users/{username}/contributions"

# Fail fast on network problems or HTTP error responses.
response = requests.get(url, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.content, "html.parser")

contributions = []
for rect in soup.find_all("rect", {"class": "ContributionCalendar-day"}):
    date = rect.get("data-date")
    if date is None:
        continue  # skip squares that carry no date
    count = int(rect.get("data-count", 0))
    contributions.append({"date": date, "count": count})

# Sample output
print(contributions[:5])
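
It can help to keep the parsing step separate from the download so it can be tested against a saved copy of the page without any network traffic. The sketch below swaps BeautifulSoup for the standard library's html.parser to stay dependency-free. The class and function names are hypothetical, and the sample markup is a hand-written stand-in rather than real GitHub output.

```python
from html.parser import HTMLParser


class ContributionParser(HTMLParser):
    """Collect date/count pairs from elements marked as calendar days."""

    def __init__(self):
        super().__init__()
        self.days = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "ContributionCalendar-day" not in classes:
            return
        date = attributes.get("data-date")
        if date is None:
            return  # Skip any element without a date.
        count = attributes.get("data-count")
        self.days.append({
            "date": date,
            # None marks "count absent from the markup" rather than zero.
            "count": int(count) if count is not None else None,
        })


def extract_days(html_text):
    """Parse an HTML string and return the list of day records found."""
    parser = ContributionParser()
    parser.feed(html_text)
    parser.close()
    return parser.days


# Hand-written sample markup, used here instead of a live request.
SAMPLE_HTML = """
<svg>
  <rect class="ContributionCalendar-day" data-date="2025-05-18" data-count="3"/>
  <rect class="ContributionCalendar-day" data-date="2025-05-19" data-count="0"/>
  <rect class="ContributionCalendar-day" data-level="1"/>
</svg>
"""

days = extract_days(SAMPLE_HTML)
print(days)
```

Because the parser matches on the class name rather than the tag name, the same function would also pick up day cells that are not <rect> elements, as long as they keep the ContributionCalendar-day class and a data-date attribute.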

3. Visualizing or Analyzing Parsed Data

Once you’ve extracted the data, you can analyze patterns or trends:

  • Total Contributions:

    python
    total = sum(day['count'] for day in contributions)
  • Most Active Day:

    python
    most_active = max(contributions, key=lambda x: x['count'])
  • Weekly or Monthly Aggregates:

    python
    import pandas as pd

    df = pd.DataFrame(contributions)
    df['date'] = pd.to_datetime(df['date'])
    # Sum the counts over weekly bins that end on Monday.
    weekly_summary = df.resample('W-MON', on='date')['count'].sum()
  • Heatmap:
    Use libraries like matplotlib, seaborn, or plotly to generate a visual calendar-style heatmap.
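
A calendar-style heatmap needs the daily counts arranged as a grid of weekdays by weeks before any plotting library is involved. The following is a minimal sketch of that step using only the standard library. The function name build_grid is hypothetical, and placing Sunday in the first row is an assumption meant to mirror the usual calendar layout.

```python
from datetime import date, timedelta


def build_grid(contributions):
    """Arrange daily counts into 7 rows (Sunday to Saturday) by week columns."""
    counts = {date.fromisoformat(d["date"]): d["count"] for d in contributions}
    if not counts:
        return [[] for _ in range(7)]
    first, last = min(counts), max(counts)
    # date.weekday() is Monday=0 .. Sunday=6, so (weekday + 1) % 7 is the
    # number of days since the most recent Sunday.
    start = first - timedelta(days=(first.weekday() + 1) % 7)
    n_weeks = (last - start).days // 7 + 1
    # None marks days for which no data is available.
    grid = [[None] * n_weeks for _ in range(7)]
    for day, count in counts.items():
        offset = (day - start).days
        grid[offset % 7][offset // 7] = count
    return grid


sample = [
    {"date": "2025-05-18", "count": 3},  # a Sunday
    {"date": "2025-05-19", "count": 0},
    {"date": "2025-05-20", "count": 5},
    {"date": "2025-05-26", "count": 2},  # Monday of the following week
]
grid = build_grid(sample)
for row in grid:
    print(row)
```

Once the None placeholders are replaced with a numeric value, a grid like this could be handed to a plotting call such as matplotlib's imshow.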


4. Using GitHub API (Alternative Method)

GitHub’s REST API doesn’t directly expose the contribution graph, but you can use it to retrieve a user’s recent public events:

bash
curl https://api.github.com/users/octocat/events/public
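
The events endpoint returns a JSON array in which each event has a created_at timestamp, so a rough per-day activity count can be derived from it. The sketch below assumes the response has already been decoded into a list of dictionaries; the function name and the sample events are hypothetical. Events cover recent activity only and do not map one-to-one onto the contributions shown in the calendar, so treat the result as an approximation.

```python
from collections import Counter


def events_per_day(events):
    """Count events by the calendar date in their created_at timestamp."""
    # created_at is an ISO 8601 UTC timestamp such as "2025-05-18T09:30:00Z";
    # its first ten characters are the date.
    return Counter(event["created_at"][:10] for event in events)


# Hand-written sample in the shape of the events response (fields trimmed).
SAMPLE_EVENTS = [
    {"type": "PushEvent", "created_at": "2025-05-18T09:30:00Z"},
    {"type": "IssuesEvent", "created_at": "2025-05-18T14:02:11Z"},
    {"type": "PushEvent", "created_at": "2025-05-17T22:15:40Z"},
]

per_day = events_per_day(SAMPLE_EVENTS)
print(per_day)
```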

Or use the GraphQL API, which does expose the contribution calendar (requests must be authenticated with a token):

graphql
query {
  user(login: "octocat") {
    contributionsCollection {
      contributionCalendar {
        totalContributions
        weeks {
          contributionDays {
            date
            contributionCount
          }
        }
      }
    }
  }
}
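
A query like this is sent as a POST request to the GraphQL endpoint, and the nested weeks in the response are usually easier to work with once flattened. The sketch below prepares such a request with the standard library and flattens a response. The helper names, the login variable, and the sample payload are assumptions for illustration, and the token must be supplied by the caller.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

QUERY = """
query($login: String!) {
  user(login: $login) {
    contributionsCollection {
      contributionCalendar {
        totalContributions
        weeks { contributionDays { date contributionCount } }
      }
    }
  }
}
"""


def build_request(login, token):
    """Prepare (but do not send) an authenticated GraphQL POST request."""
    body = json.dumps({"query": QUERY, "variables": {"login": login}})
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def flatten_calendar(payload):
    """Turn the nested weeks/contributionDays response into a flat list."""
    calendar = payload["data"]["user"]["contributionsCollection"][
        "contributionCalendar"
    ]
    return [
        {"date": day["date"], "count": day["contributionCount"]}
        for week in calendar["weeks"]
        for day in week["contributionDays"]
    ]


# Hand-written payload in the shape the query above asks for.
SAMPLE_PAYLOAD = {"data": {"user": {"contributionsCollection": {
    "contributionCalendar": {
        "totalContributions": 4,
        "weeks": [
            {"contributionDays": [
                {"date": "2025-05-17", "contributionCount": 1},
            ]},
            {"contributionDays": [
                {"date": "2025-05-18", "contributionCount": 3},
                {"date": "2025-05-19", "contributionCount": 0},
            ]},
        ],
    }}}}}

flat = flatten_calendar(SAMPLE_PAYLOAD)
print(flat)
```

To run it against the live API, the prepared request could be passed to urllib.request.urlopen and the response body decoded with json.load before flattening. The flat list has the same date/count shape as the scraped data, so the analysis snippets in section 3 apply to it unchanged.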

5. Libraries for Parsing Contribution Graphs

  • GitHub Contribution Graph API Wrappers:

  • Visual Tools:

    • contribution-graph (Node.js)

    • GitHub Archive Data


6. Use Cases of Parsed Graph Data

  • Personal Dashboards: Show GitHub activity visually on personal websites.

  • Productivity Tracking: Correlate contributions with productivity or project milestones.

  • Recruiter Tools: Analyze coding consistency for developer profiles.

  • Gamification: Build gamified systems for tracking daily coding streaks.
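
As one example of the gamification use case, a streak can be computed directly from the parsed date/count records. This is a minimal sketch with a hypothetical function name; it defines a streak as consecutive calendar days with at least one contribution, which is an assumption about how a streak should be scored.

```python
from datetime import date, timedelta


def longest_streak(contributions):
    """Return the longest run of consecutive days with a contribution."""
    # A set removes duplicate dates before sorting.
    active = sorted(
        {date.fromisoformat(d["date"]) for d in contributions if d["count"] > 0}
    )
    best = current = 0
    previous = None
    for day in active:
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # The run continues.
        else:
            current = 1  # A gap (or the first active day) starts a new run.
        best = max(best, current)
        previous = day
    return best


sample = [
    {"date": "2025-05-18", "count": 3},
    {"date": "2025-05-19", "count": 1},
    {"date": "2025-05-20", "count": 0},
    {"date": "2025-05-21", "count": 2},
    {"date": "2025-05-22", "count": 4},
    {"date": "2025-05-23", "count": 1},
]
streak = longest_streak(sample)
print(streak)
```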


7. Rate Limiting and Ethics

When scraping GitHub pages:

  • Respect robots.txt rules.

  • Avoid frequent or bulk scraping.

  • Use authenticated API access for intensive or automated tasks.
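
For API access, GitHub reports the remaining request quota in response headers such as X-RateLimit-Remaining and X-RateLimit-Reset (a Unix timestamp). The sketch below shows one way a client might decide how long to pause before its next call. The function name is hypothetical, the headers are assumed to be available in a plain dictionary with exactly this capitalization, and treating a missing header as "quota available" is an assumption.

```python
import time


def seconds_until_allowed(headers, now=None):
    """Return how long to wait before the next API call, in seconds."""
    now = time.time() if now is None else now
    # Assumption: if the header is absent, quota is treated as available.
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is the Unix time at which the quota resets.
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, float(reset_at - now))


exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}
wait = seconds_until_allowed(exhausted, now=940)
print(wait)
```

A caller could pass the result to time.sleep before retrying. These headers apply to API responses, so scraped HTML pages still call for a conservative fixed delay between requests.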


8. Conclusion

Parsing GitHub contribution graphs provides valuable insights into a developer’s coding patterns. Whether for personal metrics, team productivity tracking, or dashboard integration, extracting data from GitHub’s SVG-based graph or via APIs offers a powerful way to make contribution data actionable.
