The Palos Publishing Company

Categories We Write About

Scrape all image alt texts from websites

Category: Computer Science

To scrape image alt texts from a web page, you can use Python with the requests and BeautifulSoup libraries. The script below fetches a single page and collects the non-empty alt attributes from its <img> tags:

python
import requests
from bs4 import BeautifulSoup

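def get_image_alt_texts(url):
    """Return the non-empty alt texts of all <img> tags on the page at url."""
    try:
        # A timeout keeps the script from hanging on an unresponsive server.
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException as e:
        # Return an empty list (not an error string) so callers can always iterate.
        print(f"Error: {e}")
        return []
    soup = BeautifulSoup(response.content, "html.parser")
    # Strip whitespace first so an alt of only spaces is treated as empty.
    alt_texts = [img.get("alt", "").strip() for img in soup.find_all("img")]
    return [alt for alt in alt_texts if alt]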

# Example usage
url = "https://example.com"
alt_texts = get_image_alt_texts(url)
for i, alt in enumerate(alt_texts, 1):
    print(f"{i}. {alt}")

Requirements:

Install dependencies if you haven’t already:

bash
pip install requests beautifulsoup4

Notes:

This code only works on static HTML content. It won’t load JavaScript-rendered images.
If the website uses dynamic content, consider using Selenium or Playwright.
Make sure you comply with the site’s robots.txt and terms of service before scraping.
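
The notes above can be sketched in code. The following is a minimal, non-authoritative example that assumes Playwright is installed (pip install playwright, then playwright install chromium). The function names is_allowed, clean_alt_texts, and get_rendered_alt_texts, the 15-second timeout, and the "networkidle" wait strategy are illustrative choices, not part of the original script. The robots.txt check uses Python's standard urllib.robotparser module and covers only robots.txt rules, not a site's terms of service.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt rules supplied as plain text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def clean_alt_texts(raw_alts):
    """Drop missing or whitespace-only values and trim the rest."""
    return [alt.strip() for alt in raw_alts if alt and alt.strip()]


def get_rendered_alt_texts(url, timeout_ms=15000):
    """Load the page in headless Chromium and collect alt texts after scripts run."""
    # Imported here so the helpers above work even without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits for network activity to settle; this is only a
            # rough proxy for "rendering finished" and may need tuning per site.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            # Read the alt attribute of every <img> in the rendered DOM.
            raw_alts = page.eval_on_selector_all(
                "img", "imgs => imgs.map(img => img.getAttribute('alt'))"
            )
        finally:
            browser.close()
    return clean_alt_texts(raw_alts)


# Example usage (hypothetical URL):
# for i, alt in enumerate(get_rendered_alt_texts("https://example.com"), 1):
#     print(f"{i}. {alt}")
```

To check a live site rather than a robots.txt string, RobotFileParser also offers set_url() and read(), which download the file before can_fetch() is called.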
