The Palos Publishing Company


Create a web scraper for real-time headlines

Here’s a Python web scraper that uses requests and BeautifulSoup to fetch the current headlines from a news website (CNN is the example here). You can customize the URL and HTML selectors to fit the site you want.

python
import requests
from bs4 import BeautifulSoup


def get_real_time_headlines(url):
    # A browser-like User-Agent makes the request less likely to be rejected as a bot.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'
    }
    try:
        # The timeout keeps the script from hanging on an unresponsive server.
        response = requests.get(url, headers=headers, timeout=10)
    except requests.RequestException as exc:
        print(f"Failed to retrieve page: {exc}")
        return []

    if response.status_code != 200:
        print(f"Failed to retrieve page: Status code {response.status_code}")
        return []

    soup = BeautifulSoup(response.text, 'html.parser')

    # Example for CNN - headlines are in <h3> tags with class 'cd__headline'.
    # Site markup changes over time, so check the live page and update the selector.
    headlines_html = soup.find_all('h3', class_='cd__headline')
    headlines = [headline.get_text(strip=True) for headline in headlines_html]
    return headlines


if __name__ == "__main__":
    url = "https://edition.cnn.com/world"  # Change to your preferred news site
    headlines = get_real_time_headlines(url)
    for i, headline in enumerate(headlines, start=1):
        print(f"{i}. {headline}")
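
Before pointing the scraper at a live site, it can help to try the tag and class selector against a small piece of HTML. The sketch below is only an illustration: the `extract_headlines` helper and the `SAMPLE_HTML` string are hypothetical and stand in for a downloaded page, with the selector passed in as parameters so it is easy to swap.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <h3 class="cd__headline"><a href="/a">  First story </a></h3>
  <h3 class="cd__headline"><a href="/b">Second story</a></h3>
  <h3 class="other">Not a headline</h3>
</body></html>
"""


def extract_headlines(html, tag="h3", class_name="cd__headline"):
    """Return the stripped text of every element matching tag and class."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.find_all(tag, class_=class_name)]


if __name__ == "__main__":
    for headline in extract_headlines(SAMPLE_HTML):
        print(headline)
```

Keeping the parsing step separate from the network request means a changed selector can be checked against saved HTML without sending repeated requests to the site.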

Notes:

  • Change url to the news website you want to scrape.

  • Adjust the tag and class in find_all based on the target site’s HTML structure.

  • Send a browser-like User-Agent header so the site is less likely to reject the request as coming from a bot.

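  • This is a simple scraper: it fetches the page once per run and cannot see content that the site renders with JavaScript. For more robust real-time data, consider an official API or a headless browser.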
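
One way to get closer to "real-time" behavior with a simple scraper is to poll the page at an interval and report only headlines that have not been seen yet. The following is a minimal sketch of that idea, not part of the original scraper: `poll_new_headlines` and its parameters are hypothetical names, and the fetch function is passed in so the loop can be tried offline with a stand-in.

```python
import time


def poll_new_headlines(fetch, interval_seconds=60, max_polls=None, sleep=time.sleep):
    """Yield each headline the first time it appears.

    fetch is any zero-argument callable returning a list of headline strings.
    max_polls=None polls forever; sleep is injectable so tests can skip waiting.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for headline in fetch():
            if headline not in seen:
                seen.add(headline)
                yield headline
        polls += 1
        # Wait only if another poll will follow.
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without network access.
    batches = iter([["A", "B"], ["B", "C"]])
    for headline in poll_new_headlines(lambda: next(batches), interval_seconds=0, max_polls=2):
        print(headline)
```

To use it with the scraper above, pass something like `lambda: get_real_time_headlines(url)` as `fetch`, and pick an interval long enough not to burden the site.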

