The Palos Publishing Company

Build an article summarizer bot

Here’s a concise implementation of an Article Summarizer Bot in Python. It downloads a page from a URL, extracts the paragraph text with BeautifulSoup, and summarizes that text with a pretrained model from Hugging Face Transformers.

```python
import requests
from bs4 import BeautifulSoup
from transformers import pipeline


class ArticleSummarizerBot:
    def __init__(self):
        # Loads the model once; the weights are downloaded on first use
        self.summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    def fetch_article(self, url):
        headers = {"User-Agent": "Mozilla/5.0"}
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # treat HTTP 4xx/5xx responses as failures
        soup = BeautifulSoup(response.text, "html.parser")
        # Join the text of every <p> element on the page
        paragraphs = soup.find_all("p")
        return " ".join(para.get_text() for para in paragraphs).strip()

    def summarize(self, text, max_words=1300):
        # Coarse pre-trim by word count to keep the input small
        words = text.split()
        if len(words) > max_words:
            text = " ".join(words[:max_words])
        # Words are not tokens, so truncation=True also lets the tokenizer
        # cut the input at the model's own token limit
        summary = self.summarizer(
            text, max_length=200, min_length=60, do_sample=False, truncation=True
        )
        return summary[0]["summary_text"]

    def run(self, url):
        try:
            article = self.fetch_article(url)
        except requests.RequestException as e:
            return f"Error fetching article: {e}"
        if not article:
            return "Error fetching article: no paragraph text found"
        try:
            return self.summarize(article)
        except Exception as e:
            return f"Error summarizing: {e}"


# Example usage
if __name__ == "__main__":
    bot = ArticleSummarizerBot()
    url = "https://example.com/sample-news-article"
    summary = bot.run(url)
    print("\nSummary:\n", summary)
```

Key Features:

  • Article Extraction: Pulls article content using requests and BeautifulSoup.

  • Summarization: Uses facebook/bart-large-cnn from Hugging Face Transformers.

  • Handles Long Articles: Keeps only the first 1,300 words (the max_words default) before summarizing, so anything past that point is not reflected in the summary.
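
Trimming is the simplest way to deal with long articles, but it discards everything after the cutoff. One alternative, sketched below, is to split the text into word-based chunks, summarize each chunk, and join the partial summaries. The helper names `chunk_words` and `summarize_long` and the 500-word chunk size are illustrative assumptions, not part of the bot above; `summarize_fn` stands for any function that maps a string to a summary string.

```python
def chunk_words(text, max_words=500):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long(text, summarize_fn, max_words=500):
    """Summarize each chunk with summarize_fn and join the partial summaries.

    summarize_fn is a hypothetical callable (str -> str); the chunk size is
    an assumed value, not one tuned for any particular model.
    """
    chunks = chunk_words(text, max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)
```

With the bot above, this could be called as summarize_long(article, bot.summarize). A short final chunk may summarize poorly given the bot's min_length setting, so the chunk size and length limits would likely need adjusting for real articles.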
