The Palos Publishing Company

Scrape book titles from Goodreads shelves

To scrape book titles from Goodreads shelves, you can use Python with the requests and BeautifulSoup libraries, and optionally pandas for organizing the data. Goodreads may block scraping activity, so always comply with its robots.txt and terms of use.
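One way to act on the robots.txt advice is to check a URL against the site's rules before requesting it. The following is a minimal sketch using the standard library's urllib.robotparser. The helper name is_allowed and the sample rules are hypothetical and are not Goodreads' actual robots.txt; in practice you would load the real file with RobotFileParser.set_url() and read().

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, user_agent="*"):
    """Return True if the given robots.txt rules permit fetching url.

    robots_lines is an iterable of lines from a robots.txt file.
    (Hypothetical helper for illustration, not part of the original example.)
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only, NOT the real Goodreads robots.txt.
sample_rules = [
    "User-agent: *",
    "Disallow: /private/",
]

shelf_ok = is_allowed(sample_rules, "https://example.com/shelf/show/fantasy")
private_ok = is_allowed(sample_rules, "https://example.com/private/data")
```

If the check returns False for a shelf URL, the safest choice is to skip that URL rather than scrape it.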

Here’s a basic example that scrapes book titles from a public Goodreads shelf:

```python
import requests
from bs4 import BeautifulSoup


def scrape_goodreads_shelf(shelf_url, max_pages=1):
    """Collect book titles from the first max_pages pages of a public shelf."""
    headers = {"User-Agent": "Mozilla/5.0"}
    book_titles = []

    for page in range(1, max_pages + 1):
        url = f"{shelf_url}?page={page}"
        # A timeout keeps the script from hanging on an unresponsive server.
        response = requests.get(url, headers=headers, timeout=10)

        if response.status_code != 200:
            print(f"Failed to retrieve page {page}")
            continue

        soup = BeautifulSoup(response.text, 'html.parser')
        # Each title sits in a <span> inside an <a class="bookTitle"> link.
        books = soup.select('a.bookTitle span')

        for book in books:
            title = book.get_text(strip=True)
            if title:
                book_titles.append(title)

    return book_titles


# Example usage:
shelf_url = 'https://www.goodreads.com/shelf/show/fantasy'  # Change to any public shelf
titles = scrape_goodreads_shelf(shelf_url, max_pages=3)

for i, title in enumerate(titles, 1):
    print(f"{i}. {title}")
```
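Once the titles are collected, they can be written to a file for later use. pandas is one option for this. The sketch below swaps in the standard library's csv module instead, so it runs without extra dependencies. The function name titles_to_csv, the column names, and the sample titles are assumptions made for illustration.

```python
import csv
import io


def titles_to_csv(titles, out):
    """Write titles to a file-like object as CSV rows of (position, title).

    Returns the number of title rows written.
    (Hypothetical helper for illustration.)
    """
    writer = csv.writer(out)
    writer.writerow(["position", "title"])
    count = 0
    for count, title in enumerate(titles, 1):
        writer.writerow([count, title])
    return count


# Demonstration with an in-memory buffer and made-up sample titles.
buffer = io.StringIO()
rows_written = titles_to_csv(["Dune", "Emma"], buffer)
csv_lines = buffer.getvalue().splitlines()
```

To write to disk instead, pass a file opened with open("titles.csv", "w", newline="", encoding="utf-8") in place of the buffer.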

Notes:

  • Replace the shelf_url with the desired Goodreads shelf URL.

  • Adjust max_pages to scrape more pages.

  • Avoid too many requests in a short time to prevent IP blocking. Use time.sleep() between requests if needed.
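The pause between requests mentioned in the last note can be sketched as a small loop wrapper. This is a minimal illustration, not a definitive implementation: fetch_pages_politely, its parameters, and the 2-second default delay are assumptions, and the fetch argument stands in for whatever function retrieves a single page.

```python
import time


def fetch_pages_politely(fetch, page_numbers, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(page) for each page, pausing between consecutive requests.

    fetch is any callable that takes a page number; sleep is injectable so
    the pause can be replaced in tests. (Hypothetical helper for illustration.)
    """
    results = []
    for index, page in enumerate(page_numbers):
        if index > 0:
            # Pause before every request except the first.
            sleep(delay_seconds)
        results.append(fetch(page))
    return results


# Demonstration with a stand-in fetch function and a recorded (not real) sleep.
recorded_pauses = []
pages = fetch_pages_politely(
    lambda page: f"page-{page}",
    [1, 2, 3],
    delay_seconds=0.5,
    sleep=recorded_pauses.append,
)
```

A fixed delay is the simplest choice; a longer or randomized delay may be more appropriate depending on how the site responds.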

