The Palos Publishing Company

Scrape Wikipedia tables

To scrape Wikipedia tables, you can use Python with the pandas library. Most Wikipedia tables are ordinary HTML tables, and pandas.read_html() parses every table it finds on a page into a DataFrame. Below is a basic script to do this:

python
import pandas as pd

# Wikipedia URL with a table
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)'

# Read all tables from the page
tables = pd.read_html(url)

# Inspect the number of tables found
print(f"Found {len(tables)} tables.")

# Display the first table (or another based on index)
print(tables[0])  # Change index to select another table
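Wikimedia's User-Agent policy asks automated clients to identify themselves, and requests with a generic client string may be refused. If pd.read_html(url) fails with an HTTP error, one option is to download the page yourself and hand the HTML to pandas. The following is a minimal sketch of that approach; fetch_html, tables_from_html, and the User-Agent string are illustrative names chosen here, not part of pandas.

```python
from io import StringIO
from urllib.request import Request, urlopen

import pandas as pd

# Hypothetical identifier: replace with your own project name and contact.
USER_AGENT = "wiki-table-example/0.1 (contact: you@example.com)"


def fetch_html(url: str, user_agent: str = USER_AGENT) -> str:
    """Download a page, sending an explicit User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def tables_from_html(html: str) -> list[pd.DataFrame]:
    """Parse every <table> in an HTML string into a DataFrame."""
    # Wrapping the string in StringIO tells pandas this is markup,
    # not a file path or URL.
    return pd.read_html(StringIO(html))


# Usage (requires network access):
# html = fetch_html("https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)")
# tables = tables_from_html(html)
# print(f"Found {len(tables)} tables.")
```

Separating the download from the parsing also makes it easy to cache the HTML on disk and re-parse it without hitting the site again.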

Notes:

  • pandas.read_html() parses with lxml by default and falls back to BeautifulSoup4 with html5lib if that fails, so install them if needed:

    bash
    pip install pandas lxml beautifulsoup4 html5lib
  • Most pages have multiple tables; use indexing (tables[0], tables[1], etc.) to select the desired one, or pass match="some text" to read_html() to keep only the tables that contain that text.

  • Once you have the table, you can save it or manipulate it with pandas:

python
# Save to CSV
tables[0].to_csv("gdp_table.csv", index=False)
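Two practical issues tend to come up with the notes above: table positions shift when a page is edited, and cells often carry footnote markers such as [1] that keep a column from being numeric. The sketch below shows one way to handle both, using the match parameter of read_html() and a small cleanup step. The sample HTML, the column names, and the helpers select_table, strip_footnotes, and to_number are all made up for illustration.

```python
from io import StringIO

import pandas as pd

# Made-up sample standing in for a downloaded Wikipedia page.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Alpha</td><td>1,200[a]</td></tr>
</table>
<table class="wikitable">
  <tr><th>Country</th><th>GDP (USD million)</th></tr>
  <tr><td>Exampleland[1]</td><td>1,234,567</td></tr>
  <tr><td>Samplestan</td><td>89,012[2]</td></tr>
</table>
"""


def select_table(html: str, text: str) -> pd.DataFrame:
    """Return the first table containing text that matches `text`."""
    # read_html raises ValueError when no table matches.
    return pd.read_html(StringIO(html), match=text)[0]


def strip_footnotes(series: pd.Series) -> pd.Series:
    """Remove bracketed markers such as [1] or [a] from each value."""
    return series.astype(str).str.replace(r"\[[^\]]*\]", "", regex=True).str.strip()


def to_number(series: pd.Series) -> pd.Series:
    """Drop footnotes and thousands separators, then convert to numbers."""
    cleaned = strip_footnotes(series).str.replace(",", "", regex=False)
    # Values that still are not numeric become NaN instead of raising.
    return pd.to_numeric(cleaned, errors="coerce")


gdp = select_table(SAMPLE_HTML, "GDP")
gdp["Country"] = strip_footnotes(gdp["Country"])
gdp["GDP (USD million)"] = to_number(gdp["GDP (USD million)"])
print(gdp)
```

Selecting by text is usually more stable than a fixed index, though match is treated as a regular expression, so special characters in the search text need escaping.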

If you want help scraping a specific Wikipedia table, just share the URL or describe the table.
