Categories We Write About

Merge data from multiple Excel files

To merge data from multiple Excel files, you can use different methods depending on the tools you are comfortable with. Below are methods for merging Excel files using Microsoft Excel, Python (pandas), and Power Query.

1. Using Microsoft Excel

If you have a few files and don’t want to write code, Excel itself can be a good tool to merge the data:

Method 1: Copy and Paste

  • Open all the Excel files you want to merge.

  • Copy the data from one file and paste it into the other file where you want to consolidate everything.

  • If the files have the same structure, paste them below the existing data.

Method 2: Power Query

Power Query allows you to load data from multiple files and merge them automatically. Here’s how you can do that:

  1. Open Excel and go to the Data tab.

  2. Click on Get Data, then choose From FileFrom Folder.

  3. Browse to the folder where the files are located and select OK.

  4. A dialog will appear showing all the files in the folder. Click on Combine and then Combine & Load.

  5. Power Query will automatically combine the files. You can then transform or clean the data as needed before loading it into a new sheet.

2. Using Python (pandas)

For more control and automation, especially when working with large datasets or needing to repeat the process, Python with the pandas library is an efficient option.

Steps:

  1. Install pandas (if you don’t have it installed yet):

    bash
    pip install pandas
  2. Use pandas to merge Excel files:

    python
    import pandas as pd import os # Specify the folder where your Excel files are located folder_path = 'path_to_your_folder' # List all Excel files in the folder excel_files = [f for f in os.listdir(folder_path) if f.endswith('.xlsx')] # Initialize an empty list to store dataframes df_list = [] # Loop through the files and read them into pandas DataFrames for file in excel_files: file_path = os.path.join(folder_path, file) df = pd.read_excel(file_path) df_list.append(df) # Concatenate all the DataFrames into one merged_df = pd.concat(df_list, ignore_index=True) # Save the merged data into a new Excel file merged_df.to_excel('merged_output.xlsx', index=False)

    This script will:

    • Search for all Excel files (.xlsx) in the specified folder.

    • Read each file into a pandas DataFrame.

    • Merge them into a single DataFrame.

    • Save the merged data into a new Excel file called merged_output.xlsx.

3. Using Power BI (for more advanced users)

Power BI can be another tool for merging Excel files, especially if you need to create reports or visualize the data afterward.

  1. Open Power BI Desktop.

  2. Click on Get Data > Folder.

  3. Choose the folder where the Excel files are stored.

  4. Power BI will load a list of all files in the folder. You can then select Combine to merge them.

  5. Once merged, you can transform the data using Power Query and load it into the report view.

4. Using Google Sheets

If you prefer cloud-based solutions:

  1. Upload all your Excel files to Google Drive and convert them into Google Sheets.

  2. Use the IMPORTRANGE function to pull data from different sheets.

  3. Combine data from different sheets into a master sheet by referencing all other sheets with IMPORTRANGE or by using QUERY for more advanced merging.


Choose the method based on the volume of data, your familiarity with the tools, and the complexity of the merging process. For automation and larger datasets, Python with pandas is usually the best choice. If you prefer a GUI, Power Query in Excel is very efficient for merging multiple files.

Share This Page:

Enter your email below to join The Palos Publishing Company Email List

We respect your email privacy

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories We Write About