The Palos Publishing Company

Clean CSVs with a Python GUI

A Python GUI for cleaning CSV files is a practical tool for users who want to preprocess their data quickly without writing code. This guide walks through the design and sample code for a simple CSV cleaner with a graphical user interface built on Tkinter and pandas.


Key Features for the CSV Cleaner GUI:

  • Load CSV files from disk

  • Display basic info (number of rows, columns)

  • Remove rows with missing values

  • Remove duplicate rows

  • Trim whitespace from string columns

  • Save cleaned CSV to a new file


Implementation Details

1. Required Libraries

  • tkinter for GUI

  • pandas for CSV manipulation

  • tkinter.filedialog for file open/save dialogs

  • tkinter.messagebox for warning, error, and confirmation dialogs

python
import tkinter as tk
from tkinter import filedialog, messagebox

import pandas as pd

2. Core Functionality

  • Load CSV file into a pandas DataFrame

  • Apply cleaning operations based on user selections

  • Save the cleaned DataFrame back to CSV
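The three steps above can be sketched without any GUI code, which also makes the cleaning logic easy to test on its own. This is a minimal sketch, not the app's actual implementation: the function name `clean_dataframe`, its keyword arguments, and the in-memory sample data are assumptions made for illustration. It trims before dropping rows so that values differing only by surrounding whitespace count as duplicates.

```python
import io

import pandas as pd


def _strip_if_text(value):
    """Strip whitespace from strings; leave numbers, NaN, etc. unchanged."""
    return value.strip() if isinstance(value, str) else value


def clean_dataframe(df, drop_missing=False, drop_duplicates=False, trim=False):
    """Return a cleaned copy of df; the input DataFrame is left untouched.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    cleaned = df.copy()
    if trim:
        # Only touch text-like columns; numeric columns are skipped.
        for col in cleaned.columns:
            column = cleaned[col]
            if (pd.api.types.is_object_dtype(column)
                    or pd.api.types.is_string_dtype(column)):
                cleaned[col] = column.map(_strip_if_text)
    if drop_missing:
        cleaned = cleaned.dropna()
    if drop_duplicates:
        cleaned = cleaned.drop_duplicates()
    return cleaned


# Load: a small in-memory CSV stands in for a file chosen in the dialog.
raw = io.StringIO("name,city\nAda , London\nAda,London\nBob,\n")
df = pd.read_csv(raw)

# Clean according to the selected options.
result = clean_dataframe(df, drop_missing=True, drop_duplicates=True, trim=True)

# Save: with no path, to_csv returns the CSV text instead of writing a file.
csv_text = result.to_csv(index=False)
```

Keeping the cleaning in a plain function like this means the GUI callbacks only need to read the checkbox values and pass them along.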


Complete Example Code:

python
import os
import tkinter as tk
from tkinter import filedialog, messagebox

import pandas as pd


class CSVCleanerApp:
    def __init__(self, root):
        self.root = root
        self.root.title("CSV Cleaner")
        self.df = None

        # Load button
        self.load_btn = tk.Button(root, text="Load CSV", command=self.load_csv)
        self.load_btn.pack(pady=5)

        # Info label
        self.info_label = tk.Label(root, text="No file loaded")
        self.info_label.pack(pady=5)

        # Cleaning options (each variable is 1 when its checkbox is ticked)
        self.dropna_var = tk.IntVar()
        self.duplicates_var = tk.IntVar()
        self.trim_var = tk.IntVar()

        tk.Checkbutton(root, text="Remove rows with missing values",
                       variable=self.dropna_var).pack(anchor='w')
        tk.Checkbutton(root, text="Remove duplicate rows",
                       variable=self.duplicates_var).pack(anchor='w')
        tk.Checkbutton(root, text="Trim whitespace in string columns",
                       variable=self.trim_var).pack(anchor='w')

        # Clean button (disabled until a file is loaded)
        self.clean_btn = tk.Button(root, text="Clean CSV",
                                   command=self.clean_csv, state='disabled')
        self.clean_btn.pack(pady=10)

        # Save button (disabled until the data has been cleaned)
        self.save_btn = tk.Button(root, text="Save Cleaned CSV",
                                  command=self.save_csv, state='disabled')
        self.save_btn.pack(pady=5)

    def load_csv(self):
        file_path = filedialog.askopenfilename(
            filetypes=[("CSV files", "*.csv"), ("All files", "*.*")]
        )
        if file_path:
            try:
                self.df = pd.read_csv(file_path)
                self.info_label.config(
                    text=f"Loaded '{os.path.basename(file_path)}': "
                         f"{self.df.shape[0]} rows, {self.df.shape[1]} columns"
                )
                self.clean_btn.config(state='normal')
                self.save_btn.config(state='disabled')
            except Exception as e:
                messagebox.showerror("Error", f"Failed to load CSV:\n{e}")

    def clean_csv(self):
        if self.df is None:
            messagebox.showwarning("No file", "Please load a CSV file first.")
            return

        df_clean = self.df.copy()

        # Trim first so rows that differ only by surrounding whitespace
        # are recognized as duplicates below.
        if self.trim_var.get():
            for col in df_clean.select_dtypes(include=['object']).columns:
                # Strip strings only; non-string values pass through unchanged.
                df_clean[col] = df_clean[col].map(
                    lambda v: v.strip() if isinstance(v, str) else v
                )

        if self.dropna_var.get():
            df_clean = df_clean.dropna()

        if self.duplicates_var.get():
            df_clean = df_clean.drop_duplicates()

        rows_before = self.df.shape[0]
        rows_after = df_clean.shape[0]
        dropped = rows_before - rows_after

        self.df = df_clean
        self.info_label.config(
            text=f"Cleaned data: {rows_after} rows, {df_clean.shape[1]} columns. "
                 f"Dropped {dropped} rows."
        )
        self.save_btn.config(state='normal')

    def save_csv(self):
        if self.df is None:
            messagebox.showwarning("No data", "No data to save.")
            return

        save_path = filedialog.asksaveasfilename(
            defaultextension=".csv",
            filetypes=[("CSV files", "*.csv"), ("All files", "*.*")]
        )
        if save_path:
            try:
                self.df.to_csv(save_path, index=False)
                messagebox.showinfo("Saved", f"Cleaned CSV saved to:\n{save_path}")
            except Exception as e:
                messagebox.showerror("Error", f"Failed to save CSV:\n{e}")


if __name__ == "__main__":
    root = tk.Tk()
    app = CSVCleanerApp(root)
    root.mainloop()

How It Works

  • Click Load CSV to select a file.

  • The app shows the number of rows and columns.

  • Select any cleaning options you want:

    • Remove rows with missing data

    • Remove duplicates

    • Trim whitespace from text columns

  • Click Clean CSV to apply cleaning.

  • Click Save Cleaned CSV to export the cleaned data.
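The app reports a single "dropped rows" total after cleaning. To see how much each option contributes, the same pandas calls can be run step by step outside the GUI. The sketch below is illustrative only: the function `count_dropped_rows`, the report keys, and the sample data are assumptions, and it covers just the two options that remove rows.

```python
import io

import pandas as pd


def count_dropped_rows(df):
    """Report how many rows each row-removing option drops, applied in order.

    Hypothetical helper for illustration; not part of the app above.
    """
    report = {"start": len(df)}

    # Step 1: rows containing at least one missing value.
    without_missing = df.dropna()
    report["missing"] = len(df) - len(without_missing)

    # Step 2: exact duplicates among the rows that remain.
    deduplicated = without_missing.drop_duplicates()
    report["duplicates"] = len(without_missing) - len(deduplicated)

    report["remaining"] = len(deduplicated)
    return report


# Four rows: one exact duplicate and one row with a missing score.
sample = pd.read_csv(io.StringIO("id,score\n1,10\n1,10\n2,\n3,30\n"))
report = count_dropped_rows(sample)
```

A breakdown like this could be shown in the info label in place of the single total.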


Additional Enhancements You Can Add:

  • Preview data inside the GUI

  • More cleaning options like filling missing values or changing data types

  • Support for large files with progress bars

  • Column-specific cleaning rules
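As one possible starting point for the column-specific rules, missing-value filling, and type conversion mentioned above, a dictionary can map column names to small cleaning functions. This is a sketch under stated assumptions: the column names, the rules themselves, `COLUMN_RULES`, and `apply_column_rules` are all hypothetical, and filling missing prices with 0.0 is only a placeholder policy.

```python
import io

import pandas as pd

# Hypothetical rule table: column name -> function applied to that column.
COLUMN_RULES = {
    # Convert to numbers; unparseable entries become NaN, then 0.0.
    "price": lambda s: pd.to_numeric(s, errors="coerce").fillna(0.0),
    # Normalize country codes to trimmed upper case.
    "country": lambda s: s.str.strip().str.upper(),
    # Parse dates; unparseable entries become NaT.
    "signup": lambda s: pd.to_datetime(s, errors="coerce"),
}


def apply_column_rules(df, rules):
    """Apply each rule to its column; rules for absent columns are skipped."""
    out = df.copy()
    for column, rule in rules.items():
        if column in out.columns:
            out[column] = rule(out[column])
    return out


raw = io.StringIO(
    "price,country,signup\n"
    "9.5, us ,2024-01-05\n"
    "unknown,de,not a date\n"
)
cleaned = apply_column_rules(pd.read_csv(raw), COLUMN_RULES)
```

In the GUI, the rule table could be built from per-column dropdowns rather than hard-coded.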

This lightweight app helps non-technical users quickly clean CSV files via an intuitive interface. The core cleaning functions use pandas for robust data handling.
