The Palos Publishing Company


Convert plain text logs into dataframes

Converting plain text logs into dataframes is a common task in data processing and analysis. The exact approach depends on the structure of your logs, but typically involves parsing the logs into columns and then loading them into a dataframe (often using Python’s pandas library).

Here’s a general step-by-step guide and example to convert plain text logs into a pandas dataframe:


1. Understand the Log Format

Logs might look like this (example):

```text
2025-05-18 10:15:23 INFO User login user_id=1234 ip=192.168.0.1
2025-05-18 10:16:05 ERROR Failed to load resource resource_id=5678
2025-05-18 10:18:45 WARN Disk space low disk_free=500MB
```

Each log line has:

  • Timestamp

  • Log level (INFO, ERROR, WARN)

  • Message

  • Optional key=value pairs


2. Parse Logs with Python

You can use regex or simple string operations to extract fields.

```python
import re

import pandas as pd

log_lines = [
    "2025-05-18 10:15:23 INFO User login user_id=1234 ip=192.168.0.1",
    "2025-05-18 10:16:05 ERROR Failed to load resource resource_id=5678",
    "2025-05-18 10:18:45 WARN Disk space low disk_free=500MB",
]

# Define a regex pattern to capture timestamp, level, message, and key=value pairs.
# \S+ matches a run of non-whitespace characters.
pattern = re.compile(
    r'^(?P<timestamp>\S+ \S+) (?P<level>\S+) (?P<message>.*?)(?P<kv_pairs>(?: \S+=\S+)*)$'
)

data = []
for line in log_lines:
    match = pattern.match(line)
    if match:
        d = match.groupdict()
        # Parse key-value pairs into a dict
        kv_str = d.pop('kv_pairs').strip()
        kv_dict = {}
        if kv_str:
            pairs = kv_str.split(' ')
            for pair in pairs:
                if '=' in pair:
                    k, v = pair.split('=', 1)
                    kv_dict[k] = v
        d.update(kv_dict)
        data.append(d)

# Keys missing from a line become NaN in that row
df = pd.DataFrame(data)
print(df)
```
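The same sample lines can also be handled with plain string operations instead of a regex. The sketch below is one possible way to do it, not a drop-in replacement: `parse_line` is a hypothetical helper name, it assumes every line has at least a date, a time, a level, and some remaining text, and it treats any token containing `=` as a field, even if that token appears in the middle of the message.

```python
import pandas as pd

log_lines = [
    "2025-05-18 10:15:23 INFO User login user_id=1234 ip=192.168.0.1",
    "2025-05-18 10:16:05 ERROR Failed to load resource resource_id=5678",
    "2025-05-18 10:18:45 WARN Disk space low disk_free=500MB",
]


def parse_line(line):
    """Split one log line into a flat dict of fields (hypothetical helper)."""
    # The first three space-separated tokens are date, time, and level.
    # Lines with fewer than four tokens raise ValueError here.
    date, time, level, rest = line.split(" ", 3)
    message_words = []
    fields = {}
    for token in rest.split():
        if "=" in token:
            # Split on the first '=' only, so values may contain '='.
            key, value = token.split("=", 1)
            fields[key] = value
        else:
            message_words.append(token)
    return {
        "timestamp": f"{date} {time}",
        "level": level,
        "message": " ".join(message_words),
        **fields,
    }


split_df = pd.DataFrame([parse_line(line) for line in log_lines])
print(split_df)
```

Splitting is easier to read and debug than a regex, but it is less strict about where key=value pairs may appear, so the regex version is likely the safer choice when messages themselves can contain `=`.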

3. Resulting DataFrame

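| timestamp | level | message | user_id | ip | resource_id | disk_free |
|---|---|---|---|---|---|---|
| 2025-05-18 10:15:23 | INFO | User login | 1234 | 192.168.0.1 | NaN | NaN |
| 2025-05-18 10:16:05 | ERROR | Failed to load resource | NaN | NaN | 5678 | NaN |
| 2025-05-18 10:18:45 | WARN | Disk space low | NaN | NaN | NaN | 500MB |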

4. Adjust to Your Logs

  • Modify the regex pattern based on your exact log format.

  • Extract other fields as needed.

  • Convert columns (e.g., timestamp) to datetime types using pd.to_datetime().
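As a rough illustration of the type-conversion step, the sketch below builds a small frame with the same string columns the parser produces and converts them. The explicit `format` string and the choice of the nullable `Int64` dtype are assumptions based on the sample logs; adjust both to your data.

```python
import pandas as pd

# Small frame with the same string-typed columns the parsing step produces.
df = pd.DataFrame(
    {
        "timestamp": [
            "2025-05-18 10:15:23",
            "2025-05-18 10:16:05",
            "2025-05-18 10:18:45",
        ],
        "level": ["INFO", "ERROR", "WARN"],
        "user_id": ["1234", None, None],
    }
)

# Parse timestamps; the format string is an assumption matching the sample logs.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# Convert IDs to numbers, then to a nullable integer dtype so that
# rows without a user_id stay missing instead of forcing floats.
df["user_id"] = pd.to_numeric(df["user_id"]).astype("Int64")

# Log levels repeat a lot, so a categorical column is a reasonable fit.
df["level"] = df["level"].astype("category")

print(df.dtypes)
```

With a datetime column in place, time-based filtering and resampling become available, which is usually the main reason to convert.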


