
How to Study the Effects of Global Supply Chain Disruptions Using EDA

Exploratory Data Analysis (EDA) offers a systematic way to study the effects of global supply chain disruptions: you examine the data for patterns, trends, and anomalies before committing to formal models. Here is a structured approach to applying EDA to this type of study:

1. Data Collection

Before any analysis can begin, you need to gather relevant data. For studying supply chain disruptions, you would typically need data from various sources such as:

  • Global trade data: This could include trade volumes, shipping times, port congestion, customs delays, and transportation routes.

  • Supply chain performance metrics: Data such as on-time delivery rates, order fulfillment times, and inventory turnover.

  • Economic indicators: Global GDP, manufacturing indices, inflation rates, and trade tariffs can all impact supply chains.

  • Logistics data: Shipping routes, delivery delays, and freight rates are crucial for understanding disruptions in the flow of goods.

  • Supplier and manufacturer data: Performance data from suppliers, manufacturers, and distributors can provide insights into disruptions at various levels of the supply chain.

Sources for the data:

  • Government trade databases (e.g., UN Comtrade)

  • Industry reports and surveys (e.g., World Trade Organization, World Bank)

  • Shipping and logistics companies’ databases

  • Company-specific ERP or supply chain management systems

  • Financial market data and indices

2. Data Preprocessing

Once data is collected, preprocessing is essential to ensure its quality and consistency. This stage involves:

  • Data cleaning: Remove duplicate, erroneous, or irrelevant records that could distort the analysis.

  • Data transformation: Convert data into a format that is suitable for analysis, such as transforming timestamps into more meaningful formats (e.g., days, months, years), normalizing values, or aggregating data at different time intervals (daily, monthly, quarterly).

  • Handling missing values: Use imputation techniques if there are missing values, or remove the corresponding records if appropriate.

  • Outlier detection: Identify and handle outliers which could indicate errors or exceptional cases.
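The preprocessing steps above can be sketched in pandas. This is a minimal illustration, not a prescribed pipeline: the shipment records and the column names (shipment_id, ship_date, transit_days) are made up for the example, and median imputation is just one simple choice among many.

```python
import numpy as np
import pandas as pd

# Hypothetical shipment records; all values and column names are assumptions.
raw = pd.DataFrame({
    "shipment_id": [1, 2, 2, 3, 4, 5],
    "ship_date": ["2021-01-05", "2021-01-20", "2021-01-20",
                  "2021-02-11", "2021-02-25", "2021-03-02"],
    "transit_days": [12.0, 15.0, 15.0, np.nan, 14.0, 40.0],
})

# Data cleaning: drop exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Data transformation: parse timestamps so records can be grouped by period.
clean["ship_date"] = pd.to_datetime(clean["ship_date"])

# Handling missing values: impute with the median of the observed values.
median_transit = clean["transit_days"].median()
clean["transit_days"] = clean["transit_days"].fillna(median_transit)

# Aggregation: average transit time per calendar month.
monthly = clean.set_index("ship_date")["transit_days"].resample("MS").mean()
print(monthly)
```

Whether to impute or drop a missing record depends on why it is missing; if delays themselves cause gaps in reporting, imputing a typical value can hide the disruption you are trying to study.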

3. Exploratory Data Analysis (EDA)

EDA is about examining the data through visual and statistical methods to gain insights and develop hypotheses. Here are some key steps to perform during EDA:

a. Univariate Analysis

Start by analyzing individual variables to understand their distribution, central tendency, and variability.

  • Distribution Analysis: Plot histograms or kernel density estimates (KDE) for each variable (e.g., shipping times, lead times, sales volumes) to understand the shape of the distribution.

  • Descriptive Statistics: Calculate measures such as mean, median, mode, standard deviation, and percentiles to get a sense of central tendency and spread.
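As a rough sketch of univariate analysis, the snippet below summarizes a single variable. The lead times are synthetic (drawn from a gamma distribution chosen only because it is right-skewed), and the variable name lead_time_days is an assumption for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic, right-skewed lead times in days (illustrative data only).
lead_times = pd.Series(rng.gamma(shape=4.0, scale=3.0, size=500),
                       name="lead_time_days")

# Descriptive statistics: count, mean, std, min/max, and chosen percentiles.
summary = lead_times.describe(percentiles=[0.05, 0.25, 0.5, 0.75, 0.95])

# Skewness hints at whether the mean or the median is the better summary.
skewness = lead_times.skew()

# Histogram bin counts: a text-only stand-in for a plotted histogram.
counts, bin_edges = np.histogram(lead_times, bins=10)

print(summary)
print(f"skewness: {skewness:.2f}")
print(counts)
```

For skewed variables such as lead times, the median and upper percentiles are usually more informative than the mean, because a few extreme delays pull the mean upward.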

b. Bivariate and Multivariate Analysis

After understanding individual variables, it’s important to explore relationships between different variables.

  • Correlation analysis: Use heatmaps or scatter plots to examine correlations between key variables, such as global trade volume and shipping delays. Look for strong positive or negative correlations, keeping in mind that a correlation on its own does not establish a cause.

  • Pair plots: These allow you to visually inspect relationships between multiple variables at once, which can help identify trends or anomalies.

  • Cross-tabulation: This method is useful for checking relationships between categorical variables, such as supplier location and whether a delivery was delayed.

  • Time series analysis: Since supply chain disruptions are often time-dependent, plot time series graphs to study patterns such as seasonal fluctuations, cyclical behavior, or trends over time (e.g., shipping delays before and after a natural disaster).
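Two of the techniques above, correlation analysis and cross-tabulation, can be sketched with pandas as follows. The data is synthetic, the column names are hypothetical, and the 7-day cutoff for a "late" shipment is an arbitrary assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200

# Synthetic data in which delay rises with congestion, plus noise.
congestion = rng.uniform(0, 10, n)
delay = 2.0 + 0.8 * congestion + rng.normal(0, 1.0, n)

df = pd.DataFrame({
    "port_congestion_index": congestion,
    "delay_days": delay,
    "supplier_region": rng.choice(["Asia", "Europe", "Americas"], n),
})

# Flag late shipments using an assumed 7-day threshold.
df["late"] = df["delay_days"] > 7

# Correlation analysis: Pearson correlation between the numeric variables.
corr = df[["port_congestion_index", "delay_days"]].corr(method="pearson")

# Cross-tabulation: counts of late vs. on-time shipments per region.
late_by_region = pd.crosstab(df["supplier_region"], df["late"])

print(corr)
print(late_by_region)
```

The correlation matrix returned by corr() is also what a heatmap would display; the cross-tabulation is the input a chi-square test of independence would use.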

c. Data Visualization

Effective visualization is a key component of EDA. Use the following techniques to reveal important patterns:

  • Box plots: These summarize the median and spread of a variable (e.g., shipping delays) and make outliers easy to spot.

  • Bar charts and pie charts: Good for visualizing categorical variables, such as the frequency of disruptions in different regions or product categories.

  • Heatmaps: Use them to visualize the correlation matrix and understand which variables are highly correlated.

  • Geospatial analysis: If data includes geographical locations (e.g., ports, factories), use maps to visualize disruptions and identify geographically impacted areas.
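As one example of these techniques, here is a minimal box plot sketch using matplotlib. The regional delay figures are synthetic, and the output file name delay_boxplot.png is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)

# Synthetic delays (days) for three hypothetical regions.
delays_by_region = {
    "Asia": rng.normal(9, 3, 100),
    "Europe": rng.normal(6, 2, 100),
    "Americas": rng.normal(7, 2.5, 100),
}

fig, ax = plt.subplots(figsize=(6, 4))

# One box per region; points beyond the whiskers are drawn as outliers.
ax.boxplot(list(delays_by_region.values()))
ax.set_xticks(range(1, len(delays_by_region) + 1))
ax.set_xticklabels(list(delays_by_region.keys()))
ax.set_ylabel("Delay (days)")
ax.set_title("Shipping delay by region (synthetic data)")

fig.savefig("delay_boxplot.png", dpi=100)
plt.close(fig)
```

Placing the regions side by side on a shared axis makes differences in median delay and variability visible at a glance, which is harder to see in separate histograms.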

d. Clustering Analysis

To better understand the underlying structure of the supply chain disruptions, you can perform clustering using methods such as k-means clustering or hierarchical clustering. This can help identify patterns in the data that are not immediately obvious, such as clusters of countries with similar disruption characteristics or regions experiencing similar types of logistical problems.
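A k-means sketch with scikit-learn might look like the following. The two features (average delay and share of late shipments per country) and the two synthetic groups are assumptions made for illustration; with real data the number of clusters is not known in advance and needs to be chosen, for example by comparing several values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)

# Synthetic per-country features: (average delay in days, share of late shipments).
low = np.column_stack([rng.normal(3, 0.5, 20), rng.normal(0.10, 0.02, 20)])
high = np.column_stack([rng.normal(12, 1.0, 20), rng.normal(0.45, 0.05, 20)])
features = np.vstack([low, high])

# Standardize so that both features contribute on a comparable scale.
scaled = StandardScaler().fit_transform(features)

# Fit k-means with two clusters (an assumption that matches the synthetic data).
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

print(labels)
```

Scaling matters here because k-means relies on Euclidean distance: without it, the delay column (measured in days) would dominate the share column (measured between 0 and 1).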

e. Outlier Detection

Outliers in the data could represent either errors or interesting anomalies that are crucial for understanding disruptions. For instance, a sudden spike in shipping delays might be due to a natural disaster or labor strike. Use methods like Z-scores or IQR (Interquartile Range) to detect and visualize outliers.
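Both methods can be sketched in a few lines of pandas. The delay values are invented for the example, and the thresholds used (an absolute Z-score above 3, and 1.5 times the IQR beyond the quartiles) are common conventions rather than fixed rules.

```python
import pandas as pd

# Hypothetical daily shipping delays (days) with one extreme value.
delays = pd.Series([4, 5, 5, 6, 4, 5, 6, 5, 4, 5,
                    6, 5, 4, 5, 5, 6, 4, 5, 6, 30], dtype=float)

# Z-score method: distance from the mean in units of standard deviation.
z_scores = (delays - delays.mean()) / delays.std(ddof=0)
z_outliers = delays[z_scores.abs() > 3]

# IQR method: flag points beyond 1.5 * IQR outside the quartiles.
q1, q3 = delays.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = delays[(delays < lower) | (delays > upper)]

print(z_outliers)
print(iqr_outliers)
```

The Z-score method is sensitive to the outliers themselves, since extreme values inflate the mean and standard deviation; the IQR method is more robust for skewed or small samples. A flagged point should be investigated before it is removed, because it may be the disruption you are looking for.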

4. Hypothesis Testing and Statistical Analysis

Once you’ve uncovered patterns in the data, you can begin to test hypotheses about supply chain disruptions. For example:

  • Does the disruption in supply chains lead to an increase in lead times?

  • Are certain regions more vulnerable to disruptions?

  • What is the impact of global economic factors (e.g., GDP, trade tariffs) on shipping delays?

You can use statistical tests such as t-tests (to compare two group means), chi-square tests (to check for association between categorical variables), or ANOVA (to compare more than two group means) to evaluate these hypotheses. For time series data, the Augmented Dickey-Fuller (ADF) test can help you check for stationarity, which many forecasting models assume.
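As a hedged sketch of the first two questions, the snippet below runs a Welch t-test and a chi-square test with SciPy. The lead times are synthetic and the contingency counts are hypothetical; the ADF test is not shown here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Synthetic lead times (days) before and after a disruption.
lead_before = rng.normal(10, 2, 60)
lead_after = rng.normal(13, 3, 60)

# Welch's t-test: compares the two means without assuming equal variances.
t_stat, p_value = stats.ttest_ind(lead_after, lead_before, equal_var=False)

# Hypothetical counts: rows are regions A and B, columns are on-time and late.
table = np.array([[80, 20],
                  [55, 45]])

# Chi-square test of independence between region and lateness.
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print(f"chi2 = {chi2:.2f}, p = {chi_p:.4f}, dof = {dof}")
```

A small p-value indicates that the observed difference would be unlikely if there were no real effect; it does not by itself show that the disruption caused the change, since other factors may have shifted over the same period.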

5. Identifying Key Drivers of Disruptions

Using the insights gathered from EDA, you can focus on identifying the primary drivers of disruptions. These could include:

  • Supply-side disruptions: Issues with suppliers, such as factory shutdowns, labor strikes, or natural disasters.

  • Demand-side disruptions: Changes in consumer demand due to economic downturns, pandemics, or shifts in market preferences.

  • Logistical disruptions: Problems in transportation networks, such as port congestion, shipping container shortages, or disruptions in air freight.

  • Regulatory disruptions: Tariffs, trade policies, and international sanctions that impact cross-border trade.

6. Modeling and Forecasting

While EDA gives you a deep understanding of the data, you can move beyond exploration by using predictive models to forecast the impacts of supply chain disruptions. You can use statistical and machine learning techniques such as:

  • Linear regression: To model the relationship between disruptions and key performance metrics.

  • Time series forecasting models (ARIMA, SARIMA): These are useful for forecasting future values of metrics such as shipping delays or lead times based on historical data.

  • Classification models (Decision Trees, Random Forests): These can help identify factors that lead to a high probability of disruption (e.g., delays or failures).
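The first of these, linear regression, can be sketched with scikit-learn as below. The data is synthetic, generated from a known linear relationship so that the fitted coefficients can be compared against it; the feature names and the coefficients used to generate the data are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n = 300

# Synthetic drivers and a lead time that depends on them linearly, plus noise.
congestion = rng.uniform(0, 10, n)
freight_rate = rng.uniform(1, 5, n)
lead_time = 5 + 1.2 * congestion + 0.5 * freight_rate + rng.normal(0, 1, n)

X = np.column_stack([congestion, freight_rate])

# Hold out a quarter of the data to check how the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, lead_time, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # R^2 on the held-out data

print("coefficients:", model.coef_)
print(f"held-out R^2: {r2:.3f}")
```

For data ordered in time, a random split can leak future information into training; splitting chronologically (training on earlier periods and testing on later ones) is usually the safer choice.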

7. Interpret Results and Draw Conclusions

After performing all the necessary analyses, the final step is to interpret the results. Provide actionable insights that can help businesses or governments better manage future disruptions. These could include:

  • Recommendations for diversifying suppliers to mitigate risk.

  • Identifying geographical hotspots of disruption.

  • Suggesting logistical improvements based on the insights gained from transportation delays.

8. Communicate Findings

Once insights are gathered, effectively communicating the findings is key to influencing decision-making. Use clear and concise visualizations, such as dashboards or reports, to showcase your findings in an easily digestible format. Ensure that you tailor your communication to the audience, be it supply chain managers, executives, or policymakers.


By following these steps, you can leverage EDA to study the effects of global supply chain disruptions comprehensively, identify key drivers, and provide data-backed recommendations for mitigating future disruptions.
