Categories We Write About

How to Study the Relationship Between Social Mobility and Income Using EDA

Studying the relationship between social mobility and income using Exploratory Data Analysis (EDA) involves a structured approach: collecting relevant data, cleaning and preprocessing it, and applying statistical and visualization techniques to uncover patterns, trends, and correlations. This process provides insight into how income levels relate to social mobility across different populations and regions, although EDA alone cannot establish the direction of causation.

Understanding the Concepts

Social Mobility refers to the movement of individuals or groups within or between layers or tiers in an open system of social stratification. It is often measured by comparing the socio-economic status of individuals with that of their parents or previous generations.

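Income is typically a central metric in assessing social mobility, especially through the intergenerational income elasticity (IGE). The IGE is the slope from a regression of the logarithm of a child's adult income on the logarithm of parental income: it measures the percentage difference in child income associated with a 1% difference in parental income. A higher IGE means income is more persistent across generations, which implies lower mobility.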
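As a minimal sketch of how an IGE could be estimated, the following fits a log-log regression with NumPy. The data are synthetic, and the true elasticity of 0.4, the income distribution, and the helper name estimate_ige are assumptions made purely for illustration.

```python
import numpy as np

# Synthetic data (assumption): log parental income and a "true" elasticity of 0.4.
rng = np.random.default_rng(0)
n = 5_000
true_ige = 0.4
log_parent = rng.normal(loc=10.8, scale=0.7, size=n)
log_child = 6.5 + true_ige * log_parent + rng.normal(scale=0.6, size=n)


def estimate_ige(log_parent_income, log_child_income):
    """Return (slope, intercept) of an OLS fit of log child income on log parent income.

    The slope is the estimated intergenerational income elasticity.
    """
    slope, intercept = np.polyfit(log_parent_income, log_child_income, deg=1)
    return slope, intercept


ige, intercept = estimate_ige(log_parent, log_child)
print(f"Estimated IGE: {ige:.3f}")
```

With real data, both incomes would typically be averaged over several years before taking logs, since single-year income is a noisy measure of lifetime income.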

Step 1: Identifying and Collecting the Data

Start with datasets that capture both income levels and indicators of social mobility. Some commonly used sources include:

  • The U.S. Census Bureau

  • Opportunity Insights (Raj Chetty’s mobility data)

  • World Bank and OECD social indicators

  • Survey data (e.g., Panel Study of Income Dynamics – PSID)

Key variables to collect include:

  • Parental income

  • Individual income in adulthood

  • Education levels of parents and children

  • Geographic data (region, urban vs. rural)

  • Race, gender, and ethnicity

  • Occupation and employment status

Step 2: Data Cleaning and Preprocessing

Ensure the data is ready for analysis by handling:

  • Missing values: Decide whether to impute, fill, or remove them.

  • Outliers: Identify and decide how to handle unusually high or low income data.

  • Normalization: Adjust income variables for inflation and convert them into consistent units.

  • Categorical Encoding: Convert non-numeric data into numerical values. Use ordinal (label) encoding for variables with a natural order, such as education level, and one-hot encoding for unordered variables, such as occupation or region.
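The four cleaning steps above can be sketched with pandas as follows. The column names, the tiny example table, and the price index values are hypothetical; a real analysis would substitute an official CPI series and its own variable names.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data; column names and values are invented for illustration.
raw = pd.DataFrame({
    "year": [1995, 1995, 2005, 2005, 2015, 2015],
    "income_nominal": [30_000, np.nan, 42_000, 2_500_000, 55_000, 48_000],
    "parent_education": ["high_school", "college", "less_than_hs",
                         "college", "high_school", "college"],
    "region": ["south", "west", "south", "northeast", "midwest", "west"],
})

# Hypothetical price index (2015 = 1.0); replace with a real CPI series.
price_index = {1995: 0.64, 2005: 0.82, 2015: 1.00}

df = raw.copy()

# Normalization: express all incomes in 2015 currency units.
df["income_real"] = df["income_nominal"] / df["year"].map(price_index)

# Missing values: keep a flag, then impute with the median (one option of several).
df["income_missing"] = df["income_real"].isna()
df["income_real"] = df["income_real"].fillna(df["income_real"].median())

# Outliers: cap values at the 1st and 99th percentiles (winsorizing).
lower, upper = df["income_real"].quantile([0.01, 0.99])
df["income_real"] = df["income_real"].clip(lower=lower, upper=upper)

# Ordinal encoding for an ordered variable.
edu_order = ["less_than_hs", "high_school", "college"]
df["parent_education_code"] = pd.Categorical(
    df["parent_education"], categories=edu_order, ordered=True
).codes

# One-hot encoding for an unordered variable.
df = pd.get_dummies(df, columns=["region"], prefix="region")
print(df.head())
```

Keeping the income_missing flag makes it possible to check later whether imputed rows behave differently from observed ones.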

Step 3: Univariate Analysis

Start EDA with univariate analysis to understand the distribution of each variable:

  • Histograms of income levels and educational attainment.

  • Box plots to examine the spread and central tendencies.

  • Frequency tables for categorical data like education or occupation categories.

This gives an idea of how income and mobility indicators behave independently.
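A minimal univariate sketch, assuming a hypothetical DataFrame with an income column and an education column (both generated synthetically here), might look like this:

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): log-normal incomes and three education categories.
rng = np.random.default_rng(1)
n = 2_000
df = pd.DataFrame({
    "income": rng.lognormal(mean=10.8, sigma=0.7, size=n),
    "education": rng.choice(
        ["less_than_hs", "high_school", "college"], size=n, p=[0.15, 0.50, 0.35]
    ),
})

# Numeric summary: the quartiles are the same statistics a box plot displays.
summary = df["income"].describe(percentiles=[0.10, 0.25, 0.50, 0.75, 0.90])

# Income is usually right-skewed; compare skewness before and after a log transform.
skewness = df["income"].skew()
log_skewness = np.log(df["income"]).skew()

# Histogram counts on the log scale (the data behind a histogram plot).
counts, bin_edges = np.histogram(np.log(df["income"]), bins=20)

# Frequency table for a categorical variable.
education_freq = df["education"].value_counts(normalize=True).sort_index()

print(summary)
print(f"skew: {skewness:.2f}, skew of log income: {log_skewness:.2f}")
print(education_freq)
```

If matplotlib is installed, df["income"].plot.hist(bins=50) and df["income"].plot.box() draw the corresponding charts.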

Step 4: Bivariate Analysis

To study the relationship between income and social mobility, perform bivariate analysis:

  • Scatter Plots: Visualize the correlation between parental income and child income.

  • Heatmaps: Use correlation matrices to understand the strength and direction of relationships.

  • Boxplots by Group: Compare income distributions across different social mobility strata (e.g., low vs. high parental education).

  • Line Charts: Show how one variable changes along an ordered axis, such as a mobility index plotted across parental income brackets or across years.

These visual tools help detect linear and non-linear relationships.
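The numbers behind these charts can be computed directly. The sketch below uses synthetic parent and child incomes (an assumption) to produce a log-scale correlation, a rank correlation, and a quintile transition matrix, which is the table a heatmap of mobility would display.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): child log income depends linearly on parent log income.
rng = np.random.default_rng(2)
n = 3_000
log_parent = rng.normal(10.8, 0.7, n)
log_child = 6.5 + 0.4 * log_parent + rng.normal(0, 0.6, n)
df = pd.DataFrame({
    "parent_income": np.exp(log_parent),
    "child_income": np.exp(log_child),
})

# Pearson correlation on the log scale (less sensitive to extreme incomes).
pearson_log = np.log(df["parent_income"]).corr(np.log(df["child_income"]))

# Rank correlation: the Pearson correlation of ranks equals Spearman's rho.
rank_corr = df["parent_income"].rank().corr(df["child_income"].rank())

# Quintile transition matrix: each row shows where children of a given
# parental income quintile end up; rows sum to 1.
labels = [1, 2, 3, 4, 5]
df["parent_q"] = pd.qcut(df["parent_income"], 5, labels=labels)
df["child_q"] = pd.qcut(df["child_income"], 5, labels=labels)
transition = pd.crosstab(df["parent_q"], df["child_q"], normalize="index")

# Group comparison: median child income by parental quintile (box plot by group).
median_by_parent_q = df.groupby("parent_q", observed=True)["child_income"].median()

print(f"log-log correlation: {pearson_log:.2f}, rank correlation: {rank_corr:.2f}")
print(transition.round(2))
```

Under perfect mobility every cell of the transition matrix would be close to 0.20; larger values on the diagonal indicate persistence.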

Step 5: Multivariate Analysis

Explore the interplay of multiple variables simultaneously:

  • Pair Plots: Draw a grid of scatter plots covering every pair of variables, so that relationships among three or more variables can be scanned at once.

  • Multivariate Regression Analysis: Quantify how much variation in individual income can be explained by parental income, education, geography, etc.

  • Principal Component Analysis (PCA): Reduce dimensionality and identify major components driving the income-mobility relationship.

  • Decision Trees or Random Forests: Estimate feature importance, that is, how much each factor (parental income, education, etc.) contributes to predicting future income. Importance scores describe predictive value, not causal effect.
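As a rough sketch of two of the techniques above, the following fits a multivariate regression and a PCA using only NumPy. The data are synthetic and the variable names (parent_college, urban) are hypothetical. The PCA is computed through a singular value decomposition of the standardized predictors rather than through a library class, and tree-based importance is not shown.

```python
import numpy as np

# Synthetic data (assumption): three predictors of log child income.
rng = np.random.default_rng(3)
n = 4_000
log_parent = rng.normal(10.8, 0.7, n)
# Parental college attendance is made more likely at higher parental income.
p_college = 1.0 / (1.0 + np.exp(-(log_parent - 10.8)))
parent_college = (rng.random(n) < p_college).astype(float)
urban = (rng.random(n) < 0.6).astype(float)
log_child = (6.0 + 0.35 * log_parent + 0.25 * parent_college
             + 0.10 * urban + rng.normal(0, 0.55, n))

# Multivariate regression via ordinary least squares.
names = ["intercept", "log_parent_income", "parent_college", "urban"]
X = np.column_stack([np.ones(n), log_parent, parent_college, urban])
coef, *_ = np.linalg.lstsq(X, log_child, rcond=None)

# Share of variance in log child income explained by the predictors.
fitted = X @ coef
ss_res = np.sum((log_child - fitted) ** 2)
ss_tot = np.sum((log_child - log_child.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# PCA on the standardized predictors (intercept column excluded).
Z = X[:, 1:]
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
_, singular_values, components = np.linalg.svd(Z, full_matrices=False)
explained_variance_ratio = singular_values ** 2 / np.sum(singular_values ** 2)

for name, value in zip(names, coef):
    print(f"{name:>18}: {value: .3f}")
print(f"R^2: {r_squared:.3f}")
print("PCA explained variance ratio:", explained_variance_ratio.round(3))
```

Because parental income and parental education are correlated, the coefficient on parental income here is smaller than the one a simple two-variable regression would give.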

Step 6: Geospatial Analysis

If the dataset includes location data:

  • Choropleth Maps: Show differences in social mobility across regions.

  • Bubble Maps: Indicate income or mobility level with bubble size.

  • Cluster Analysis: Identify regions with similar patterns of social mobility and income.

This provides a geographical dimension to the analysis, which is particularly useful for policy implications.
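Before drawing any map, the data need to be aggregated to one row per region. The sketch below does this for synthetic data with hypothetical region names. It uses income ranks scaled to the 0 to 1 range and, in the spirit of measures published by Opportunity Insights, summarizes each region by the expected rank of children whose parents are at the 25th percentile. The class labels and the choice of three classes are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): 12 hypothetical regions with different mobility levels.
rng = np.random.default_rng(4)
regions = [f"region_{i:02d}" for i in range(12)]
region_effect = dict(zip(regions, rng.normal(0, 0.08, len(regions))))

n = 6_000
df = pd.DataFrame({
    "region": rng.choice(regions, n),
    "parent_rank": rng.random(n),  # parental income rank, 0 to 1
})
df["child_rank"] = (
    0.35 + 0.30 * df["parent_rank"]
    + df["region"].map(region_effect)
    + rng.normal(0, 0.2, n)
).clip(0, 1)


def region_summary(group):
    """Fit child rank on parent rank within one region and summarize the fit."""
    slope, intercept = np.polyfit(group["parent_rank"], group["child_rank"], 1)
    return pd.Series({
        "n": len(group),
        "mean_child_rank": group["child_rank"].mean(),
        "rank_rank_slope": slope,
        "expected_rank_at_p25": intercept + slope * 0.25,
    })


by_region = df.groupby("region")[["parent_rank", "child_rank"]].apply(region_summary)

# Bin regions into three classes, as a choropleth legend would.
by_region["mobility_class"] = pd.qcut(
    by_region["expected_rank_at_p25"], 3, labels=["low", "middle", "high"]
)
print(by_region.sort_values("expected_rank_at_p25"))
```

The resulting table can be joined to region boundaries in a mapping library, or passed to a clustering routine to group regions with similar profiles.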

Step 7: Trend Analysis Over Time

Use time-series data to examine how the relationship between income and social mobility changes:

  • Line graphs to show changes in average income by mobility quintiles.

  • Cohort analysis: Compare different generations (e.g., Boomers vs. Millennials) to assess changes in IGE.

  • Rolling averages: Smooth short-term fluctuations to detect long-term trends.
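A cohort comparison with a rolling average could be sketched as follows. The data are synthetic: each birth year gets its own simulated sample, and the assumed upward drift in elasticity from 0.30 to 0.45 is invented to give the example a trend to find.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): the true elasticity drifts from 0.30 to 0.45.
rng = np.random.default_rng(5)
birth_years = np.arange(1950, 1990)
true_ige = np.linspace(0.30, 0.45, len(birth_years))

rows = []
for year, beta in zip(birth_years, true_ige):
    log_parent = rng.normal(10.8, 0.7, 400)
    log_child = 7.0 + beta * log_parent + rng.normal(0, 0.6, 400)
    # Estimated elasticity for this birth cohort (slope of the log-log fit).
    slope = np.polyfit(log_parent, log_child, 1)[0]
    rows.append({"birth_year": int(year), "ige": slope})

ige_by_cohort = pd.DataFrame(rows).set_index("birth_year")

# Rolling average: a centered 5-year window smooths year-to-year noise.
ige_by_cohort["ige_rolling"] = (
    ige_by_cohort["ige"].rolling(window=5, center=True, min_periods=3).mean()
)

# Cohort analysis: average the yearly estimates within each birth decade.
ige_by_cohort["decade"] = (ige_by_cohort.index // 10) * 10
ige_by_decade = ige_by_cohort.groupby("decade")["ige"].mean()

print(ige_by_decade.round(3))
```

The yearly estimates are noisy with only 400 observations each, which is why the decade averages and the rolling series show the trend more clearly than the raw series does.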

Step 8: Identifying Mobility Traps and Opportunities

Use EDA to highlight:

  • Mobility traps: Areas or demographic groups where adult income remains low even among people with higher educational attainment.

  • Opportunity zones: Regions or groups where upward mobility is more likely.

By analyzing cross-tabulations between education, income, location, and race, you can identify combinations that facilitate or hinder mobility.
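One way to sketch such a cross-tabulation is to compute the share of people from low-income families who moved up at least one income quintile, broken down by education and location. All column names, categories, and effect sizes below are synthetic assumptions, as is the minimum cell size of 100.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): a degree and an urban location raise child outcomes.
rng = np.random.default_rng(6)
n = 8_000
df = pd.DataFrame({
    "education": rng.choice(["no_degree", "degree"], n, p=[0.6, 0.4]),
    "area": rng.choice(["rural", "urban"], n, p=[0.3, 0.7]),
    "parent_quintile": rng.integers(1, 6, n),  # values 1 to 5
})
score = (
    0.1 * df["parent_quintile"]
    + 0.5 * (df["education"] == "degree")
    + 0.2 * (df["area"] == "urban")
    + rng.normal(0, 0.5, n)
)
df["child_quintile"] = pd.qcut(score, 5, labels=False) + 1
df["moved_up"] = (df["child_quintile"] > df["parent_quintile"]).astype(float)

# Focus on people who grew up in the bottom two parental income quintiles.
low_origin = df[df["parent_quintile"] <= 2]

# Upward mobility rate for each education-by-area combination.
upward_rate = low_origin.pivot_table(
    index="education", columns="area", values="moved_up", aggfunc="mean"
)

# Cell counts, used to hide rates that rest on too few observations.
cell_counts = pd.crosstab(low_origin["education"], low_origin["area"])
min_cell = 100
reliable_rate = upward_rate.where(cell_counts >= min_cell)

print(reliable_rate.round(2))
```

Cells with low rates point to possible mobility traps and cells with high rates to possible opportunity zones. Checking cell counts first guards against reading too much into small groups.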

Step 9: Hypothesis Testing and Statistical Significance

After identifying patterns:

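  • Use t-tests or ANOVA to test whether differences in income across mobility groups are statistically significant. Because income is right-skewed, consider testing log income or using a nonparametric alternative such as the Mann-Whitney U or Kruskal-Wallis test.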

  • Run chi-square tests for independence between categorical variables like education level and mobility category.

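This strengthens the conclusions drawn from visual analysis by indicating how likely the observed differences would be if they were due to chance alone. A significant result does not by itself show that the difference is large or causal.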
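The tests named above are available in scipy.stats. The sketch below applies them to synthetic data. The three mobility groups, their income shifts, and the has_degree column are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data (assumption): three mobility groups with different mean log income.
rng = np.random.default_rng(7)
n = 1_500
groups = rng.choice(["low", "middle", "high"], n)
shift = pd.Series(groups).map({"low": 0.0, "middle": 0.15, "high": 0.35}).to_numpy()
df = pd.DataFrame({
    "mobility_group": groups,
    "log_income": 10.5 + shift + rng.normal(0, 0.6, n),
})
df["has_degree"] = rng.random(n) < (0.25 + 0.5 * shift)

# Welch's t-test: compares two groups without assuming equal variances.
low_income = df.loc[df["mobility_group"] == "low", "log_income"]
high_income = df.loc[df["mobility_group"] == "high", "log_income"]
t_stat, t_p = stats.ttest_ind(high_income, low_income, equal_var=False)

# One-way ANOVA: compares all three groups at once.
samples = [g["log_income"] for _, g in df.groupby("mobility_group")]
f_stat, f_p = stats.f_oneway(*samples)

# Chi-square test of independence between two categorical variables.
contingency = pd.crosstab(df["mobility_group"], df["has_degree"])
chi2, chi_p, dof, expected = stats.chi2_contingency(contingency)

print(f"Welch t-test: t = {t_stat:.2f}, p = {t_p:.3g}")
print(f"ANOVA:        F = {f_stat:.2f}, p = {f_p:.3g}")
print(f"Chi-square:   chi2 = {chi2:.2f}, dof = {dof}, p = {chi_p:.3g}")
```

When many group comparisons are run on the same dataset, some will appear significant by chance, so a multiple-comparison correction is worth considering.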

Step 10: Insights and Policy Implications

Summarize key insights:

  • How strong is the correlation between parental income and adult outcomes?

  • Which variables most significantly affect mobility?

  • Are there regions with higher/lower mobility independent of income?

These findings can inform policy recommendations such as:

  • Investments in early childhood education

  • Income support for low-income families

  • Educational access and scholarship programs

  • Place-based interventions in low-mobility regions

Tools and Technologies

For EDA, consider using:

  • Python (pandas, matplotlib, seaborn, plotly, scikit-learn)

  • R (ggplot2, dplyr, caret)

  • Tableau or Power BI for interactive visualizations

  • Jupyter Notebooks for documenting analysis steps

Final Thoughts

EDA is a powerful approach to unraveling complex relationships such as the one between social mobility and income. It provides descriptive insights and, combined with the significance tests above, preliminary inferential evidence that can guide more focused econometric modeling, forecasting, and data-driven policy making. Through effective visualization and statistical exploration, one can identify correlations and generate hypotheses about the dynamics that perpetuate or break cycles of income inequality and low mobility, while leaving causal claims to dedicated modeling.
