Omitted Variable Bias; Race and Wages; Econometrics
Alex Tabarrok at Marginal Revolution writes about Heckman et al.’s Journal of Law and Economics paper, “Labor Market Discrimination and Racial Differences in Premarket Factors”. The paper is clearly worth reading, but I got distracted by an econometric idea I had that might lead to a generally useful technique.
Suppose you regress Wage on IQ and Black and find both significant. Could Black be significant only because IQ tests are imperfect, so that Black is accidentally correlated with the part of Ability, the true cause of Wage, that IQ fails to measure, whereas IQ has a true association with Ability? In that case, adding lots more variables that were also accidentally correlated with Ability would make Black's effect get smaller and smaller, but IQ would stay as important as before. It might be like this:
True Equation (1): Ability = IQ + Creativity + Wisdom + Trickiness
True Equation (2): Wage = Ability
Trickiness is inversely correlated with Black, and also correlated with City and Occupation and Jailed. I guess it should be correlated with IQ too, really.
Regression (1): Wage = IQ + Black
Regression (2): Wage = IQ + Black + City + Occupation + Jailed
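If we run (2) instead of (1), the coefficients on both IQ and Black will shrink, since City, Occupation, and Jailed now pick up some of the Trickiness that IQ and Black were standing in for. IQ will maybe get more significant; Black will get less significant.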
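The setup above can be sketched as a small simulation in Python with NumPy. Everything numeric here is an assumption made for illustration, not something estimated from data: the weights 0.3 and -0.5 in Trickiness, the 30% share for Black, the unit-variance noise, and the treatment of City, Occupation, and Jailed as continuous noisy proxies for Trickiness. The helper names `ols` and `simulate` are hypothetical.

```python
import numpy as np

def ols(y, columns):
    """OLS of y on an intercept plus the given columns.

    Returns (coefficients, t-statistics), intercept first.
    """
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])      # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se

def simulate(n=20000, seed=0):
    rng = np.random.default_rng(seed)
    iq = rng.normal(size=n)
    black = rng.binomial(1, 0.3, size=n).astype(float)  # assumed 30% share
    creativity = rng.normal(size=n)
    wisdom = rng.normal(size=n)
    # Assumption: Trickiness is inversely related to Black and also
    # related to IQ, as the text suggests it probably should be.
    trickiness = 0.3 * iq - 0.5 * black + rng.normal(size=n)
    # True Equations (1) and (2): Wage = Ability = sum of the four traits.
    wage = iq + creativity + wisdom + trickiness
    # Assumption: the extra regressors are noisy proxies for Trickiness.
    city = trickiness + rng.normal(size=n)
    occupation = trickiness + rng.normal(size=n)
    jailed = -trickiness + rng.normal(size=n)

    b1, t1 = ols(wage, [iq, black])                            # Regression (1)
    b2, t2 = ols(wage, [iq, black, city, occupation, jailed])  # Regression (2)
    return {
        "reg1": {"iq": (b1[1], t1[1]), "black": (b1[2], t1[2])},
        "reg2": {"iq": (b2[1], t2[1]), "black": (b2[2], t2[2])},
    }

if __name__ == "__main__":
    for name, result in simulate().items():
        for var, (coef, t) in result.items():
            print(f"{name} {var:5s} coef={coef:+.3f} t={t:+.1f}")
```

Under these assumptions, Black's coefficient should move toward zero and its t-statistic should shrink in absolute value when going from (1) to (2), while IQ's coefficient shrinks only modestly. Because the proxies are noisy, Black is not driven all the way to zero in this sketch, which is a caution against reading too much into a variable that merely survives.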
This strategy is interesting. Throw in the kitchen sink, but don’t look at all the new variables for significance. Rather, hope they kill off some of your old variables. Any that survive really are important.