The Wald, LM, LR, and Hausman Tests
I’ve been trying to connect the Hausman Test to the Wald, Likelihood Ratio (LR) and Lagrange Multiplier (LM) statistics. Here are my notes, which will interest only people who know quite a bit of statistics. Don’t trust them too far; these are the possibly confused ideas of a non-expert. But they might be helpful.
Very generally in statistics, we want to test an unrestricted model against a restricted model. The unrestricted model is the one that makes the fewest assumptions, and hence gives the least precise estimates. It usually yields a consistent estimate of the parameters, but not an efficient one if certain restrictions are in fact valid. The restricted model makes more assumptions. It will have lower “likelihood”, since it cannot fit the data better than the unrestricted model, but it is more efficient if the restrictions are valid.
Example 1: I run the unrestricted regression
y = beta*x1 + gamma*x2
and the restricted regression (the restriction being “gamma=0”)
y = beta*x1.
The second regression is more efficient if gamma really does equal zero, but is biased otherwise (at least when x1 and x2 are correlated).
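A small simulation makes the tradeoff in Example 1 concrete. This is only a sketch under assumptions made for illustration: the true beta is 1, x2 is built to be correlated with x1, errors are standard normal, and the function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(gamma, reps=2000, n=100, seed=0):
    """Return (mean, std) of beta-hat for the unrestricted and restricted regressions."""
    rng = np.random.default_rng(seed)
    unrestricted, restricted = [], []
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.7 * x1 + rng.normal(size=n)              # x2 correlated with x1 (assumed)
        y = 1.0 * x1 + gamma * x2 + rng.normal(size=n)  # true beta = 1 (assumed)
        X = np.column_stack([x1, x2])
        # Unrestricted: regress y on x1 and x2, keep the coefficient on x1.
        unrestricted.append(np.linalg.lstsq(X, y, rcond=None)[0][0])
        # Restricted: regress y on x1 alone (gamma forced to 0).
        restricted.append((x1 @ y) / (x1 @ x1))
    return ((np.mean(unrestricted), np.std(unrestricted)),
            (np.mean(restricted), np.std(restricted)))

for gamma in (0.0, 0.5):
    (mean_u, sd_u), (mean_r, sd_r) = simulate(gamma)
    print(f"gamma={gamma}: unrestricted mean={mean_u:.3f} sd={sd_u:.3f}; "
          f"restricted mean={mean_r:.3f} sd={sd_r:.3f}")
```

With gamma = 0 both estimators center on 1 and the restricted one has the smaller spread. With gamma = 0.5 the restricted estimate centers near 1 + 0.5*0.7 = 1.35 instead of 1.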
Example 2: I run the unrestricted regression
y = beta*x1 + gamma*x2 (IV)
using instrumental variables because I think x1 is endogenous, being partly caused by y. IV will be consistent either way, but it is unnecessary if we add the restriction that x1 is exogenous. Then my restricted regression is
y = beta*x1 + gamma*x2 (OLS)
The second regression is more efficient if x1 is not endogenous, but biased otherwise.
The Wald, LR, and LM tests are all ways to test the restrictions. All of them use the idea that if the restriction is valid, then in large samples the restricted and unrestricted estimates will be about the same.
How do you test that? The idea of the Wald statistic is that you test the hypothesis that the unrestricted parameter estimates equal particular values, namely the values you get from the restricted estimate. To test whether several parameter estimates equal particular values we use a Chi-Squared test. We need standard errors of the estimates for this, and we get them from our unrestricted estimate.
In Example 1, you would run the unrestricted regression, including x2, and use the standard error from the unrestricted regression to see if the parameter on x2 equals 0. That amounts to a z-test when we’re testing just one parameter, but we could use a chi-squared test to get the same answer.
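Here is a minimal sketch of the Example 1 Wald test in Python with NumPy, on simulated data. The seed, sample size, coefficient values, and the function name `wald_gamma_zero` are all assumptions for illustration. It shows that the chi-squared statistic with one restriction is just the square of the z-statistic.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, purely illustrative
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Simulated data in which the restriction gamma = 0 is true (assumed).
y = 1.0 * x1 + 0.0 * x2 + rng.normal(size=n)

def wald_gamma_zero(y, x1, x2):
    """Wald test of gamma = 0 using only the unrestricted regression."""
    X = np.column_stack([x1, x2])
    coef = np.linalg.solve(X.T @ X, X.T @ y)        # OLS estimates (beta, gamma)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # unrestricted error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # covariance of the estimates
    z = coef[1] / np.sqrt(cov[1, 1])                # z-statistic for gamma
    return z, z ** 2                                # chi-squared(1) = z squared

z_stat, wald = wald_gamma_zero(y, x1, x2)
print(f"z = {z_stat:.3f}, Wald chi-squared(1) = {wald:.3f}")
```

Compare the chi-squared value with 3.84, the 5% critical value for one degree of freedom.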
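The Hausman Test uses the Wald approach in Example 2. You do the IV estimate, which is consistent, and see if the parameter estimates are close to the ones you get from the OLS estimate. The yardstick for “close” is the variance of the gap between the two estimates, which, because OLS is efficient if the restriction is valid, equals the IV variance minus the OLS variance.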
Note that our null hypothesis is that the restriction is valid, and we are seeing if the difference in estimated parameters is too large to make that probable. If it is, we reject the restriction, so we have to include x2 in the regression, or use IV.
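A rough sketch of a Hausman-type comparison for Example 2 follows. Several things here are assumptions, not part of the notes above: the instrument `z`, the simulated data, the function name `hausman_beta`, and the choice to use the OLS residual variance for both covariance matrices. The sketch scales the IV-minus-OLS gap in beta by the difference between the IV and OLS variance estimates, which is the usual textbook form of the statistic, and looks only at the coefficient on x1.

```python
import numpy as np

def hausman_beta(y, x1, x2, z):
    """Hausman-type statistic for the coefficient on x1 (1 degree of freedom).

    z is a hypothetical instrument for x1; x2 is treated as exogenous.
    """
    X = np.column_stack([x1, x2])   # regressors
    Z = np.column_stack([z, x2])    # instruments (x2 instruments itself)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    # Two-stage least squares: project X on Z, then regress y on the fitted values.
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    b_iv = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    # Error variance from the OLS residuals, valid under the null (assumed choice).
    resid = y - X @ b_ols
    s2 = resid @ resid / (len(y) - X.shape[1])
    v_ols = s2 * np.linalg.inv(X.T @ X)
    v_iv = s2 * np.linalg.inv(X_hat.T @ X_hat)
    gap = b_iv[0] - b_ols[0]
    return gap ** 2 / (v_iv[0, 0] - v_ols[0, 0])

rng = np.random.default_rng(1)
n = 2000
z = rng.normal(size=n)
x2 = rng.normal(size=n)
e = rng.normal(size=n)                        # regression error
x1_exog = z + rng.normal(size=n)              # x1 unrelated to the error
x1_endog = z + 0.8 * e + rng.normal(size=n)   # x1 correlated with the error

h_exog = hausman_beta(1.0 * x1_exog + 0.5 * x2 + e, x1_exog, x2, z)
h_endog = hausman_beta(1.0 * x1_endog + 0.5 * x2 + e, x1_endog, x2, z)
print(f"x1 exogenous:  H = {h_exog:.2f}")
print(f"x1 endogenous: H = {h_endog:.2f}  (5% critical value for chi-squared(1): 3.84)")
```

Using one error-variance estimate for both covariance matrices keeps the denominator positive. With separately estimated variances the difference can come out negative in a given sample.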
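The LM test does the same thing, but using the estimated variances from the restricted estimate: the OLS one in Example 2, or the regression without x2 in Example 1. In linear regressions the LM chi-squared statistic comes out no larger than the Wald statistic in any given sample, because the restricted residual variance it divides by is at least as large as the unrestricted one. Asymptotically, if the restriction is valid, the two are equivalent.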
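For Example 1 the Wald and LM statistics can be written with the same numerator, the drop in the residual sum of squares, divided by the unrestricted or the restricted residual variance respectively. The sketch below uses that form. It assumes simulated data and hypothetical helper names (`rss`, `wald_and_lm`), and it divides by n rather than applying a degrees-of-freedom correction.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from regressing y on the columns of X."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    return resid @ resid

def wald_and_lm(y, x1, x2):
    """Wald and LM statistics for the restriction gamma = 0 in Example 1."""
    n = len(y)
    rss_u = rss(np.column_stack([x1, x2]), y)   # unrestricted: x1 and x2
    rss_r = rss(x1[:, None], y)                 # restricted: x1 only
    wald = n * (rss_r - rss_u) / rss_u          # variance from unrestricted fit
    lm = n * (rss_r - rss_u) / rss_r            # variance from restricted fit
    return wald, lm

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 * x1 + 0.2 * x2 + rng.normal(size=n)    # gamma = 0.2 (assumed)

wald, lm = wald_and_lm(y, x1, x2)
print(f"Wald = {wald:.3f}, LM = {lm:.3f}")
```

Since the restricted residual sum of squares can never be smaller than the unrestricted one, LM is never larger than Wald in this setup.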
The LR test takes a different approach entirely. It compares the Likelihood of seeing this data given the estimated parameters when the estimates are chosen to maximize likelihood first under no constraints and second under the restrictions. If the restrictions are valid, the likelihoods are about the same. The test statistic is twice the difference in the log-likelihoods, which is asymptotically chi-squared if the restrictions are valid.
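As a sketch of the LR test for Example 1, assume normally distributed errors, so that the maximized log-likelihood of a regression depends only on its residual sum of squares. The simulated data and the function names `gaussian_loglik` and `lr_gamma_zero` are assumptions for illustration.

```python
import numpy as np

def gaussian_loglik(X, y):
    """Maximized Gaussian log-likelihood of the regression of y on X."""
    n = len(y)
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = resid @ resid / n   # maximum-likelihood estimate of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def lr_gamma_zero(y, x1, x2):
    """LR statistic for gamma = 0: twice the gap in maximized log-likelihoods."""
    ll_u = gaussian_loglik(np.column_stack([x1, x2]), y)   # unrestricted
    ll_r = gaussian_loglik(x1[:, None], y)                 # restricted
    return 2.0 * (ll_u - ll_r)

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 * x1 + 0.2 * x2 + rng.normal(size=n)   # gamma = 0.2 (assumed)

lr = lr_gamma_zero(y, x1, x2)
print(f"LR chi-squared(1) = {lr:.3f}")
```

The statistic is compared with the same chi-squared(1) critical value as the Wald and LM statistics.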