Productivity Enhancing Job Training: Testing for Serial Correlation After IV Regression with Panel Data

Corporations and governments invest in training for their employees and citizens to improve their skills and employment opportunities. State and federal authorities spend large sums of taxpayer money on training and retraining unemployed workers. Private companies, implicitly or explicitly, spend money, time, and energy on training their employees to be more productive in the workplace. Past research has shown that higher productivity also raises worker wages, which is one reason many employees gravitate toward companies with a reputation for excellent on-the-job training.

Correctly identifying which programs work and which don’t can greatly reduce costs while still funneling resources toward improving worker productivity. In this post, I estimate the effectiveness of computer training in reducing the scrap rate at manufacturing companies. Using panel data from Introductory Econometrics: A Modern Approach, I estimate the relationship between the hours of training employees receive and the subsequent scrap rate. One problem is that companies self-select into this training, and the companies most likely to enroll their employees might also be more likely to adopt other policies that reduce scrap. To correct for this problem, I use whether a firm received a job training grant as an instrumental variable. Exploiting the panel structure removes time-invariant differences between companies, but I also test for serial correlation in the error terms to make sure that using panel data does not introduce bias into the estimates.

First, we need to tell Stata that the data set is a panel by specifying the time variable and the variable that identifies each individual unit within the panel…
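For readers without Stata, the idea behind declaring a panel can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows, the field layout, and the `declare_panel` helper are hypothetical and are not the textbook data set or the commands used in this post.

```python
# Hypothetical toy rows: (firm_id, year, hours_of_training, scrap_rate)
rows = [
    ("B", 1988, 5.0, 6.5),
    ("A", 1987, 0.0, 10.0),
    ("A", 1988, 10.0, 8.0),
    ("B", 1987, 5.0, 6.0),
]

def declare_panel(rows):
    """Index observations by (firm, year) and reject repeated firm-year pairs."""
    panel = {}
    for firm, year, hours, scrap in rows:
        key = (firm, year)
        if key in panel:
            raise ValueError(f"repeated time value within firm: {key}")
        panel[key] = {"hours": hours, "scrap": scrap}
    # Sort by firm, then year, so each firm's periods sit next to each other
    return dict(sorted(panel.items()))

panel = declare_panel(rows)
firms = sorted({firm for firm, _ in panel})
years = sorted({year for _, year in panel})
# A panel is balanced when every firm is observed in every year
balanced = len(panel) == len(firms) * len(years)
print(list(panel), "balanced:", balanced)
```

The point of the step is simply that every observation is uniquely identified by a unit and a time period, which is what makes differencing and lagging well defined later on.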

First Difference Regression

Second, we use the difference operator to regress the first difference of the scrap rate on the first difference of the hours of training received…

The regression above shows that the estimated reduction in scrap from training is statistically insignificant. If our analysis stopped here, we would conclude that the job training program has virtually no effect on scrap for the manufacturing companies. However, the hours of training that employees receive plausibly depend on whether the firm received grant money for the training, while grant money is plausibly uncorrelated with the error term in the scrap rate equation. That makes the grant a good candidate for an instrumental variable.
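To make the first-difference regression concrete, here is a minimal Python sketch. The toy numbers and the helper names `first_differences` and `simple_ols` are assumptions made for illustration, so the estimates it prints say nothing about the actual data used in this post.

```python
from collections import defaultdict

# Hypothetical toy panel: (firm_id, year, hours_of_training, scrap_rate)
rows = [
    ("A", 1987, 0.0, 10.0), ("A", 1988, 10.0, 8.0),
    ("B", 1987, 5.0, 6.0),  ("B", 1988, 5.0, 6.5),
    ("C", 1987, 2.0, 12.0), ("C", 1988, 20.0, 9.0),
    ("D", 1987, 0.0, 4.0),  ("D", 1988, 4.0, 3.5),
]

def first_differences(rows):
    """Year-over-year changes in hours and scrap within each firm."""
    by_firm = defaultdict(list)
    for firm, year, hours, scrap in rows:
        by_firm[firm].append((year, hours, scrap))
    d_hours, d_scrap = [], []
    for obs in by_firm.values():
        obs.sort()
        for (y0, h0, s0), (y1, h1, s1) in zip(obs, obs[1:]):
            if y1 - y0 == 1:  # only difference consecutive years
                d_hours.append(h1 - h0)
                d_scrap.append(s1 - s0)
    return d_hours, d_scrap

def simple_ols(x, y):
    """OLS of y on x with an intercept: slope, its standard error, residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return slope, (s2 / sxx) ** 0.5, resid

d_hours, d_scrap = first_differences(rows)
slope, se, _ = simple_ols(d_hours, d_scrap)
print("slope:", slope, "t-statistic:", slope / se)
```

Differencing within each firm is what removes any firm characteristic that stays constant over time, which is why the regression is run on changes rather than levels.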

Test for Instrumental Variable Relevance

The regression above shows that receiving a grant is a strong instrument for hours of training.
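The relevance check is just a first-stage regression of the change in training hours on the change in grant status, followed by a look at the F statistic on the instrument. The sketch below uses made-up differenced data and a hypothetical `simple_ols` helper. The threshold of 10 is the common rule of thumb for a single instrument, not a result from this post.

```python
# Hypothetical differenced data: change in grant status and in training hours
d_grant = [1, 1, 1, 0, 0, 0, 1, 0]
d_hours = [20.0, 25.0, 15.0, 2.0, -1.0, 3.0, 22.0, 0.0]

def simple_ols(x, y):
    """OLS of y on x with an intercept: slope, its standard error, residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return slope, (s2 / sxx) ** 0.5, resid

# First stage: does the instrument move the endogenous regressor?
pi_hat, se_pi, _ = simple_ols(d_grant, d_hours)
t_stat = pi_hat / se_pi
f_stat = t_stat ** 2  # with one instrument, F equals the squared t statistic
print("first-stage slope:", pi_hat, "F:", f_stat, "passes rule of thumb:", f_stat > 10)
```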

Two-Stage Least Squares Regression (IV Regression)

The IV regression above shows a positive coefficient on training: more training is associated with higher scrap rates. This result is not statistically significant, but the question remains: did training increase the scrap rate in the participating companies? It is more likely that companies with higher scrap rates thought they would benefit from enrolling in the computer training program than that the job training made their employees worse. We will return to this problem shortly, but first a Hausman test for endogeneity needs to be run to justify the use of instrumental variables.
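With one endogenous regressor and one instrument, the 2SLS estimate reduces to a ratio of covariances, and running the two stages by hand gives the same number. The Python sketch below shows both routes on made-up differenced data. All values and helper names are hypothetical, and the sign of the toy estimate is not meant to match the result reported above.

```python
# Hypothetical differenced data (not the textbook data set)
d_grant = [1, 1, 1, 0, 0, 0, 1, 0]
d_hours = [20.0, 25.0, 15.0, 2.0, -1.0, 3.0, 22.0, 0.0]
d_scrap = [-3.0, -4.0, -1.0, 0.5, 0.0, -0.5, -2.0, 1.0]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sum of cross-products of deviations (the 1/n factors cancel in ratios)."""
    ma, mb = mean(a), mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b))

# Route 1: IV estimator as a covariance ratio
beta_iv = cov(d_grant, d_scrap) / cov(d_grant, d_hours)

# Route 2: two explicit stages
pi1 = cov(d_grant, d_hours) / cov(d_grant, d_grant)    # first-stage slope
pi0 = mean(d_hours) - pi1 * mean(d_grant)              # first-stage intercept
hours_hat = [pi0 + pi1 * z for z in d_grant]           # fitted training hours
beta_2sls = cov(hours_hat, d_scrap) / cov(hours_hat, hours_hat)

print("IV:", beta_iv, "2SLS:", beta_2sls)
```

Note that the second-stage standard errors from a by-hand OLS are not valid for inference, which is one reason to rely on a dedicated IV routine in practice.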

Hausman Test for Endogeneity in the Structural Equation

Next, take the residuals from the first-stage regression of training hours on the grant instrument…

Next, insert those first-stage residuals into the original structural equation, estimate it by OLS, and check the statistical significance of the coefficient on the residuals…

The test turns on the coefficient on the residuals: a coefficient that is statistically different from zero indicates that hours of training is endogenous. That is the reading taken here, so 2SLS is the appropriate regression model. The next issue is to test for serial correlation in the error term; in other words, are the errors in one period correlated with those in the previous period in a way that might be distorting the results?
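Before moving on, the regression-based endogeneity test can be sketched in plain Python. This assumes the standard version of the test, in which the first-stage residuals are added to the structural equation and their t statistic is examined. The data and the helpers `solve` and `ols` are hypothetical.

```python
# Hypothetical differenced data (not the textbook data set)
d_grant = [1, 1, 1, 0, 0, 0, 1, 0]
d_hours = [20.0, 25.0, 15.0, 2.0, -1.0, 3.0, 22.0, 0.0]
d_scrap = [-3.0, -4.0, -1.0, 0.5, 0.0, -0.5, -2.0, 1.0]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, *regressors):
    """OLS with an intercept: coefficients, standard errors, residuals."""
    n = len(y)
    X = [[1.0] + [r[i] for r in regressors] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    resid = [y[i] - sum(c * v for c, v in zip(beta, X[i])) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    # j-th diagonal entry of (X'X)^-1, obtained by solving against a unit vector
    diag = [solve(xtx, [1.0 if a == j else 0.0 for a in range(k)])[j] for j in range(k)]
    return beta, [(s2 * d) ** 0.5 for d in diag], resid

# Step 1: first-stage residuals of training hours on the instrument
_, _, v_hat = ols(d_hours, d_grant)
# Step 2: add them to the structural equation and estimate by OLS
beta, se, _ = ols(d_scrap, d_hours, v_hat)
t_vhat = beta[2] / se[2]
print("coefficient on hours:", beta[1], "t on first-stage residual:", t_vhat)
```

A useful sanity check is that the coefficient on hours in this augmented regression equals the 2SLS estimate. A large absolute t statistic on the residual term is evidence of endogeneity, while a small one means exogeneity cannot be rejected.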

Finally, re-run the 2SLS model with the lagged residuals included as an additional regressor and check the statistical significance of the lagged residual…

The lagged residual is not statistically significant, so we find no evidence of serial correlation in the errors.
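The serial correlation check can be sketched in plain Python as well: estimate the model by 2SLS, lag the residuals within each firm, and re-estimate with the lagged residual as an extra exogenous regressor. Everything here is an assumption for illustration, including the toy differenced data for eight firms and the helpers `solve`, `ols`, and `tsls`.

```python
# Hypothetical rows: (firm, year, d_grant, d_hours, d_scrap)
data = [
    (1, 1988, 1, 20.0, -3.0), (1, 1989, -1, -12.0, 1.0),
    (2, 1988, 1, 25.0, -4.0), (2, 1989, -1, -20.0, 2.5),
    (3, 1988, 0, 2.0, 0.5),   (3, 1989, 1, 18.0, -2.0),
    (4, 1988, 0, -1.0, 0.0),  (4, 1989, 1, 21.0, -1.5),
    (5, 1988, 0, 3.0, -0.5),  (5, 1989, 0, 1.0, 0.4),
    (6, 1988, 0, 0.0, 1.0),   (6, 1989, 0, -2.0, -0.6),
    (7, 1988, 1, 22.0, -2.0), (7, 1989, -1, -15.0, 0.2),
    (8, 1988, 0, 1.0, -1.0),  (8, 1989, 1, 16.0, -3.1),
]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, *regressors):
    """OLS with an intercept: coefficients, standard errors, residuals."""
    n = len(y)
    X = [[1.0] + [r[i] for r in regressors] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    resid = [y[i] - sum(c * v for c, v in zip(beta, X[i])) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    diag = [solve(xtx, [1.0 if a == j else 0.0 for a in range(k)])[j] for j in range(k)]
    return beta, [(s2 * d) ** 0.5 for d in diag], resid

def tsls(y, x, z, exog=()):
    """2SLS of y on endogenous x (instrument z) plus exogenous controls."""
    n = len(y)
    b1, _, _ = ols(x, z, *exog)                      # first stage
    x_hat = [b1[0] + b1[1] * z[i] + sum(c * e[i] for c, e in zip(b1[2:], exog))
             for i in range(n)]
    beta, se2, r2 = ols(y, x_hat, *exog)             # second stage
    # Residuals must use the actual x, not the fitted values
    u = [y[i] - beta[0] - beta[1] * x[i] - sum(c * e[i] for c, e in zip(beta[2:], exog))
         for i in range(n)]
    scale = (sum(e * e for e in u) / sum(e * e for e in r2)) ** 0.5
    return beta, [s * scale for s in se2], u

z = [r[2] for r in data]
x = [r[3] for r in data]
y = [r[4] for r in data]
_, _, u = tsls(y, x, z)

# Lag the residuals within each firm, then keep rows that have a lag
resid_by_key = {(r[0], r[1]): e for r, e in zip(data, u)}
later = [r for r in data if (r[0], r[1] - 1) in resid_by_key]
lag_u = [resid_by_key[(r[0], r[1] - 1)] for r in later]

beta_l, se_l, _ = tsls([r[4] for r in later], [r[3] for r in later],
                       [r[2] for r in later], exog=[lag_u])
t_lag = beta_l[2] / se_l[2]
print("t statistic on lagged residual:", t_lag)
```

A t statistic on the lagged residual that is small in absolute value gives no evidence of serial correlation. With only a handful of toy observations the number printed here is not meaningful in itself.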


According to the Hausman test, 2SLS (IV regression) was the correct model to use for this data. There was no evidence of serial correlation that could have been biasing our estimates of the impact of job training on scrap rates for the companies that participated in the program. The results suggest that participating in the job training program did not have a statistically significant impact on scrap rates for the companies that enrolled their employees in this extra instruction. Valuable time and energy were spent on this program, and the results were not impressive. Diverting resources away from these kinds of unproductive programs would benefit society through a more efficient allocation of resources. This analysis in no way implies that government job training programs don’t work, only that in this particular case the outcomes fell short of what was expected. Restructuring or redesigning the training to address the specific issues not covered in the current curriculum might produce better results.