Previous research has found a wage premium associated with marriage and with belonging to a union. Marriage might make a man take work more seriously, increasing his desire for higher productivity and wages. Children might come into the picture and push a man to work harder for promotions and titles so that he can provide for them. Both of these arguments imply that marriage might cause a man to earn higher wages, but what if it was higher wages that made a man more likely to get married in the first place? This reverse causality problem arises when determining whether marriage raises wages, because financially established men are very likely more attractive husbands. The correlation between marriage and wages could therefore arise because successful men are more likely to get married than those who are struggling with their finances. Another thing to consider is unobserved or unmeasurable characteristics that can lead to both a higher chance of being married and higher wages. For example, previous research indicates that people rated as above-average looking tend to make more money, and they may also be more likely to marry. The methodology used in this post addresses these econometric and statistical issues and aims to provide unbiased estimates of the marriage premium despite the concerns expressed above. Dynamic panel regression methods were created especially to deal with these kinds of problems.

**According to my regression estimates, being married increases a man’s wages by about 16.4%**. The methodology used to derive this estimate is free from the reverse causality discussed earlier, from serial correlation in the wage equation, which might bias our estimates of the marriage premium, and from time-invariant differences between men that might cause both higher wages and a higher probability of marriage.

Using data on 651 individuals over the course of 7 years, found in *Introductory Econometrics: A Modern Approach*, this post uses dynamic panel methods to analyze the impact that marriage has on wages. The model controls for some of the problems expressed above by using lags as instruments in the well-known Arellano-Bond estimator.

Even though we have panel data, we can’t simply use fixed or random effects because wages are highly autocorrelated: their past values are strongly correlated with their future values. Since wages are our dependent variable, we can use their lag to correct for this problem. Technically, the model regresses the first difference of the dependent variable on the first difference of the explanatory variables AND the first difference of the lagged dependent variable. Writing $w_{it}$ for the wage variable of man $i$ in year $t$, $x_{it}$ for the explanatory variables, and $\varepsilon_{it}$ for the error term (notation assumed here), the equation is:

$$\Delta w_{it} = \rho\,\Delta w_{i,t-1} + \Delta x_{it}'\beta + \Delta\varepsilon_{it}$$
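To see why the lagged dependent variable needs an instrument after differencing, here is a minimal simulation sketch. It uses made-up data rather than the wage panel, and it applies the simpler Anderson-Hsiao instrumental-variable estimator (a single lagged level as the instrument) instead of the full Arellano-Bond estimator. The function names, sample sizes, and the true coefficient of 0.5 are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_panel(n_units=5000, n_periods=8, rho=0.5, burn_in=50, seed=0):
    """Simulate y_it = alpha_i + rho * y_i,t-1 + e_it (illustrative data only)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)      # time-invariant individual effect
    y = alpha / (1 - rho)                 # start each unit at its long-run mean
    columns = []
    for _ in range(burn_in + n_periods):
        y = alpha + rho * y + rng.normal(size=n_units)
        columns.append(y)
    # Drop the burn-in periods; result has shape (n_units, n_periods)
    return np.column_stack(columns[burn_in:])

def difference_estimates(y):
    """Return (OLS, IV) estimates of rho from the first-differenced equation."""
    d = np.diff(y, axis=1)                # differencing removes alpha_i
    dep = d[:, 1:].ravel()                # change in y at period t
    lag = d[:, :-1].ravel()               # change in y at period t-1
    inst = y[:, :-2].ravel()              # level of y at period t-2 (instrument)
    ols = (lag @ dep) / (lag @ lag)       # ignores correlation with the error
    iv = (inst @ dep) / (inst @ lag)      # Anderson-Hsiao style IV estimate
    return ols, iv

if __name__ == "__main__":
    ols, iv = difference_estimates(simulate_panel())
    print(f"OLS on differences: {ols:.3f}")
    print(f"IV with lagged level: {iv:.3f}")
```

In this setup, OLS on the differenced data should be pulled well below the true value, because the differenced lag is correlated with the differenced error. The instrumented estimate should land near 0.5.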

The idea is that we can keep adding lags to eliminate heterogeneity, but we need a mechanism that limits the number of lags used as instrumental variables. A two-step Generalized Method of Moments (GMM) procedure is used to get the correct number of lags. Here are the steps:

- Model the variance-covariance matrix under homoskedasticity, while also incorporating the serial correlation that first differencing introduces into the error term.
- Use the residuals from the first step to build the optimal weighting matrix, which is then used to weight the second stage of the regression.
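The two steps above can be sketched as follows. This is a bare-bones illustration on simulated data with a single regressor (the lagged dependent variable), not the estimation run in this post. All function names are hypothetical, and the first-step matrix `H` is the usual choice for differenced errors (2 on the diagonal, -1 next to it).

```python
import numpy as np

def simulate_panel(n_units=2000, n_periods=8, rho=0.5, burn_in=50, seed=1):
    """Simulated AR(1) panel with individual effects (illustrative data only)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)
    y = alpha / (1 - rho)
    columns = []
    for _ in range(burn_in + n_periods):
        y = alpha + rho * y + rng.normal(size=n_units)
        columns.append(y)
    return np.column_stack(columns[burn_in:])

def instrument_matrix(y_i):
    """Row r is the differenced equation for period r + 3.
    It uses the levels y_1 .. y_(r+1) as instruments, each in its own column."""
    n_rows = y_i.shape[0] - 2
    Z = np.zeros((n_rows, n_rows * (n_rows + 1) // 2))
    col = 0
    for r in range(n_rows):
        Z[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return Z

def gmm(dx, dy, Z, W):
    """GMM estimate of the single coefficient for a given weighting matrix W."""
    zx = np.einsum("itq,it->q", Z, dx)    # sum over units of Z_i' dx_i
    zy = np.einsum("itq,it->q", Z, dy)    # sum over units of Z_i' dy_i
    return (zx @ W @ zy) / (zx @ W @ zx)

def two_step_difference_gmm(y):
    d = np.diff(y, axis=1)
    dy, dx = d[:, 1:], d[:, :-1]          # differenced outcome and its lag
    Z = np.stack([instrument_matrix(row) for row in y])
    n_eq = dy.shape[1]

    # Step 1: weighting matrix under homoskedasticity. H captures the
    # serial correlation that differencing induces in the errors.
    H = 2 * np.eye(n_eq) - np.eye(n_eq, k=1) - np.eye(n_eq, k=-1)
    HZ = np.einsum("ts,isq->itq", H, Z)
    W1 = np.linalg.pinv(np.einsum("itq,itr->qr", Z, HZ))
    rho1 = gmm(dx, dy, Z, W1)

    # Step 2: rebuild the weighting matrix from the step-1 residuals.
    resid = dy - rho1 * dx
    g = np.einsum("itq,it->iq", Z, resid) # per-unit moment contributions
    W2 = np.linalg.pinv(g.T @ g)
    rho2 = gmm(dx, dy, Z, W2)
    return rho1, rho2

if __name__ == "__main__":
    rho1, rho2 = two_step_difference_gmm(simulate_panel())
    print(f"one-step estimate: {rho1:.3f}")
    print(f"two-step estimate: {rho2:.3f}")
```

Both estimates should sit close to the simulated coefficient of 0.5. A production estimator would also handle extra regressors, unbalanced panels, and corrected standard errors, none of which are covered here.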

To begin, let’s estimate a pure time series model:

**Pure Time Series Model**

Notice that the output lists 22 instruments. Of these, 21 are lagged values of the dependent variable across 6 years; the remaining one is most likely the constant term. Only 6 years of data are used because first differencing eliminates one of the observations. At time 7 we can use all 5 previous lags as instruments for the first-differenced equation. Similarly, at time 6 we can use 4 lagged values as instruments. The number 21 comes from adding up the lagged terms used as instruments in each time period: 6+5+4+3+2+1=21. Also notice that the model uses robust standard errors to account for serial correlation in the estimates.
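The instrument arithmetic can be checked with a small helper. This is a sketch under an assumed convention: periods are numbered from 1, and the differenced equation at period t can use levels dated t-2 and earlier. Under that convention the sum 6+5+4+3+2+1 corresponds to `n_periods=8`, which is inferred from the arithmetic rather than stated in the post. The function name and the optional `max_lags` cap are hypothetical.

```python
def count_lag_instruments(n_periods, max_lags=None):
    """Count lagged-level instruments across all usable differenced equations."""
    total = 0
    # The first usable differenced equation is at period 3 (needs y_1 as instrument).
    for t in range(3, n_periods + 1):
        available = t - 2                 # levels y_1 .. y_(t-2)
        if max_lags is not None:
            available = min(available, max_lags)  # cap the lag depth
        total += available
    return total

if __name__ == "__main__":
    print(count_lag_instruments(8))               # 1+2+3+4+5+6
    print(count_lag_instruments(8, max_lags=2))   # 1+2+2+2+2+2
```

Capping the lag depth keeps the instrument count from growing quadratically with the number of periods, which is the motivation for restricting lags in larger models.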

**Two-Step Generalized Method of Moments Estimator**

Notice that with the two-step estimation the coefficient on lagged wages became smaller, and its statistical significance changed. Now for the interesting part: we repeat the regression above but add binary variables for union membership and marriage as strictly exogenous variables. This gives us a better estimate of how marriage affects wages after controlling for the premium earned by union members. This methodology also accounts for serial correlation in the error term and for unobserved heterogeneity.

The regression above uses the two-step GMM method known as Arellano-Bond dynamic panel-data estimation. The maximum number of lags of the dependent variable was restricted to 2, so with 7 years of data the regression has 14 instruments in the form of lagged dependent variables. The standard errors are clustered by individual to provide a further buffer against heterogeneity and any remaining serial correlation.

**According to my regression estimates, being married increases a man’s wages by about 16.4%**. The methodology used to derive this estimate is free from the reverse causality that arises because higher-earning men are also more likely to be married, from serial correlation in the wage equation, which might bias our estimates of the marriage premium, and from time-invariant differences between men that might cause both higher wages and a higher probability of marriage.