SAS Programming: Stacking Data with the Set Command

SAS has a command called “set” that can stack data sets on top of one another. This command may be used when there are two sets of data from different time frames or divisions of a business that have the same variables but different observations. If a variable is present in one of the data sets being stacked but not in another, the observations from the data set that lacks it receive a missing value for that variable.
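The missing-value behavior can be seen with a small sketch. The data sets “east” and “west” and their variables are hypothetical and are not part of the income data used in the rest of this post:

```sas
/* Hypothetical toy data: "east" has a region variable, "west" does not */
data east;
    input id sales region $;
    datalines;
1 100 East
2 150 East
;
run;

data west;
    input id sales;
    datalines;
3 200
4 250
;
run;

/* Stack the two data sets: observations from "east" come first, then "west" */
data stacked;
    set east west;
run;

/* The two observations from "west" show a blank (missing) region */
proc print data=stacked;
run;
```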

The objective of this post is to stack two data sets that contain monthly observations for U.S. personal disposable income. This data comes from the St. Louis Federal Reserve Bank’s FRED database and was partitioned into two for the purposes of this exercise. One data set consists of all observations before January 1980 and the other consists of all observations after that date. The two are stacked on top of each other so that they become one data set.


This first set of code merely brings in the two data sets and stores them in the temporary work library.
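The original import code is not reproduced here, so the following is only a sketch of what such a step might look like. It assumes the two partitions were saved as CSV files with a header row; the file paths are hypothetical placeholders, while the data set names “income” and “income2” are the ones used in this post:

```sas
/* Assumption: each partition is a CSV file with a header row. */
/* The file paths below are hypothetical placeholders.         */
proc import datafile="C:\data\income_before_1980.csv"
    out=work.income
    dbms=csv
    replace;
    getnames=yes;
run;

proc import datafile="C:\data\income_after_1980.csv"
    out=work.income2
    dbms=csv
    replace;
    getnames=yes;
run;
```

Because the output data sets are written to the work library, they exist only for the current SAS session.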

The SAS log shows that both data sets were imported correctly and tallies the variables and observations. The first file, labeled “income”, contains 632 records, while the second file, “income2”, contains 380 records.

This next set of code contains the actual stacking of the data. This code creates a temporary work file called “both”. Next, the “set” command is used to stack the files “income” and “income2”, and the “by” command combines their observations in date order. Note that “by” does not sort anything itself: each input data set must already be sorted by date, otherwise SAS reports an error. Finally, the “proc print” procedure is used to output the data with the correct date format. Using the “set” and “by” combination is called interleaving. Interleaving is especially useful when each individual data set is already sorted and plain stacking would undo the overall ordering.
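A minimal sketch of the interleaving step described above follows. It assumes the date variable is named “date” and is stored as a SAS date value; the mmddyy10. format is one possible choice and may differ from the format used in the original code:

```sas
/* Sorting is needed only if the inputs are not already in date order */
proc sort data=income;
    by date;
run;

proc sort data=income2;
    by date;
run;

/* Interleave: stack both data sets while keeping date order */
data both;
    set income income2;
    by date;
run;

/* Print the combined data with a readable date format */
proc print data=both;
    format date mmddyy10.;
run;
```

Without the “by” line, the data step would simply append all of “income2” after all of “income”, which gives the same result here only because the first file ends before the second begins.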

The log shows that the two data sets were stacked correctly. There are a total of 1012 observations in “both”: 632 from “income” plus 380 from “income2”.



This new data set is now ready for analysis.