SAS Programming: Computing Statistics Using “Proc Means”

The “proc means” procedure generates summary statistics for numeric data. Run with no options, it reports the count, mean, standard deviation, minimum, and maximum for every numeric variable in the diabetes data set.
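As a minimal sketch of the default call described above, assuming the diabetes data set has already been loaded into the WORK library under the name diabetes (the data set name and location are assumptions):

```sas
/* Assumes a data set named diabetes exists in the WORK library. */
/* With no statistic keywords, proc means prints N, mean,        */
/* standard deviation, minimum, and maximum for each numeric     */
/* variable.                                                     */
proc means data=diabetes;
run;
```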

Note: This data set can be downloaded by clicking on “Help” and selecting “Learning SAS Programming” from the SAS dropdown menu.

Listing statistic keywords on the proc means statement limits the output to those statistics. Specifying “min” and “max”, for instance, restricts the output to the minimum and maximum values for the data in the “diabetes” data set. Many other keywords are available, but we will limit this example to a simple minimum and maximum. Adding the “maxdec=” option as the last option on the statement also limits the number of decimal places displayed, which cleans up the output.
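A short sketch of such a call might look like the following. The data set name diabetes and the choice of two decimal places are assumptions made for illustration:

```sas
/* Assumes a data set named diabetes exists in the WORK library. */
/* min and max request only those two statistics;                */
/* maxdec=2 rounds the displayed values to two decimal places.   */
proc means data=diabetes min max maxdec=2;
run;
```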

There may be instances when one wants to view only specific variables. The “var” statement restricts the proc means output to the variables it lists.
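For example, the sketch below restricts the output to two variables. The names age and weight are hypothetical and should be replaced with numeric variables that actually exist in the data set:

```sas
/* age and weight are hypothetical variable names. */
proc means data=diabetes;
   var age weight;   /* only these variables are summarized */
run;
```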

In addition to listing variables individually, you can specify a numbered range of variables that share a common prefix, such as x1-x5.
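A numbered range could be written as in the sketch below. The variables x1 through x5 are hypothetical, and every variable in the range must exist in the data set for the step to run:

```sas
/* x1-x5 is shorthand for x1 x2 x3 x4 x5 (hypothetical names). */
proc means data=diabetes;
   var x1-x5;
run;
```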

There are other times when one needs statistics for a group of observations instead of the entire dataset.  This can be accomplished by using the “class” statement in the means procedure.
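As an illustration, the following sketch produces statistics for each level of a grouping variable. The names gender, age, and weight are hypothetical stand-ins for variables in the data set:

```sas
/* gender, age, and weight are hypothetical variable names. */
proc means data=diabetes;
   class gender;     /* one row of statistics per level of gender */
   var age weight;
run;
```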

Like the “class” statement, the “by” statement also groups statistics categorically. The main practical difference is that the “by” statement requires the data set to be sorted by the grouping variable, typically with the sort procedure, before the means procedure is run.
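The sort-then-summarize pattern can be sketched as follows. The variable names and the output data set name diabetes_sorted are hypothetical:

```sas
/* gender, age, and weight are hypothetical variable names. */
/* Sort first; out= writes the sorted copy to a new data    */
/* set so the original is left unchanged.                   */
proc sort data=diabetes out=diabetes_sorted;
   by gender;
run;

/* The by statement must name the same variable used to sort. */
proc means data=diabetes_sorted;
   by gender;
   var age weight;
run;
```

Writing the sorted data to a separate data set is a design choice rather than a requirement; omitting out= sorts the data set in place.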

The “class” statement is easier to use when categories contain only a few levels; the “by” statement has the advantage when categorical data with many levels are to be summarized.

One can also create summarized data sets with the means procedure by using the “output” statement with the “out=” option and specifying the statistics and the names of the summarized variables in the new data set.
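A minimal sketch of this, assuming hypothetical variables gender, age, and weight and a hypothetical output data set named diabetes_summary:

```sas
/* All variable and data set names except diabetes are hypothetical. */
/* noprint suppresses the printed report, since the results go to a  */
/* data set instead.                                                 */
proc means data=diabetes noprint;
   class gender;
   var age weight;
   /* Each statistic keyword is followed by one new name per         */
   /* variable on the var statement, in the same order.              */
   output out=diabetes_summary
          mean=mean_age mean_weight
          std=sd_age sd_weight;
run;

/* Inspect the summarized data set. */
proc print data=diabetes_summary;
run;
```

The output data set also contains the automatic variables _TYPE_ and _FREQ_, and when a class statement is used it includes an overall row in addition to one row per group.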