Calculations and manipulations can often be done within a single
observation, but sometimes it is necessary to calculate across observations.
The RETAIN statement keeps the value of a specified variable (assigned
by an INPUT or assignment statement) from the current iteration of the
DATA step to the next. Without RETAIN, SAS automatically resets such
variables to missing at the start of each iteration of the DATA step.
Retaining values across observations is useful for, for example,
computing a running total, counting the number of occurrences of a
variable's value, or setting indicators within a BY-group. RETAIN
statements are often used with FIRST. and LAST. processing.
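As a minimal sketch of the running-total case mentioned above, the following DATA step accumulates a sum across observations. The data set SALES, its variables DAY and AMOUNT, and the three data rows are hypothetical and exist only for illustration.

```sas
/* Hypothetical data: three daily amounts (illustrative values only) */
data sales;
  input day amount;
  datalines;
1 10
2 25
3 5
;
run;

data running;
  set sales;
  retain total 0;          /* TOTAL starts at 0 and is not reset to missing */
  total = total + amount;  /* add the current amount to the retained total */
run;

proc print data=running;
run;
```

With these three rows, TOTAL should be 10, 35 and 40. Without the RETAIN statement, TOTAL would be reset to missing at the start of each iteration, so the sum would be missing in every observation.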
The RETAIN statement can also specify initial values for variables
or for the elements of an array. The general form is:
RETAIN <varlist> [initial-value(s)];
Varlist: the names of the variables, variable lists or arrays whose
values you wish to retain.
Initial-value(s): numeric or character (e.g., 'y'). A single value
given without parentheses is assigned to all of the listed variables.
If no initial value is specified, the variables are initialized to missing.
Each of the following RETAIN statements lists four variables:
RETAIN var1-var4 1;          /* var1, var2, var3 and var4 all start at 1 */
RETAIN var1-var4 (1);        /* only var1 starts at 1; var2-var4 start as missing */
RETAIN var1-var4 (1 2 3 4);  /* var1=1, var2=2, var3=3, var4=4 */
RETAIN var1-var4 (1,2,3,4);  /* same as above, with commas between values */
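One way to check these initialization rules is to write the retained values to the log. The sketch below uses DATA _NULL_ so that no data set is created; the variable names a1-a4, b1-b4 and c1-c4 are hypothetical.

```sas
data _null_;
  retain a1-a4 1;          /* all four variables start at 1 */
  retain b1-b4 (1);        /* only b1 starts at 1; b2-b4 start as missing */
  retain c1-c4 (1 2 3 4);  /* c1=1, c2=2, c3=3, c4=4 */
  /* Write each group of variables to the SAS log */
  put a1= a2= a3= a4=;
  put b1= b2= b3= b4=;
  put c1= c2= c3= c4=;
run;
```

For the second statement, SAS may also write a message to the log noting that there are fewer initial values than variables.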
For example, the statement RETAIN pop 1; within a DATA step gives
the variable POP a value of 1 in every observation, provided POP is
not read from an input data set or reassigned later in the step.
/* Use the RETAIN statement to count the number of observations in
   each BY group. An index weight is identified and each
   subsequent weight is compared to the index. An example is
   given for how the output might look. */
PROC SORT data=htwt_long;
  by name age;
run;

data w_compare (keep=name age index_weight weight over count);
  set htwt_long;
  by name;
  retain count index_weight;
  if first.name then do;
    /* Set and retain the first weight */
    index_weight = weight;
    /* Counter for number of records in each BY group */
    count = 0;
  end;
  count = count + 1;
  /* Indicator variable for increased weight */
  over = (index_weight < weight);
run;

PROC PRINT data=w_compare;
  var name age weight over index_weight count;
  title 'Retain Statement';
run;
PRACTICE QUESTIONS ON
The following two questions assume that a permanent SAS data set
has been created from the sample clinical data, including the format
file. Examples are given for how the program, log and output might look.
1. Create a BY-group by primary DX.
2. Count the number of observations in each BY-group using the
RETAIN statement.
The following questions assume that a permanent SAS data set has
been created from the simulated Manitoba health data available at MCHP.
Examples are given for how the program, log and output might look.
3. Use the ARRAY statement to find all the records with a diabetes
diagnosis (code 250). Hint: the SUBSTR function is useful here.
VIc. Data Processing: By-Group Processing (First./Last.)