Program 2: Arithmetic operators and SUBSTR function
The following program illustrates two more ways of using assignment statements.
New variables can be created using arithmetic operators to perform calculations
on existing variables. Because calculations are involved, this type of
assignment statement can only be used on numeric variables.
New variables can be created using the SUBSTR, or substring, function
to truncate values of existing variables. A practical application of this
function is its use with ICD-9-CM diagnosis codes in Manitoba Health data, which
range from 3 to 5 digits in length. The 4- and 5-digit codes
can be collapsed to 3 digits. For example, acute myocardial infarction
is ICD-9-CM 410, which has a number of sub-categories such as 410.72
(subendocardial infarction, subsequent episode).
If labels are available only at the 3-digit level,
a new diagnosis variable can be created that keeps only the first 3 digits
of the diagnosis field.
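As a minimal sketch of this idea, the following step collapses diagnosis codes to their first 3 characters. The data set name claims, the variable names diag and diag3, and the three sample codes are hypothetical, chosen only for illustration. The sketch also assumes the diagnosis is stored as a character variable without the decimal point, which may not match the layout of a given file.

```sas
/* Hypothetical example: the data set "claims" and the variables  */
/* "diag" and "diag3" are illustrative names, not taken from      */
/* Manitoba Health files. Assumes diagnosis codes are stored as   */
/* character values without a decimal point.                      */
data claims;
   length diag $5;
   input diag $;
   /* Starting at the first character, keep 3 characters */
   diag3 = substr(diag,1,3);
   datalines;
410
4107
41072
;
run;

/* List original and collapsed codes side by side */
proc print data=claims;
   var diag diag3;
run;
```

Under these assumptions, all three sample records get the value 410 in diag3, so they can share a single 3-digit label.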
The log and output
are also available for the following program, which assumes that the htwt
data set has already been created.
*************************************************
* file = assign.sas *
* The SAS program in this file creates new *
* variables using assignment statements: *
* several arithmetic operators and the *
* SUBSTR function. *
*************************************************;
options linesize=min;
*----------------------------------------*
* Create a new temporary SAS data set, *
* same name, to add new variables *
*----------------------------------------*;
data htwt;
set htwt;
*------------------------------------------------*
* 1. Create new variables that incorporate *
* calculations made on the existing variables *
*------------------------------------------------*;
/* Add 2 to "height" */
ht2p = (height + 2);
/* Subtract 1 from "age" */
agem1 = (age - 1);
/* Convert inches to feet */
htfeet = (height/12);
/* Use the ROUND function to
round to 1 decimal place */
htround = round(htfeet,0.1);
/* Convert pounds to ounces*/
wtounce = (weight*16);
*------------------------------------------------*
* 2. Create new variables that truncate the *
* values of existing variables. *
*------------------------------------------------*;
/* Starting at the first character,
keep 2 characters */
name1 = substr(name,1,2);
/* Starting at the first character,
keep 3 characters */
name2 = substr(name,1,3);
/* Starting at the second character,
keep 3 characters */
name3 = substr(name,2,3);
run;
proc freq data=htwt;
tables height * ht2p * htfeet * htround /list missing;
tables age * agem1 /list missing;
tables weight * wtounce /list missing;
tables name * name1 * name2 * name3 /list missing;
title1 'The height/weight data set';
title2 'Check new variables against original variables';
run;
Note that some statements can be combined.
For example, the ROUND function can be used in the same
statement that calculates a new variable. The statement
htround = round((height/12),1); divides
height by 12 and rounds the result to the nearest whole
number in a single step. Using 0.1 as the second argument instead,
as in the program above, rounds the result to one decimal place.
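A short sketch of the combined form, assuming the htwt data set already exists with a numeric height variable measured in inches. The names htwt2, htwhole and httenth are hypothetical and are used only so that the original variables are left untouched.

```sas
/* Hypothetical names: htwt2, htwhole and httenth are used for    */
/* illustration only. Assumes htwt contains a numeric variable    */
/* "height" in inches.                                            */
data htwt2;
   set htwt;
   /* Convert inches to feet and round to the nearest whole foot */
   htwhole = round(height/12, 1);
   /* Convert inches to feet and round to the nearest tenth;
      this reproduces htround without creating htfeet first */
   httenth = round(height/12, 0.1);
run;
```

For a height of 62 inches, 62/12 is about 5.1667, so htwhole would be 5 and httenth would be 5.2. Combining the steps saves an intermediate variable, at the cost of not keeping the unrounded value for checking.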