The MCHP SAS MANUAL - Data Manipulation (Program 2 New Variables)

         

IV. DATA MANIPULATION: CREATE NEW VARIABLES


Program 2: Arithmetic operators and SUBSTR function

The following program illustrates two more ways of using assignment statements.

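New variables can be created using arithmetic operators to perform calculations on existing variables. Because calculations are involved, this type of assignment statement is meant for numeric variables. (If a character variable is used in a calculation, SAS attempts to convert its values to numeric and writes a note to the log; it is better not to rely on this.)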

New variables can be created using the SUBSTR, or substring, function to extract part of the value of an existing character variable, for example to truncate it. A practical application of this function is its use with ICD-9-CM diagnosis codes in Manitoba Health data, which can range from 3 to 5 digits in length. The 4- and 5-digit values are collapsible to 3 digits (e.g., acute myocardial infarction is ICD-9-CM 410, with a number of sub-categories, such as 410.72, which represents subendocardial infarction, subsequent episode). If labels are only available at the 3-digit level, a new diagnosis variable can be created that keeps only the first 3 digits of the diagnosis field.
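
As a minimal sketch of the ICD-9-CM case described above, the following program collapses diagnosis codes to 3 digits. The data set name (claims), the variable names (diag, diag3) and the three sample codes are hypothetical, chosen only for illustration; the sketch also assumes that the diagnosis is stored as a character variable without the decimal point.

```sas
/* Hypothetical example: "claims", "diag" and "diag3" are   */
/* illustrative names, not part of the Manitoba Health data */
data claims;
  length diag $5;              /* codes are 3 to 5 characters long */
  input diag $;
  diag3 = substr(diag, 1, 3);  /* keep the first 3 characters only */
  datalines;
410
4107
41072
;
run;

proc print data=claims;
  title1 'Diagnosis codes collapsed to 3 digits';
run;
```

All three sample records end up with diag3 equal to 410, so they can share a single 3-digit label.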

The log and output are also available for the following program, which assumes that the htwt data set has already been created.

*************************************************
* file = assign.sas                             *
* The SAS program in this file creates new      *
* variables using assignment statements:        *
* several arithmetic operators and the          *
* SUBSTR function.                              *
*************************************************;

options linesize=min;

*----------------------------------------*
* Create a new temporary SAS data set,   *
* same name, to add new variables        *
*----------------------------------------*;

data htwt;
  set htwt;

*------------------------------------------------*
* 1. Create new variables that incorporate       *
*    calculations made on the existing variables *
*------------------------------------------------*;

           /* Add 2 to "height" */
ht2p    = (height + 2);

           /* Subtract 1 from "age" */
agem1   = (age - 1);

           /* Convert inches to feet */
htfeet  = (height/12);

           /* Use the ROUND function to
              limit decimals to 1 */
htround = round(htfeet,0.1);

           /* Convert pounds to ounces */
wtounce = (weight*16);

*------------------------------------------------*
* 2. Create new variables that truncate the      *
*    values of existing variables.               *
*------------------------------------------------*;
                         /* Starting at the first character,
                            extract 2 characters */
name1 = substr(name,1,2);

                         /* Starting at the first character,
                            extract 3 characters */
name2 = substr(name,1,3);

                         /* Starting at the second character,
                            extract 3 characters */
name3 = substr(name,2,3);

run;

proc freq data=htwt;
  tables height * ht2p * htfeet * htround /list missing;
  tables age * agem1 /list missing;
  tables weight * wtounce /list missing;
  tables name * name1 * name2 * name3 /list missing;
title1 'The height/weight data set';
title2 'Check new variables against original variables';
run;

Note that some statements can be combined; for example, the ROUND function can be used in the same statement that calculates a new variable. The statement htround = round((height/12),1); divides height by 12 and rounds the result to the nearest whole number in a single step. To keep one decimal place, as in the program above, use 0.1 as the second argument: htround = round((height/12),0.1);
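
The effect of the second argument to ROUND can be checked with a short stand-alone step. This is only a sketch: the height of 67 inches and the variable names htwhole and httenth are made up for illustration, and the htwt data set is not needed.

```sas
/* Hypothetical values: a single height of 67 inches */
data _null_;
  height  = 67;
  htwhole = round(height/12, 1);    /* nearest whole number */
  httenth = round(height/12, 0.1);  /* nearest tenth        */
  put htwhole= httenth=;            /* write results to the log */
run;
```

Since 67/12 is about 5.58, the log should show htwhole=6 and httenth=5.6.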

Contact: Charles Burchill       Telephone: (204) 789-3429
Manitoba Centre for Health Policy
Department of Community Health Sciences, University of Manitoba
4th floor Brodie Centre
408 - 727 McDermot Avenue
Winnipeg, Manitoba R3E 3P5       Fax: (204) 789-3910
Last modified on Wednesday, 24-Aug-2005 13:30:47 CDT