The MCHP SAS MANUAL - Data Processing (BY-Group Processing - First./Last.)

         

VI. DATA PROCESSING: BY-GROUP PROCESSING (FIRST./LAST.)

Purpose

By-group processing refers to the use of a BY statement in a DATA step, which makes it possible to identify the first and last record within each BY group. When a BY statement is used with SET, MERGE, or UPDATE, SAS automatically creates two dichotomous (1/0) variables for each variable named in the BY statement: FIRST.varname and LAST.varname, where varname is the name of the BY variable. These variables support a range of calculations, such as counting the records for each unique identifier.

Syntax

BY varname1 varname2...;

For the first record in a BY group, FIRST.varname1 is set to 1; for all other records in the BY group it is set to 0. For the last record in a BY group, LAST.varname1 is set to 1; for all other records it is set to 0. If the data are sorted by more than one BY variable, FIRST.varname is set to 1 for a given variable at the first occurrence of a new value of that variable, and also whenever the value of any BY variable listed before it changes. FIRST. and LAST. variables are temporary: they are available only within the current DATA step and are not written to the output data set. To keep their values, assign them to ordinary variables (firstvar=FIRST.var;). These permanent variables are saved in the output data set and are available in subsequent PROC and DATA steps.
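The behaviour with two BY variables can be sketched as follows. This is a minimal illustration, not part of the original manual: the data set VISITS, the variable REGION, and all of the data values are hypothetical and exist only to show how the flags are set.

```sas
/* Hypothetical data: names and values are for illustration only. */
DATA visits;
     INPUT region $ phin;
     DATALINES;
N 1001
N 1001
N 1002
S 1002
S 1003
;
RUN;

PROC SORT data=visits;
     BY region phin;
RUN;

DATA flags;
     SET visits;
     BY region phin;
     first_region = FIRST.region;   /* 1 on the first record of each region        */
     first_phin   = FIRST.phin;     /* 1 when phin changes or when region changes  */
     last_phin    = LAST.phin;      /* 1 on the last record of each region/phin    */
RUN;

PROC PRINT data=flags;
RUN;
```

In the fourth record (region S, phin 1002) the PHIN value repeats from the previous record, but FIRST.phin is still 1 because REGION, which comes earlier in the BY statement, has changed.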

Example

PROC SORT data=hosp;
     BY phin;
RUN;

DATA dup;
     SET hosp;
     BY phin;                           (1)
     firstfl=FIRST.phin;                (2)
     lastfl=LAST.phin;                  (3)
RUN;

(1) Read the data by PHIN (the data set was sorted by this variable in the preceding PROC SORT) so that FIRST.PHIN and LAST.PHIN are created.

(2) Create a new variable called FIRSTFL that copies FIRST.PHIN: it is 1 on the first record for each PHIN and 0 otherwise.

(3) Create a new variable called LASTFL that copies LAST.PHIN: it is 1 on the last record for each PHIN and 0 otherwise.

Output:

OBS   PHIN     FIRSTFL   LASTFL
  1   562737      1         1
  2   563850      1         1
  3   563961      1         1
  4   565858      1         1
  5   566739      1         1
  6   568729      1         0
  7   568729      0         0
  8   568729      0         1
  9   569961      1         1
 10   660861      1         1

In the above example, the person with PHIN 568729 has 3 records (observations 6-8). For the first record (#6), FIRSTFL is set to 1, indicating that it is the first record for that person; the other records for that PHIN have FIRSTFL set to 0. For the third and last record (#8), LASTFL is set to 1, indicating that it is the last record for that person; the other records have LASTFL set to 0. Every other PHIN has a single record, so that record is both first and last and both flags are 1.
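The record count per unique identifier mentioned under Purpose can be sketched with the same HOSP data. This is one possible approach rather than a prescribed method; the names COUNTS, PERPHIN, and NREC are hypothetical.

```sas
/* Assumes hosp is already sorted by phin, as in the example above. */
DATA counts;
     SET hosp;
     BY phin;
     IF FIRST.phin THEN nrec = 0;   /* reset the counter at the start of each person  */
     nrec + 1;                      /* sum statement: nrec keeps its value across records */
     lastfl = LAST.phin;            /* save the flag so a later step can use it        */
RUN;

/* Keep one record per person, selected in a separate step. */
DATA perphin;
     SET counts;
     WHERE lastfl = 1;
     KEEP phin nrec;
RUN;
```

On the last record for each PHIN, NREC holds the total number of records for that person (3 for PHIN 568729 in the output above). The selection of one record per person is done in a second DATA step so that no records are excluded while FIRST. and LAST. are being used.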

Caution: When conducting BY-group processing, DO NOT exclude records (for example, with a subsetting IF or a DELETE statement) in the same DATA step; other data manipulation is fine. Exclusions made in a DATA step that uses FIRST. and LAST. can cause unexpected results, because the record that carries FIRST.=1 or LAST.=1 for a BY group may be among those removed. Do the exclusions in a subsequent DATA step instead. The one exception is a subsetting WHERE statement, which is applied to the incoming data set before any BY-group processing is carried out.
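The WHERE exception can be sketched as follows. LOS (length of stay) is a hypothetical variable assumed to exist on HOSP purely for illustration, and LONGSTAY is a hypothetical output data set name.

```sas
/* los is a hypothetical variable; hosp is assumed sorted by phin. */
DATA longstay;
     SET hosp;
     WHERE los > 5;          /* filters records as they are read in              */
     BY phin;
     firstfl = FIRST.phin;   /* first of the records that passed the WHERE filter */
     lastfl  = LAST.phin;    /* last of the records that passed the WHERE filter  */
RUN;
```

Because the WHERE statement is applied before the BY groups are formed, each PHIN that remains still has exactly one record with FIRSTFL=1 and one with LASTFL=1. Replacing the WHERE statement with a subsetting IF on the same condition would not give this guarantee, since the flags would already have been set on the full set of records.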

 

 

Contact: Charles Burchill       Telephone: (204) 789-3429
Manitoba Centre for Health Policy
Department of Community Health Sciences, University of Manitoba
4th floor Brodie Centre
408 - 727 McDermot Avenue
Winnipeg, Manitoba R3E 3P5       Fax: (204) 789-3910
Last modified on Wednesday, 14-Sep-2005 08:23:07 CDT