| Purpose

Arrays are often used in conjunction with DO loops when performing the same action on a series of variables. The following example illustrates the same action being performed on two separate diagnostic field variables. The study diagnosis of 820.0 can occur in either of these fields, and the statements are identical except for the name of the diagnostic field. The intent of the statements is to flag all occurrences of the study diagnosis by creating a new variable, HIPFRAC, where '1' indicates the presence of the desired diagnosis.

    If '82000'<=DX01<='82009' then HIPFRAC='1';
    If '82000'<=DX02<='82009' then HIPFRAC='1';

Sixteen diagnostic fields (DX01-DX16), however, would require 16 lines of code.

Array processing can make the program more efficient by streamlining the code required to accomplish the task (depending on the situation, a series of IF-THEN/ELSE statements can run faster; however, they are also more error-prone). A specified series of variables is associated with a collective name of your choice; for example, the diagnostic fields DX01 through DX16 could be associated with the name "DIAG", whose elements can then be used much like ordinary variables in DATA step manipulations.
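
As a minimal sketch of where this is heading, the IF statements above can be collapsed into a single statement inside a DO loop once the fields are grouped under the name DIAG. The data set names CLAIMS and FLAGGED, the ID variable, the loop index I, and the test values are all hypothetical and are not part of the original example.

```sas
data claims;
   /* Hypothetical test data: only DX01 and DX02 are filled in here. */
   length dx01-dx16 $5;
   input id dx01 $ dx02 $;
   datalines;
1 82003 4019
2 25000 82009
3 4280 25000
;
run;

data flagged;
   set claims;
   /* Group the 16 diagnostic fields under the collective name DIAG. */
   array diag{16} $ dx01-dx16;
   /* One IF statement, applied to each field in turn. */
   do i = 1 to 16;
      if '82000' <= diag{i} <= '82009' then hipfrac = '1';
   end;
   drop i;   /* the loop index is not needed in the output data set */
run;
```

With this test data, observations 1 and 2 should end up with HIPFRAC='1', while observation 3 keeps a blank HIPFRAC because none of its fields falls in the study range.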
Syntax

Arrays are set up using an ARRAY statement. It can appear anywhere in the DATA step as long as it occurs prior to any reference to the array. The variables that make up the array are called elements. Individual elements are identified by subscripts (numbers that identify an element's position in the array).

    ARRAY array-name {number-of-variables} variable-1 variable-2 ... variable-n;

Array-name is a name you choose to represent the group of variables (must be 32 characters or fewer, beginning with a letter or underscore).

Number-of-variables tells SAS how many variables are being grouped; it is written as a subscript enclosed in braces (SAS also accepts brackets or parentheses).

Variable-1 variable-2 ... variable-n lists the names of the variables, separated by spaces rather than commas (a numbered variable list does not have to begin at 1; e.g., DX5-DX16).

Example
    ARRAY diag{16} $ dx01-dx16;

This statement tells SAS to:
  - create a group, or array, named DIAG for the duration of the DATA step.
  - have DIAG represent 16 variables: diagnostic fields DX01 through DX16.

Note that DX01-DX16 are character variables, so the variable list is preceded by a "$". The "$" is required when the variables have not already been defined as character earlier in the DATA step.

You can refer to the entire array or just one of its elements when performing logical comparisons or arithmetic calculations. All variables listed in the ARRAY statement are assigned extra names with the form array-name{position}, where position is the position of the variable in the list (1, 2, 3, ..., 16 in the example). The additional name is called an array reference, and the position is often called the subscript. In the above ARRAY statement, DX01 is assigned the array reference DIAG{1}, DX02 the array reference DIAG{2}, and so on. From that point in the DATA step, you can refer to the variable by either its original name or its array reference; for example, the names DX01 and DIAG{1} are equivalent.

Caution: An array is simply a convenient way of temporarily identifying a group of variables; it exists only for the duration of the DATA step. Arrays are not variables.
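
The equivalence of the two names, and the temporary nature of the array, can be illustrated with a short sketch. The data set names DEMO and DEMO2 and the variables SAME and FIRST_DX are hypothetical, chosen only for this illustration.

```sas
data demo;
   length dx01-dx16 $5;
   array diag{16} $ dx01-dx16;
   dx01 = '82000';            /* assign through the original name */
   diag{2} = '4019';          /* assign through the array reference: sets DX02 */
   same = (diag{1} = dx01);   /* 1 (true): both names refer to the same variable */
   put dx01= dx02= same=;     /* write the values to the log */
run;

data demo2;
   set demo;
   /* DIAG was not stored in DEMO, so it has to be declared again here. */
   array diag{16} $ dx01-dx16;
   first_dx = diag{1};
run;
```

The variables DX01-DX16 are saved in DEMO, but the grouping DIAG is not; leaving out the second ARRAY statement would make the reference to DIAG{1} in DEMO2 fail.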
               
              
         |