Four SAS procedures are described here. Two of them, PROC CONTENTS and
PROC PRINT, are frequently used to take a first look at the data. The other
two, PROC FORMAT and PROC SORT, can be used with them to enhance the output:
the former labels or groups data values, and the latter changes the order in
which the records are sorted. Except for PROC CONTENTS, all examples assume
that a temporary SAS data set named htwt has been created from the
height/weight data.
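For readers who do not have the height/weight data at hand, the following is a minimal sketch of a stand-in data set. The variable names name, sex, age, and weight are taken from the examples on this page; the height variable and every data value are assumptions invented purely for illustration and are not the original data.

```sas
/* Hypothetical stand-in for the height/weight data.
   All values below are made up for illustration only. */
data htwt;
  length name $ 10 sex $ 1;
  input name $ sex $ age height weight;
  datalines;
Carla   F 31 64.0 128.0
Devon   M 45 70.5 182.0
Ingrid  F 27 66.2 135.5
Marcus  M 52 68.0 176.0
Priya   F 38 62.5 119.0
Tomas   M 29 72.0 190.5
;
run;
```

With only six records, the (obs=10) option used in later examples simply prints every record.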
1. PROC CONTENTS
 PROC CONTENTS can be used to obtain general information about a SAS
        data set, including an alphabetic list of variables and their attributes
        (e.g. type, length). Details are also provided regarding the data set 
        itself, such as number of observations and number of variables, and
        whether the data set was sorted by any variable(s) or compressed. 
 
 
| 
*****************************************************
*This program was used on the simulated Manitoba    *
*Health data, both for Version 1 and Version 2      *
*(the latter showing the output with labels added to*
*both variables and values)                         *
****************************************************;
proc contents data=test;
run;
 |  
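Because the default variable list is alphabetic, it can be hard to match against the layout of the raw data. As a small sketch, the VARNUM option of PROC CONTENTS lists the variables in the order in which they are stored in the data set instead. The data set name htwt is assumed here; any existing SAS data set can be substituted.

```sas
/* List variables by their position in the data set
   rather than alphabetically (assumes htwt exists) */
proc contents data=htwt varnum;
run;
```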
2. PROC PRINT
 
PROC PRINT can be used to display the values for any of the variables
and for any number of observations in the SAS data set.  Five examples
of PROC PRINT, using the height/weight data set, are shown here; the
last three are combined with PROC SORT.
 
 
| 
Example 1:  PROC PRINT
*****************************************************
*This program creates a listing                     *
*of all the values and all the variables.           *
*****************************************************;
 
proc print data=htwt;      /* Begin the PROC step */
                        /* Add 2 titles */
  title1 'PROC PRINT: Example 1';
  title2 'No keywords specified except for TITLE';
run;             /* End the PROC step */
 |  
 
| 
Example 2: PROC PRINT
*****************************************************
*This program produces output that illustrates      *
*the use of a number of optional keywords and       *
*statements that can be used with PROC PRINT.       *
*****************************************************; 
/* Display the first 10 records (this requires the data= 
   option).  The LABEL keyword is necessary for the LABEL
   statement below */
proc print data=htwt (obs=10) label;
/* Instead of numbering the records sequentially,
   identify them by the values of the name variable */
  id name;   
/* List the variables to print (sex and age); weight is
   also printed because it is named in the SUM statement */
  var sex age;
 
/* Add up the values for the weight variable */
  sum weight;
 
/* Add labels for 4 variables */
  label name   = 'Name of student'
        weight = 'Weight in pounds'
        sex    = 'Gender of student'
        age    = 'Age of student';
 /* Instead of displaying sex with values of M and F
    use the format $sexl (previously created) and the 
    format statement to label them as Male and Female */
  format sex $sexl.;
     /* Add 2 titles */
  title1 'PROC PRINT:  Example 2';
  /* Keep the quoted title on one line so that no extra blanks are embedded in it */
  title2 'Use of OBS=, LABEL, ID, VAR, SUM, and FORMAT keywords';
run;
 |          
3. PROC SORT
 
PROC SORT is used to sort a data set on specified variables.  PROC PRINT is
used here to illustrate the results of different ways of using PROC SORT (PROC
SORT by itself does not produce any output in the Output window).  
It is important to note that the sort sequence (i.e., whether numbers or
alphabetic characters are sorted first) and how missing values are dealt with
can vary with the operating system.  In PC SAS, numeric characters are ordered
before alphabetic characters.
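The effect of the sort sequence can be checked directly with a small test data set. The sketch below uses hypothetical data and data set names (sortdemo, by_code, by_score). On systems that use the ASCII collating sequence, such as PC SAS, digits are expected to come before uppercase letters, which in turn come before lowercase letters; SAS treats a missing numeric value as smaller than any number, so it sorts first.

```sas
/* Hypothetical data: character codes mixing digits and letters,
   plus a numeric score with one missing value */
data sortdemo;
  input code $ score;
  datalines;
b 3
A .
2 1
a 2
;
run;

proc sort data=sortdemo out=by_code;
  by code;      /* character sort follows the collating sequence */
run;

proc sort data=sortdemo out=by_score;
  by score;     /* the missing score sorts before all numbers */
run;

proc print data=by_code;
  title1 'Sorted by CODE';
run;

proc print data=by_score;
  title1 'Sorted by SCORE';
run;
```

Running the same program on a system with a different collating sequence (e.g., EBCDIC on a mainframe) is a quick way to see whether the order of the character values changes.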
 
 
| 
Example 3: PROC PRINT AND PROC SORT
******************************************************
*This program sorts the data by name and creates a   *
*listing of the values of 3 variables (name being    *
*placed in the first column) for the first 10        *
*records. The resulting output is displayed          *
*in alphabetical order of name.                      *
******************************************************;
proc sort data=htwt;
  by name; 
run;
proc print data=htwt (obs=10);  
  id name; 
  var sex age; 
  title1 'PROC PRINT: Example 3';
  title2 'Where the data set is sorted by name';
run;             
 |  
 
| 
Example 4: PROC PRINT AND PROC SORT
******************************************************
*This program sorts the data in reverse order of name*
*and creates a listing of the values of 3 variables  *
*(name being placed in the first column) for the     *
*first 10 records.  This output is displayed         *
*in reverse alphabetical order of name.              *
******************************************************; 
proc sort data=htwt;
  by descending name; 
run;
proc print data=htwt (obs=10);  
  id name; 
  var sex age; 
  title1 'PROC PRINT: Example 4';
  title2 'Where the data set is sorted by DESCENDING name';
run; 
 |  
| 
Example 5: PROC PRINT AND PROC SORT
******************************************************
*This program creates another data set called "other"*
*which is sorted by sex and, for each value of sex, *
*is sorted by age.  The PROC PRINT step is identical*
*to Example 4 except the newly created data set is  *
*specified to produce output instead of             *
*the "htwt" data set.                               *
*****************************************************;  
proc sort data=htwt out=other;
  by sex age; 
run;
proc print data=other (obs=10);  
  id name; 
  var sex age; 
  title1 'PROC PRINT: Example 5';
  title2 'Where the data set is sorted by sex and age';
run; 
 |  
4. PROC FORMAT 
PROC FORMAT is an extremely useful SAS procedure for creating formats that
can be used to label data values or to group them. The PROC FORMAT step is
usually placed before a DATA step (although it can be run separately,
creating formats that can be used at any time during the SAS session). A
separate VALUE statement is required for each format; multiple VALUE
statements can be specified under one PROC FORMAT statement. A data set is
not specified when using PROC FORMAT. PROC FORMAT does not change,
manipulate, or do any calculations on the data; it simply creates formats
that can be used in PROC or DATA steps after PROC FORMAT has run.

Format names are assigned by the user; they must be no longer than 32
characters and cannot end in a number (in older versions of SAS, format
names could be only 8 characters long). Formats that will be used with
character variables MUST start with "$" (the "$" counts as one of the 32
allowed characters). The format name can also be used to distinguish
grouping formats (e.g., ending in "F" or "G") from labeling formats (e.g.,
ending in "L"). Another useful convention is to repeat the original value in
the new label being created (e.g., 'A' = 'A.Winnipeg' instead of
'A' = 'Winnipeg'). The output then displays not only the label for the value
but the original value as well.

             
Once PROC FORMAT is submitted, only the log indicates
that the program has executed; it should show the names of the
formats that have been created.  The log will add an additional note
indicating that the format "is already on the library" if the format
already exists (e.g., was previously submitted), and indicating that 
the previously existing format has been overwritten. This is not a problem
unless the user wishes to keep the pre-existing format as well - in that
case, the new format should be given a new name before submission (and before
SAS overwrites the pre-existing format).
No output is produced in the Output window when submitting PROC FORMAT. The
formats, however, are now available for use at any time during the current
SAS session. They can be used for labeling values (using the FORMAT
statement) or for creating new variables by grouping values (e.g., using the
PUT function in a DATA step).
             
 
| 
*****************************************************
*This program creates several formats.              *
*All values on the left side of "=" refer to values *
*that must already exist in the data set.  All      *
*values on the right side are created by the user.  *
*The keywords LOW, HIGH, and OTHER are illustrated. *
*****************************************************;
 
proc format;     
/*1.Create format to be used to label CHARACTER values*/
 /* Create $SEXL format (need $ and quotes)*/
  value $sexl  
         'M' = 'M.Male'
         'F' = 'F.Female';
/*2.Create format to be used to label NUMERIC values */
  value sexl
           1 = '1.Male'
           2 = '2.Female';
 
/*3.Create format to be used to group CHARACTER values */
               /* Group values of A and B into value 1*/
  value $regionf 'A','B' = '1' 
               /* Group values C to E into value 2 */
                 'C'-'E' = '2' 
               /* Group all other values into value 3 */
                 Other   = '3' ;     
/*4.Create format to be used to group NUMERIC values */ 
  value agef                
  /*Note that in a numeric format LOW does not include
       missing values, so they are not grouped into the
       <30 category. Add a separate range for . (or use
       OTHER) if missing values need their own label. */
 
            low-29 = '1'
             30-39  = '2'
             40-49  = '3'
           50-high  = '4';
run;
 |  
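Once the formats have been created, they can be applied in later steps. The following is a minimal sketch, assuming the formats $sexl and agef from the program above exist in the current session and that the htwt data set contains a character variable sex (with values M and F) and a numeric variable age. The data set name htwt2 and the variable name agegrp are hypothetical.

```sas
/* Assumes $SEXL. and AGEF. were created earlier in this session */
data htwt2;                    /* hypothetical output data set name */
  set htwt;
  length agegrp $ 1;
  /* The PUT function returns the formatted value as a character
     string, so AGEGRP holds the group code defined by AGEF. */
  agegrp = put(age, agef.);
run;

proc print data=htwt2 (obs=10);
  var name sex age agegrp;
  /* The FORMAT statement changes only how SEX is displayed;
     the stored values remain M and F */
  format sex $sexl.;
  title1 'Using formats created by PROC FORMAT';
run;
```

The two uses differ in permanence: the FORMAT statement only affects the display in that step, whereas the PUT function stores the grouped value in a new variable that can be used in later analyses.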
VIEW THE DATA: PRACTICE EXERCISES
These questions assume that a permanent SAS data set has been created from
the sample clinical data. Examples are given of how the program, log, and
output might look.
             
  1. Generate a list of variables and their attributes.

  2. Generate the following listings of variable values:

       a. All variables for all observations in the data, displaying
          their original values.
       b. The first 5 observations, printing values for the following
          3 variables: gender, diastolic blood pressure, and systolic
          blood pressure. Display labels for the variable names in the
          output, and add value labels for the gender variable.
       c. Re-run the same program on all observations, except this
          time display the data for the 3 variables sorted by gender.
          (2 procedures required.) If the original sort order of the
          clinical data is to be kept, the user has the option of
          creating an output data set, sorted by gender, with a
          different name.
       d. Re-run the same program, except this time sort the data
          by both gender and systolic blood pressure, and display gender
          in the first column (rather than having the observation number
          showing). (2 procedures required.)

  3. Change how output is displayed for the gender variable and display a
     listing for only this variable. Instead of displaying Male and Female,
     have the values read Male adult and Female adult. (2 procedures
     required.)
      
  
 
 