The MCHP SAS MANUAL - Explore the Data (numeric)

III. EXPLORE THE DATA: STATISTICS FOR NUMERIC DATA

Certain SAS procedures can only be performed on numeric data. Two such procedures, PROC MEANS and PROC UNIVARIATE, are illustrated here using the height/weight SAS data set. (Note that PROC SUMMARY computes the same statistics as PROC MEANS but prints no output unless the PRINT option is specified.)
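As a minimal sketch of how PROC SUMMARY relates to PROC MEANS, the two steps below request the default statistics for the same variable. The data set `htwt_demo` and its values are invented here for illustration and are not the manual's height/weight data; only the variable names `sex` and `age` are taken from the examples in this section.

```sas
/* Hypothetical stand-in for the height/weight data set. */
/* The values below are invented for illustration only.  */
data htwt_demo;
   input sex $ age;
   datalines;
M 41
M 42
M 32
F 30
F 33
F 26
;
run;

/* PROC MEANS prints its results by default */
proc means data=htwt_demo;
   var age;
run;

/* PROC SUMMARY prints results only when PRINT is specified */
proc summary data=htwt_demo print;
   var age;
run;
```

PROC SUMMARY is typically used with an OUTPUT statement when the statistics are needed in a data set rather than in the printed output.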

1. PROC MEANS

PROC MEANS: Example 1

```
*****************************************************
*This program creates output (Example 1)            *
*using the default setting of PROC MEANS.           *
*****************************************************;
proc means data=htwt;              /* Begin the PROC step */
   /* Add 2 titles */
   title1 'PROC MEANS: Example 1';
   title2 'No keywords specified';
run;                               /* End the PROC step */
```

PROC MEANS: Example 2

```
*****************************************************
*This program specifies a series of keywords and    *
*optional statements to create output (Example 2)   *
*using PROC MEANS. The CLASS statement avoids having*
*to sort the data first, but the CLASS statement is *
*more suited to smaller data sets or when just a few*
*CLASS variables are to be used.                    *
*****************************************************;

/*Some of the keywords available with PROC MEANS:
     N        - number of nonmissing values
     MEAN     - mean value
     MIN      - minimum value
     MAX      - maximum value
     SUM      - total of values
     NMISS    - number of missing values
     MAXDEC=n - set maximum number of decimal places */

proc means data=htwt n mean min max sum nmiss maxdec=1;
   /*Apply analysis only to "age" variable*/
   var age;
   /*Separate the analysis by values of sex*/
   class sex;
   /* Add 3 titles */
   title1 'PROC MEANS: Example 2';
   title2 'Use of VAR, CLASS, and TITLE statements';
   title3 'CLASSED by gender';
run;
```

PROC MEANS: Example 3

```
*****************************************************
*This program generates output (Example 3)          *
*similar to Example 2 but displays the output       *
*slightly differently and also creates another SAS  *
*data set. Additional resources are used because    *
*the data must be sorted first.                     *
*****************************************************;

/* Sort the data first because a BY statement
   is being used in the next PROC step */
proc sort data=htwt;
   by sex;                 /*Sort by sex */
run;

proc means data=htwt n mean min max sum nmiss maxdec=1;
   var age;
   /*Separate the output by sex*/
   by sex;
   /*Create a temporary SAS data set containing
     the information generated by PROC MEANS */
   output out=agedata;
   /* Add 3 titles */
   title1 'PROC MEANS: Example 3';
   title2 'Use of VAR, BY and OUTPUT statements';
   title3 'SORTED by gender';
run;

/*Display values of the new data set*/
proc print data=agedata;
   /* Add a 4th title*/
   title4 'A print of the OUTPUT data set';
run;

/*Remove Titles 2-4 from the next set of output*/
title2;
title3;
title4;
```
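When the OUTPUT statement names no statistics, as in Example 3, the output data set holds the default statistics in rows identified by the `_STAT_` variable. One alternative, sketched here under assumptions, is to name the statistics in the OUTPUT statement so that each group gets a single row, and to use CLASS with the NWAY option so that no prior sort is needed. The data set `htwt_demo`, its values, and the output variable names (`n_age`, `mean_age`, and so on) are invented for illustration.

```sas
/* Hypothetical stand-in for the height/weight data set. */
/* The values below are invented for illustration only.  */
data htwt_demo;
   input sex $ age;
   datalines;
M 41
M 42
M 32
F 30
F 33
F 26
;
run;

/* NOPRINT suppresses the printed report.                  */
/* NWAY keeps only the rows for each value of sex and      */
/* drops the overall (all observations) summary row.       */
proc means data=htwt_demo noprint nway;
   class sex;
   var age;
   /* Name each statistic and the variable that stores it */
   output out=agestats n=n_age mean=mean_age
                       min=min_age max=max_age;
run;

/* One row per value of sex, one column per statistic */
proc print data=agestats;
run;
```

The output data set also contains the automatic variables `_TYPE_` and `_FREQ_`; they can be removed with a DROP= data set option if they are not wanted.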

2. PROC UNIVARIATE

PROC UNIVARIATE provides additional statistics, some of which PROC MEANS does not report by default (e.g. the median and the mode).

PROC UNIVARIATE: Example

```
******************************************************
*This program uses PROC UNIVARIATE to create         *
*detailed output of numeric statistics               *
*(Univariate example) on the "age" variable.         *
******************************************************;
proc univariate data=htwt;
   var age;          /*Apply analysis only to "age" variable*/
   /* Add a title */
   title1 'PROC UNIVARIATE example';
run;
```
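PROC UNIVARIATE prints several tables for each variable. If only the mean, median, and mode are of interest, one option is to restrict the printed output with an ODS SELECT statement, as in the sketch below. It assumes ODS output is in use; the data set `htwt_demo` and its values are invented for illustration.

```sas
/* Hypothetical stand-in for the height/weight data set. */
/* The values below are invented for illustration only.  */
data htwt_demo;
   input sex $ age;
   datalines;
M 41
M 42
M 32
F 30
F 33
F 30
;
run;

/* Print only the "Basic Statistical Measures" table,   */
/* which reports the mean, median, and mode along with  */
/* measures of variability.                             */
/* The selection applies to the next PROC step only.    */
ods select BasicMeasures;
proc univariate data=htwt_demo;
   var age;
   title1 'PROC UNIVARIATE: basic measures only';
run;
```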

The following exercises assume that a permanent SAS data set has been created from the sample clinical data. The format file does not need to be included for this section. Examples are given for how the program, log, and output might look.

1. Generate numeric statistics using the default setting for PROC MEANS.

2. Obtain the mean values for heart rate and for systolic and diastolic blood pressure, limiting the output to 2 decimal places and reporting the number of missing values for each variable.

3. Repeat Question 2, this time obtaining the mean values for the 3 variables for each gender and by whether or not the patient is pregnant. Save these values to a separate data set, and display a listing of these values. (3 procedures)

4. Obtain mean, median, and mode values for systolic and diastolic blood pressure.


Contact: Charles Burchill       Telephone: (204) 789-3429
Manitoba Centre for Health Policy
Department of Community Health Sciences, University of Manitoba
4th floor Brodie Centre
408 - 727 McDermot Avenue
Winnipeg, Manitoba R3E 3P5       Fax: (204) 789-3910