III. EXPLORE THE DATA: STATISTICS FOR NUMERIC DATA
Certain SAS procedures can only be performed on numeric data. Two such procedures
- PROC MEANS and PROC UNIVARIATE - are illustrated here using the height/weight
SAS data set. (Note that PROC SUMMARY computes the same statistics as PROC MEANS
but prints no output unless the PRINT option is specified.)
1. PROC MEANS
PROC MEANS: Example 1
*****************************************************
*This program creates output (Example 1) *
*using the default setting of PROC MEANS. *
*****************************************************;
proc means data=htwt;          /* Begin the PROC step */
   /* Add 2 titles */
   title1 'PROC MEANS: Example 1';
   title2 'No keywords specified';
run;                           /* End the PROC step */
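To illustrate how PROC SUMMARY relates to PROC MEANS, the sketch below runs both procedures on a tiny made-up data set. The data set name htwt_demo and all of its values are invented for this illustration and are not the course's height/weight data. Because PROC SUMMARY prints nothing by default, the PRINT option is added so that its output can be compared with that of PROC MEANS.

```sas
/* Hypothetical demo data: the name htwt_demo and all
   values are invented for illustration only */
data htwt_demo;
   input name $ sex $ age height weight;
   datalines;
Ann F 13 56.5 84
Bob M 14 64.2 90
Cal M 12 57.3 83
Dee F 14 62.8 102.5
;
run;

/* PROC MEANS prints its statistics by default */
proc means data=htwt_demo;
   var age;
   title1 'PROC MEANS on demo data';
run;

/* PROC SUMMARY needs the PRINT option to display output */
proc summary data=htwt_demo print;
   var age;
   title1 'PROC SUMMARY with the PRINT option';
run;

/* Clear the title so it does not carry over */
title1;
```

A separate data set name is used so that the sketch does not overwrite an existing htwt data set in the WORK library.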
PROC MEANS: Example 2
*****************************************************
*This program specifies a series of keywords and *
*optional statements to create output (Example 2) *
*using PROC MEANS. The CLASS statement avoids having*
*to sort the data first, but the CLASS statement is *
*more suited to smaller data sets or when just a few*
*CLASS variables are to be used. *
*****************************************************;
/* Some of the keywords and options available with PROC MEANS:
      N        - number of nonmissing values
      MEAN     - mean value
      MIN      - minimum value
      MAX      - maximum value
      SUM      - total of values
      NMISS    - number of missing values
      MAXDEC=n - set maximum number of
                 decimal places */
proc means data=htwt n
           mean min max sum nmiss maxdec=1;
   /* Apply analysis only to "age" variable */
   var age;
   /* Separate the analysis by values of sex */
   class sex;
   /* Add 3 titles */
   title1 'PROC MEANS: Example 2';
   title2 'Use of VAR, CLASS, and TITLE statements';
   title3 'CLASSED by gender';
run;
PROC MEANS: Example 3
*****************************************************
*This program generates output (Example 3) *
*similar to Example 2 but displays the output *
*slightly differently and also creates another SAS *
*data set. Additional resources are used because *
*the data must be sorted first. *
*****************************************************;
/* Sort the data by sex first because a BY statement
   is being used in the next PROC step */
proc sort data=htwt;
   by sex;
run;

proc means data=htwt n
           mean min max sum nmiss maxdec=1;
   /* Apply analysis only to "age" variable */
   var age;
   /* Separate the output by sex */
   by sex;
   /* Create a temporary SAS data set containing
      the information generated by PROC MEANS */
   output out=agedata;
   /* Add 3 titles */
   title1 'PROC MEANS: Example 3';
   title2 'Use of VAR, BY and OUTPUT statements';
   title3 'SORTED by gender';
run;

/* Display values of the new data set */
proc print data=agedata;
   /* Add a 4th title */
   title4 'A print of the OUTPUT data set';
run;

/* Remove Titles 2-4 from the next set of output */
title2;
title3;
title4;
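The OUTPUT statement in Example 3 names no statistics, so the output data set holds the default statistics (N, MIN, MAX, MEAN, STD) stacked in rows that are identified by the _STAT_ variable. One possible variation, sketched below on the assumption that a single row per value of sex is wanted, names the statistics explicitly. The data set names htwt_demo and agestats, the variable names n_age, mean_age, min_age and max_age, and the data values are all invented for this illustration.

```sas
/* Hypothetical demo data: names and values are invented */
data htwt_demo;
   input name $ sex $ age height weight;
   datalines;
Ann F 13 56.5 84
Bob M 14 64.2 90
Cal M 12 57.3 83
Dee F 14 62.8 102.5
;
run;

/* A BY statement requires sorted data */
proc sort data=htwt_demo;
   by sex;
run;

/* NOPRINT suppresses the printed report; only the
   output data set is created */
proc means data=htwt_demo noprint;
   var age;
   by sex;
   /* Name each statistic and the variable that stores it */
   output out=agestats n=n_age mean=mean_age
                       min=min_age max=max_age;
run;

/* One observation per value of sex */
proc print data=agestats;
   title1 'Named statistics in the OUTPUT data set';
run;

title1;
```

The printed data set also contains the automatic variables _TYPE_ and _FREQ_, which PROC MEANS adds to every output data set.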
2. PROC UNIVARIATE
PROC UNIVARIATE provides additional statistics, some of which
are not available from PROC MEANS (e.g. tests for normality and
the extreme observations). Its default output also reports the mode.
PROC UNIVARIATE: Example
******************************************************
*This program uses PROC UNIVARIATE to create *
*detailed output of numeric statistics *
*(Univariate example) on the "age" variable. *
******************************************************;
proc univariate data=htwt;
   var age;   /* Apply analysis only to "age" variable */
   /* Add a title */
   title1 'PROC UNIVARIATE example';
run;
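When only a few of the PROC UNIVARIATE statistics are needed, they can be saved to a data set with an OUTPUT statement instead of being read from the full printed report. The sketch below is one way to do this for the mean, median, and mode. The data set names htwt_demo and agesummary, the variable names mean_age, median_age and mode_age, and the data values are invented for this illustration.

```sas
/* Hypothetical demo data: names and values are invented */
data htwt_demo;
   input name $ sex $ age height weight;
   datalines;
Ann F 13 56.5 84
Bob M 14 64.2 90
Cal M 12 57.3 83
Dee F 14 62.8 102.5
;
run;

/* NOPRINT suppresses the detailed printed report */
proc univariate data=htwt_demo noprint;
   var age;
   /* Save three selected statistics to a data set */
   output out=agesummary mean=mean_age
                         median=median_age
                         mode=mode_age;
run;

/* Display the saved statistics */
proc print data=agesummary;
   title1 'Mean, median, and mode of age';
run;

title1;
```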
EXPLORE THE DATA - PRACTICE QUESTIONS (numeric)
These questions assume that a permanent SAS data set has been created from
the sample clinical data. The format file does not need to be included for
this section. Examples are given of how the program, log, and output might look.
- Generate numeric statistics using the default setting for PROC MEANS.
- Obtain the mean values for heart rate and systolic and diastolic blood
pressure, limiting the decimal places to 2, and indicating how many missing
values there may be.
- Repeat the previous question, this time obtaining the mean values of the 3
variables grouped by gender and by whether or not the patient is pregnant.
Save these values to a separate data set, and display a listing of these
values. (3 procedures)
- Obtain mean, median, and mode values for systolic and
diastolic blood pressure.
Contact: Charles Burchill
Telephone: (204) 789-3429
Manitoba Centre
for Health Policy
Department of Community Health Sciences,
University of Manitoba
4th floor Brodie Centre
408 - 727 McDermot Avenue
Winnipeg, Manitoba
R3E 3P5
Fax: (204) 789-3910