SIMPLE STATISTICAL PROCEDURES AND GRAPHS
CONTINUOUS VARIABLES: PROC MEANS
PROC MEANS DATA=your.data maxdec=2;
VAR cont_var;
CLASS cat_var;
RUN;
- Statements and options for PROC MEANS
- VAR – selects the analysis variables
- CLASS – groups observations by a categorical variable
- MAXDEC= – limits the number of decimal places displayed
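A minimal sketch of the statements above, assuming the SASHELP.CLASS sample data set that ships with SAS is available (it is not part of the original slides). The statistic keywords on the PROC MEANS line are optional; without them SAS prints its default set.

```sas
/* Sketch: summary statistics for height and weight, by sex.   */
/* SASHELP.CLASS is an assumed stand-in for your.data.         */
proc means data=sashelp.class maxdec=2 n mean std min max;
  var height weight;   /* continuous analysis variables */
  class sex;           /* categorical grouping variable */
run;
```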
CATEGORICAL VARIABLES: PROC FREQ
PROC FREQ DATA=your.data;
TABLES cat_var1 cat_var1*cat_var2 / MEASURES;
RUN;
- Frequency counts and crosstabs
- Options for PROC FREQ
- TABLES to select variables
- TABLES x * y to do cross tabs
- / MEASURES on a 2x2 table gives you the odds ratio along with its confidence limits
- / NOCUM – suppresses cumulative frequencies and percentages in one-way tables
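The options above can be tried on a small 2x2 table. The sketch below builds a hypothetical data set (the name exposure, its variables, and the cell counts are invented for illustration) and uses the WEIGHT statement so that each row stands for a cell count rather than one observation.

```sas
/* Hypothetical 2x2 cell counts, invented for illustration only */
data exposure;
  input exposed $ disease $ count;
  datalines;
yes yes 20
yes no 80
no yes 10
no no 90
;
run;

proc freq data=exposure;
  weight count;                       /* each row represents COUNT observations   */
  tables exposed / nocum;             /* one-way table, no cumulative columns     */
  tables exposed*disease / measures;  /* crosstab with odds ratio and its limits  */
run;
```

The direction of the odds ratio depends on the row and column order of the table, so check which level SAS lists first before interpreting it.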
CATEGORICAL AND NUMERIC: PROC TABULATE
TABLE statement for PROC TABULATE
- Has its own unique syntax and operators
- Comma – go to new table dimension
- Blank – concatenate table information
- Asterisk – cross, nest, subgroup information
Using a space in PROC TABULATE to calculate counts and totals
title 'counts by location';
proc tabulate data=your.data;
class location;
table location all;
run;
- space between “location” and “all” appends results
- Keyword ‘all’ returns the total
- Only specifying a class variable returns a count
Using a comma in PROC TABULATE to create a new dimension
title1 'counts by location';
title2 'by job';
proc tabulate data=your.data;
class location jobcode;
table jobcode all , location all;
run;
- comma between jobcode and location creates new dimension
- ‘all’ preceded by space creates totals
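The space and comma operators can be tried together on the SASHELP.CLASS sample data set (an assumed stand-in for your.data, with age and sex playing the roles of jobcode and location). This is only a sketch of the same pattern.

```sas
title 'counts by age and sex';
proc tabulate data=sashelp.class;
  class age sex;
  /* rows: each age plus a total row; columns: each sex plus a total column */
  table age all, sex all;
run;
title;  /* clear the title so it does not carry over to later output */
```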
PROC TABULATE with a subsetting WHERE Statement
proc tabulate data=your.data;
where location in ('zipcode1', 'zipcode2');
class location dx;
table dx all , location all;
run;
- returns a table of locations by diagnoses with total counts in margins
The VAR statement in PROC TABULATE
proc tabulate data=your.data;
where location in ('zipcode1', 'zipcode2');
class location jobcode;
var income;
table jobcode, location*income;
run;
- VAR specifies a continuous variable
- here, rather than returning counts (all), the table returns statistics on the income variable for each combination of jobcode and location
- default statistic is the sum
- if you want a different statistic in the cell, follow the VAR analysis variable with an asterisk and the desired statistic
- e.g. table jobcode, location*income*mean;
- available statistics include median, min, max, std (standard deviation)
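As a sketch of the asterisk operator with an analysis variable, the example below again assumes the SASHELP.CLASS sample data set, with height standing in for income. The second TABLE statement shows one way to request several statistics at once by grouping them in parentheses.

```sas
proc tabulate data=sashelp.class;
  class age sex;
  var height;                           /* continuous analysis variable           */
  table age, sex*height;                /* default statistic: sum                 */
  table age, sex*height*mean;           /* mean height in each age-by-sex cell    */
  table age, sex*height*(n mean std);   /* several statistics side by side        */
run;
```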
SAS GRAPHICS
- flexibility to produce charts in jpeg, gif, etc.
- Save your graphics as .emf files for Word documents
- SAS shows charts in a separate graph window
- graph procedures stay active after RUN; so you need to quit out of them
- running another proc will exit you from gchart, or you can submit quit;
Histograms with PROC GCHART
proc gchart data=your.data;
vbar jobcode / sumvar=salary type=mean;
run;
- VBAR to get vertical bars, HBAR to get horizontal bars
- PIE – request pie chart
- FILL= x for cross-hatch
- EXPLODE= to pull a slice away from the pie
- TYPE= – specifies the statistic that the height or length of the bar represents (e.g. FREQ, SUM, MEAN)
- SUMVAR= – names the numeric variable to sum or average for each bar
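A short sketch of these options, assuming SAS/GRAPH is licensed and using the SASHELP.CLASS sample data set in place of your.data (sex and height stand in for jobcode and salary). The exploded slice value 'F' is simply one of the values of sex in that data set.

```sas
proc gchart data=sashelp.class;
  /* one vertical bar per sex; bar height is the mean of height */
  vbar sex / sumvar=height type=mean;
  /* pie of counts by sex, cross-hatched, with the 'F' slice pulled away */
  pie sex / fill=x explode='F';
run;
quit;  /* end the procedure explicitly */
```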
Plot Points with PROC GPLOT
proc gplot data=your.data;
plot var1 * var2;
run;
- SYMBOL statement
- VALUE= STAR, DIAMOND, SQUARE, TRIANGLE
- I= JOIN, SPLINE
- C= W= color and width
- RESET= to return to defaults, e.g. GOPTIONS RESET=SYMBOL (SYMBOL statements are global and additive)
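The SYMBOL options can be combined as in the sketch below, which assumes SAS/GRAPH and the SASHELP.CLASS sample data set. The data are sorted first (into a hypothetical work data set named class_sorted) because I=JOIN connects points in the order they appear in the data.

```sas
goptions reset=symbol;            /* clear any earlier SYMBOL definitions */

proc sort data=sashelp.class out=class_sorted;
  by height;                      /* order points along the horizontal axis */
run;

symbol1 value=star i=join c=blue w=2;  /* star markers joined by a blue line of width 2 */

proc gplot data=class_sorted;
  plot weight*height;             /* vertical variable * horizontal variable */
run;
quit;
```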
Graphic Odds and Ends
- SAS keeps on appending graphs in the graph window
- to get rid of old graphs:
- open the ‘Gseg’ folder in the Explorer Window (found in the Work library)
- delete the graphs you don’t want
- can also delete the entire Gseg folder to get rid of all the graphs
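The same cleanup can also be done in code rather than through the Explorer Window. The sketch below uses PROC GREPLAY, which is not mentioned in the slides, and assumes the graphs are in the default WORK.GSEG catalog. If no graph has been produced yet in the session, the catalog will not exist and SAS will report an error.

```sas
/* Delete every graph stored in the default WORK.GSEG catalog */
proc greplay igout=work.gseg nofs;
  delete _all_;
run;
quit;
```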