STATISTICS 210
A review of the statistical principles of data analysis using
computerized statistical analysis procedures provided by the
Statistical Analysis System (SAS). Statistical methods reviewed and
applied include graphical displays (density estimation), univariate
analyses, multiple regression, collinearity diagnostics, influence
diagnostics, data-dependent model biases, analysis of contingency
tables and categorical data, logistic regression for qualitative
responses, analysis of variance and covariance, and the general linear
model. Each week a statistical method is reviewed and sample analyses
are presented in SAS listings. Each week a data analysis project is
assigned requesting that specific statistical analyses be performed and
that the results be presented and interpreted in a typed statistical
report. Each student is also required to complete an independent data
analysis project.
Prerequisites: (1) Stat 118; (2) either Stat 157 or Stat 201; and
(3) Stat 183, an equivalent course, or proficiency with SAS.
- Course Materials, 2003
- Programs and data
Lecture 1 Listings
Lecture 2 Listings
Below are two SAS programs and the corresponding sets of log and
listing outputs, one for "st210a_d.sas" and the other for
"st210a_u.sas". The latter uses the qqplot.sas macro that is included
in the section below. The SAS files are simple text files. There is a
log file and a lst (listing) file for each program, and each is
provided in two versions: a text file with a name such as
"sa_dlog txt" for st210a_d.log, and a pdf file with a name such as
"sa_dlog pdf". If you have Adobe, I recommend that you open and print
the pdf file, especially for the lst file.
Otherwise you can open the txt files and try to print them. You may
have to highlight the document and change the font size to get it to
print properly. Use print preview before actually printing.
Lecture 3 Listings
Lecture 4 Listings
Below are the SAS file, SAS log, and listing files for lecture 4.
The pdf version of the listing file (a_r4lst.pdf) is about 85 pages
with page breaks and is nicely formatted. The text version
(a_rlst.txt) is shorter but is one continuous stream with no page
breaks. The printed length of the txt file can be reduced further by
opening it in MS Word or WordPad, pressing Ctrl-A to highlight
everything, and then changing the font to a smaller size.
If you use the txt file, be sure to use print preview before printing
to make sure that no line of output is split across two lines (check
the first line of each page; the page number should be on the far
right). If the lines are split, use either smaller margins or a
smaller font size.
Lecture 5 Listings
Below are the SAS files, SAS log, and listing files for lecture 5.
There are two sets of files.
The current versions of these files were uploaded on 2/23/04, the day
of the lecture. They include a new macro, "resplot", that generates
the variance plots within deciles of the predicted values. The SAS
code used previously had errors.
Because the file names are the same as before, your browser may show
a cached copy of the old file when you click on the file name. After
the file opens, be sure to click the reload button to load the most
current version of the file.
If you already downloaded these listings, the only things that differ
are the tables of decile means and variances and the corresponding
plots.
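As a rough illustration of what a table of decile means and variances involves, the sketch below groups the predicted values into deciles and summarizes the residuals within each group. It is not the course's "resplot" macro. The data set work.mydata and the variables y and x are hypothetical placeholders.

```sas
/* Hypothetical input: work.mydata with response y and predictor x. */
proc reg data=work.mydata noprint;
  model y = x;
  output out=work.pred p=yhat r=resid;   /* save predicted values and residuals */
run;
quit;

/* Assign each observation to a decile (0 to 9) of the predicted value. */
proc rank data=work.pred groups=10 out=work.ranked;
  var yhat;
  ranks decile;
run;

/* Mean and variance of the residuals within each decile. */
proc means data=work.ranked noprint nway;
  class decile;
  var resid;
  output out=work.decstats mean=mean_resid var=var_resid;
run;

proc print data=work.decstats noobs;
  var decile _freq_ mean_resid var_resid;
run;
```

Plotting var_resid against decile gives a quick visual check of whether the residual variance changes with the predicted value.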
A_R5 is a continuation of the A_R4 listing that was placed on the
website for lecture 4.
A_M is a new listing for a multiple regression analysis. This will
be used in lecture 5 and lecture 6.
Lecture 6 Listings
The following is an unpublished manuscript on identifying synergism
and antagonism in regression models.
Please bring the tables to class, since I will use them at the
beginning of lecture 6. You can read the paper later if you wish.
A_c is a new listing that illustrates the use of dummy variables for
a categorical variable in a regression model.
Lecture 7 Listings
The following listings for A_I.sas will be used in lecture 7 to
describe collinearity and influence diagnostics for a regression
model.
Lecture 8 Listings
The following listings for stat21B.sas will be used in lecture 8 to
describe stepwise regression and cross validation.
The following listings for sH_a.sas will be used to provide an
introduction to logistic regression models.
Lecture 9 Listings
The following listings describe value added plots in logistic
regression, stepwise models and cross validation.
The following listings describe binomial regression and interaction
models.
Lecture 10 Listings
The following listings describe analysis of variance.
Lecture 11 Listings
The following listings describe unbalanced analysis of
means (unbalanced ANOVA).
Here are exercise 9 and its SAS file. You will have to change the
data set path before running the SAS program.
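The path usually appears in a libname or infile statement near the top of a SAS program. A minimal sketch, assuming a libname statement is used: the libref st210 and the folder shown are hypothetical and should be replaced with the location where you saved the data.

```sas
/* Hypothetical libref and folder: point this at your own copy of the data. */
libname st210 'C:\stat210\data';

/* List the data sets SAS can see in that folder to confirm the path works. */
proc contents data=st210._all_ nods;
run;
```

If the log reports that the library does not exist, the folder name in the libname statement does not match the location on your machine.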
Lecture 12 Listings
The following listings describe analysis of covariance. The pdf file
is 54 pages long and took 10 minutes to upload. The txt file can be
downloaded much more quickly.
Lecture 14 Listings
The following listings describe analyses of repeated measures. The
pdf file is very long and took 10 minutes to upload. The lst file
(not an ordinary text file) can be downloaded much more quickly.
These files were generated using the BSC mainframe computer and thus
may look different from other files generated using PC SAS. To view
the lst file, download it and then open it using WordPad or Word.
Programs and data