Ron D. Hays, Ph.D.
NIH Biosketch (posted 10/3/13)
Item Response Theory
Other Psychometric Issues
Hays RD. (1987). PARALLEL: A program for performing parallel analysis. Applied
Psychological Measurement. 11, 58.
The executable: PARALLEL.EXE.
Hays RD, Wang E, & Sonksen M. (1995, September). General Reliability and
Intraclass Correlation Program (GRIP). Proceedings of the 3rd Annual Conference of Western
Users of SAS Software. (article uploaded 1/2/13)
6/2/2003: replaced with a version modified by Sally Carson and Karen Spritzer that adds
NREL70, NREL80, NREL90, and NREL95 and can now run under SAS 8.0.
2/18/04: updated to fix a problem that came about because SAS 8 is case sensitive with
respect to variable names.
The macro and sample call: grip_feb18-04.sas
Example of 1-way GRIP.
Liu H, & Hays RD. (1999, April). Measurement of interrater agreement: A SAS/IML
macro kappa procedure for handling incomplete data. Proceedings of the SAS Users Group
International Conference, 1620-1625.
Here's the SAS macro - same version in text and WORD:
wkappa.txt (text) wkappa.doc
Link to the paper: http://www2.sas.com/proceedings/sugi24/Stats/p280-24.pdf
Note: this macro was written using SAS/IML version 6 (may need to be
modified a little when running in SAS version 8 and above).
Note: wkappa.txt and wkappa.doc were updated on 5/28/2010 - the only change is a more
Other helpful programs
Scoring the SF-36 version 1.0 (this
version uses 1990 General Population norms - see alternative code to use 1998 norms):
Hays RD, Sherbourne CD, Spritzer KL, & Dixon WJ. (1996). A Microcomputer Program (sf36.exe) that Generates SAS Code
for Scoring the SF-36 Health Survey. Proceedings of the 22nd Annual
SAS Users Group International Conference, 1128-1132.
The executable described in this paper is no longer current, but the above article serves
as a good guide to using the following SAS code, sf36.sas,
and US general population data, sf36.raw, to analyze
[We provide the executable and other files referenced in the paper for historical
purposes only (sf36.exe, sf36.in, sf36b.exe).]
Link to the paper: http://www2.sas.com/proceedings/sugi22/POSTERS/PAPER244.PDF
An alternative set of
code that expands upon the scoring of the SF-36 version 1.0 is here (uses 1990 General
Population norms, but has the option to use 1998 norms):
Program to score the SF-36 version 1.0. All sections require the SF-36 version
1.0 items to be named i1-i36 and the ID variable to be named ID (so that the various
output datasets can be merged); one section (sf36b.sas) also requires a gender variable named MALE (=1 if
male, =0 if female) and a continuous variable named AGE. Note: sf36a.sas must be run before
sf36b.sas. Code to calculate Fryback's QWB is in
fryback.sas and is called from sf36a.sas. The %include can be commented out if not
needed. Similarly, code to calculate Nichol's HUI2 (hui2.sas) is called from
sf36c.sas and can be commented out or omitted if not needed.
Download main program (score1.sas) and its components:
randhsi.sas [RAND-36 HSI score]
sf36a.sas [SF-36 scores and optionally Fryback's Quality of Well-Being Score],
sf36b.sas [comparison to US general
sf36c.sas [SF-36 and SF-12 physical and
mental health composite scores and factors (using 1990 General Population norms) and Nichol's Health Utility Index].
randhsi.sas updated as of 6/30/2010. Thanks to Brett Larive at
Cleveland Clinic for bringing to our attention a scoring glitch when IMPUTEd
values fell on boundaries (see code for more detail and example).
Note: to use the means and SDs from the 1998 General Population, download and use
score1-1998.sas and sf36c1998.sas (instead of score1.sas and sf36c.sas).
Note: QWB (not just QWB100) code
change in Fryback code (fryback.sas above) and files that call it up (6/30/2008).
Scoring the SF-12 version 2.0:
SAS code to score the SF-12 version 2.0. Assumes your items are named I1, I2a-b,
I3a-b, I4a-b, I5, I6a-c, I7. Rename them in the first data step if they are
code: sf12v2-1.sas
output from test dataset: sf12v2-1.lst (8/24/04)
Mosier's formula: (documentation and example
#2 corrected on 3/27/2008 - no change in program itself); more explanations
in documentation added on 6/29/2010.
Mosier's formula (Mosier, C. I. (1943). On the reliability of a weighted
composite. Psychometrika, 8, 161-168).
Estimation of reliability of composite scores. Download files mosier.exe, mosier.in, and mosier2.in.
NOTE: trailing blanks MUST be removed from the input file for the
program to run successfully. You can see these trailing blanks if you
open mosier.in in an editor (Notepad, for example) and press Ctrl-A.
Mosier-input.doc is the annotated test input file and
mosier.out is the test output.
The reference for Mosier's formula is mosier.JPG (taken from: Hays, Ron D. Evaluating Self-Report Data
Using Psychometric Methods. Lecture in Quality of Care Course. RAND,
Santa Monica, CA: February 11, 2004. PowerPoint presentation available for download here).
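For readers without access to mosier.exe, the formula itself is short enough to sketch. The Python below is an illustration only, not a port of the program: the function name and argument layout are our own assumptions, and it does not read the mosier.in format. It computes the reliability of a weighted composite as 1 minus the weighted error variance divided by the variance of the composite.

```python
def mosier_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite (Mosier, 1943).

    weights, sds, reliabilities: one value per component scale.
    corr: square correlation matrix among the components (1.0 on the diagonal).
    All names here are hypothetical; this is not the interface of mosier.exe.
    """
    k = len(weights)
    # Error variance contributed by each component: w^2 * sd^2 * (1 - reliability)
    error_var = sum(
        weights[j] ** 2 * sds[j] ** 2 * (1.0 - reliabilities[j]) for j in range(k)
    )
    # Variance of the weighted composite, including all covariance terms
    composite_var = sum(
        weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(k)
        for j in range(k)
    )
    return 1.0 - error_var / composite_var


# Two equally weighted scales, each with reliability 0.70 and correlated 0.70
# (hypothetical values chosen only to exercise the function)
example = mosier_reliability([1, 1], [1, 1], [0.70, 0.70], [[1, 0.70], [0.70, 1]])
print(round(example, 4))
```

With two parallel scales, as in the example call, the result should agree with the Spearman-Brown value for a doubled test, which is a convenient sanity check.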
Hayashi, T., & Hays, R. D. (1987). A microcomputer program for analyzing
multitrait-multimethod matrices. Behavior Research Methods, Instruments, & Computers, 19, 345-348.
Used to evaluate multitrait-multimethod correlation matrices. Download files: mtmm.exe and mtmm.in.
Hays, R. D. & Ellickson, P. L. (1990). Longitudinal scalogram analysis: A methodology
and microcomputer program for Guttman scale analysis of longitudinal data. Behavior
Research Methods, Instruments & Computers, 22, 162-166.
Hays, R. D. & Ellickson, P. L. (1991). Guttman scale analysis of longitudinal data:
A methodology and drug use application.
International Journal of the Addictions, 25 (11A), 1341-1352.
Ellickson, P. L., Hays, R. D., & Bell, R. M. (1992).
Stepping through the drug use sequence: Longitudinal scalogram analysis of initiation and
heavy use. Journal of Abnormal Psychology, 101, 441-451.
Hays, Ron D. (1991). User's Guide for the Longitudinal Scalogram
LSA program and test data:
input - sample input
output - output from analysis
go.bat - batch
file that drives the following executables - requires "RAW as input)
steig.exe: tests significance of difference of paired
Note: you can run this as a Windows application, but the results will blip by you too fast
to write them down. The preferred approach is to open up a DOS window and run it
In this program, R1 and R2 are the correlations being compared to see if they are
significantly different from one another. R3 is the correlation between the
variables that are unique to R1 and R2.
For example, if the correlation between x and y (R1) is being compared to the correlation
between z and y (R2), then R3 is the correlation between x and z.
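One common test for this situation (two correlations that share a variable, measured in the same sample) is Williams' t, which Steiger (1980) recommended. We have not confirmed that steig.exe uses exactly this statistic, so treat the Python below as a sketch of the general idea rather than a reimplementation. R1, R2, and R3 follow the definitions above; the function name and the sample-size argument n are our own.

```python
from math import sqrt


def williams_t(r1, r2, r3, n):
    """Williams' t for two dependent correlations that share one variable.

    r1: correlation between x and y
    r2: correlation between z and y
    r3: correlation between x and z
    n:  sample size
    Returns (t, degrees_of_freedom). Assumed method; steig.exe may differ.
    """
    # Determinant of the 3 x 3 correlation matrix among x, y, and z
    det = 1 - r1 ** 2 - r2 ** 2 - r3 ** 2 + 2 * r1 * r2 * r3
    r_bar = (r1 + r2) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar ** 2 * (1 - r3) ** 3
    t = (r1 - r2) * sqrt((n - 1) * (1 + r3) / denom)
    return t, n - 3


# Hypothetical values: r(x,y) = .50, r(z,y) = .30, r(x,z) = .40, n = 103
t_value, df = williams_t(0.50, 0.30, 0.40, 103)
print(round(t_value, 3), df)
```

The t value is referred to a t distribution with n - 3 degrees of freedom.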
corrdiff.bas: tests significance of difference of
independent correlations (written in BASIC)
corrdiff.sas: tests significance of difference of
independent correlations (written in SAS)
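A standard way to compare correlations from two independent samples is Fisher's r-to-z transformation. The Python sketch below shows that approach; we are assuming, not asserting, that corrdiff.bas and corrdiff.sas do something equivalent, and the function and argument names are ours.

```python
from math import atanh, sqrt
from statistics import NormalDist


def independent_corr_test(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations.

    Returns (z, two_tailed_p). Illustrative only; not taken from corrdiff.
    """
    # Fisher's transformation makes the sampling distribution roughly normal
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical example: r = .50 (n = 103) versus r = .30 (n = 103)
z, p = independent_corr_test(0.50, 103, 0.30, 103)
print(round(z, 3), round(p, 3))
```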
hayspowe.bas: power analysis program for
hayspowe.sas: SAS 8.0 version of hayspowe.bas (8/23/04)
hayspowe.out: sample output from hayspowe.sas (8/23/04)
Power is the probability of rejecting the null hypothesis (e.g., two groups do not differ
on physical functioning) when the alternative hypothesis (the two groups differ) is true.
Here we provide a SAS program and its output, which show
some common power analyses. Specifically, the program
provides the sample sizes needed to detect differences between two experimental groups
(Tables 1 and 2) and two self-selected groups (Table 3). The SAS program
(hayspowe.sas) requires as input (at the end of the SAS program, just after
%hayspowem) the title for the power analysis, the number of scales in the
analysis, the standard deviation of each scale, and each scale's label. The
output shows the sample size needed to detect differences of 2, 5, 10, and 20 points.
The example output file (hayspowe.out) provides power analysis results for three scales
(physical functioning, emotional well-being, social functioning). Table 1 indicates,
for example, that one would have 80% power (alpha = 0.05) with a sample size of 82 (41 per
group) to detect a 10-point difference in physical functioning (SD = 20.10) using a
two-tailed t-test if you had a repeated measures design and a correlation of 0.60 between
physical functioning scores at the two time points. If physical functioning were
measured only at one time point (follow-up), you would need a sample of 126 (63 per group)
to have the same power.
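The two figures in this example can be approximated by hand with the usual normal-approximation formula for comparing two group means. The Python below is a rough sketch under our own assumptions: it uses z values rather than the t distribution, and it treats the baseline measurement as reducing the variance by a factor of (1 - r squared). We chose that adjustment because it matches the numbers quoted above; hayspowe.sas may arrive at them differently, and the function name and arguments are hypothetical.

```python
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80, baseline_corr=0.0):
    """Approximate sample size per group for a two-tailed two-group comparison.

    diff: smallest difference in means worth detecting
    sd:   standard deviation of the scale
    baseline_corr: correlation between baseline and follow-up scores
                   (0.0 means follow-up only); assumed (1 - r^2) adjustment.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)
    residual_var = sd ** 2 * (1 - baseline_corr ** 2)
    return 2 * (z_alpha + z_beta) ** 2 * residual_var / diff ** 2


# Physical functioning, SD = 20.10, 10-point difference
print(round(n_per_group(10, 20.10)))                      # follow-up only
print(round(n_per_group(10, 20.10, baseline_corr=0.60)))  # repeated measures
```

Rounded to the nearest whole number, these give 63 and 41 per group, in line with the sample sizes of 126 and 82 described above.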
alphatst.exe: tests significance of difference
between alpha coefficients (see alphatst.doc
alpha.exe: updated version of alphatst.exe. Added to
website on 4/11/03.
A description of the formula used to estimate the significance of difference between alpha
coefficients can be found in the article:
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical Inference for Coefficient
Alpha. Applied Psychological Measurement, 11(1), 93-103.
for the Multitrait Analysis Program (MAP)
Hays, R. D., Hayashi, T., Carson, S., & Ware, J. E. (1988). User's Guide for the
Multitrait Analysis Program (MAP). Santa Monica, CA: The RAND Corporation, N-2786-RC.
(added 8/16/2010; link updated 12/27/10)
Other odds and ends
>>Adjusting for Clustering (Non-Independence
Among Observations) using SAS - March 28, 2008.
SAS has implemented the adjustment within PROC surveyreg. Example
code appears below.
2='2:long time quitter'
3='3:dk when quit'
TITLE "New SURVEYREG with lsmeans"; run;
PROC surveyreg data=seer1;
MODEL pcs_T= male nsmoker cohort1 proxy /solution;
lsmestimate nsmoker "never smoker vs long term quitter" [1,1] [-1,2];
lsmestimate nsmoker "recent quitter vs long term quitter" [1,4] [-1,2];
format nsmoker nsmokfmt.;
can be done in SAS 9 using PROC ORTHOREG, GENMOD, GLIMMIX,
See sample code
black asian othrace
model outcome =
group &predictors ;
"adult UCLA vs child UCLA" [1,1] [-1,3];
"adult UCLA vs child HP " [1,1] [-1,4];
"child UCLA vs child HP " [1,3] [-1,4];
>>spear.exe: applies Spearman-Brown prophecy
formula to reliability estimates
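The Spearman-Brown formula is simple enough to compute directly. The Python below is a small illustration, not the code behind spear.exe; the function names are our own. It gives the projected reliability when a scale is lengthened (or shortened) by a given factor, and the inverse: how much longer a scale must be to reach a target reliability.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """Factor by which the test must be lengthened to reach the target reliability."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Hypothetical example: a scale with reliability 0.50
print(round(spearman_brown(0.50, 2), 3))   # doubling the number of items
print(length_factor_needed(0.50, 0.80))    # lengthening needed to reach 0.80
```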
>>multi_p.sas: derived from multi.sas -
>>Guttman scaling: scalo.exe, gutt.dat,
scale.out -- this program assesses
the extent to which the items fit a response pattern that is consistent
with a Guttman scale.
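One common summary of fit to a Guttman scale is the coefficient of reproducibility. The Python sketch below shows one way to compute it, counting errors as deviations from the ideal pattern implied by each respondent's total score (the Goodenough-Edwards method). This is our assumption about a reasonable approach; scalo.exe may count errors differently and reports more than this one index.

```python
def coefficient_of_reproducibility(responses):
    """Coefficient of reproducibility for 0/1 item responses.

    responses: list of rows, one per respondent, each a list of 0/1 values.
    Errors are counted against the ideal pattern for each total score
    (assumed error-counting rule; not taken from scalo.exe).
    """
    n_items = len(responses[0])
    # Order items from most to least frequently endorsed
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])

    errors = 0
    for row in responses:
        score = sum(row)
        # Ideal pattern: the 'score' most-endorsed items are 1, the rest 0
        for rank, j in enumerate(order):
            ideal = 1 if rank < score else 0
            if row[j] != ideal:
                errors += 1
    return 1 - errors / (len(responses) * n_items)


# Hypothetical data: the last respondent breaks the cumulative pattern
data = [[1, 1, 1], [1, 1, 0], [1, 1, 0], [1, 0, 0], [0, 0, 0], [0, 0, 1]]
print(round(coefficient_of_reproducibility(data), 3))
```

A value of 1.0 means every response pattern is perfectly cumulative; values of 0.90 or higher are conventionally taken as acceptable.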
>>nfact.sas: this is a macro that helps determine
the number of factors to rotate in a factor analysis. In general,
specify the maximum number of factors you might expect in the NFACT macro
to maximize the information you will get to help determine the number
of factors to rotate.
The 2nd step (after the "endsas") does the rotation and creates
the factor scores.
>>Information on lsmeans
>>Information on MSN messenger