Bivariate Distributions: Strength, Form, Significance
Statistical methods to use when both variables are DISCRETE and NONORDERABLE
- The suitable SPSS procedure, under the Analyze menu and then Descriptive Statistics, is Crosstabs.
- The "Cells" button in Crosstabs computes percentages for cell entries based on three different totals:
  - by column: appropriate when the independent variable is in the columns (the usual case)
  - by row: appropriate when the row variable is treated as the independent variable
  - by total of all the cases in the table: used only under special circumstances of analysis
- The "Statistics" button in Crosstabs leads to boxes for several types of statistics:
  - chi-square, for tests of independence between the variables in the table
  - familiar Pearson bivariate correlations between the variables
  - measures of association for nominal data:
    - contingency coefficient
    - Phi and Cramer's V
    - Lambda
    - uncertainty coefficient (which we did not cover)
  - measures of association for ordinal data:
    - Gamma
    - Somers' d
    - Kendall's Tau-b
    - Kendall's Tau-c
  - a measure of association for interval data by nominal classification: eta
  - other assorted statistics that we didn't cover
- Strength of the relationship can be measured by these measures, none of which we really studied:
  - Lambda, which offers the best PRE interpretation for predicting to the dependent variable from knowledge of the independent
  - the contingency coefficient, C, which is based on chi-square but has no operational interpretation, cannot reach 1.0, and cannot be compared across tables of different size
  - Cramer's V, another chi-square measure, which can be compared across tables of different size and ranges between 0 and 1.0, but has no PRE interpretation
  - Phi (φ), yet another chi-square-based measure that does range between 0 and 1.0 but is suitable only for 2x2 tables
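To make these measures concrete, here is a small sketch in pure Python (not SPSS output; the 2x2 crosstab is hypothetical) computing lambda, phi, and Cramer's V by their textbook formulas:

```python
import math

# Hypothetical 2x2 crosstab: rows = dependent-variable categories,
# columns = independent-variable categories (the usual arrangement).
table = [[25, 10],
         [5, 30]]

n = sum(map(sum, table))
row_tot = [sum(r) for r in table]
col_tot = [sum(c) for c in zip(*table)]

# Lambda (asymmetric): the PRE gained by predicting the row variable
# from the column variable rather than from its mode alone.
col_maxes = sum(max(c) for c in zip(*table))
lam = (col_maxes - max(row_tot)) / (n - max(row_tot))

# Pearson chi-square, from which phi and Cramer's V both derive:
# sum over cells of (observed - expected)^2 / expected.
chi2 = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(len(table)) for j in range(len(table[0])))

phi = math.sqrt(chi2 / n)                                  # 2x2 tables only
v = math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))
print(round(lam, 3), round(phi, 3), round(v, 3))  # 0.571 0.577 0.577
```

For a 2x2 table, phi and Cramer's V coincide; they diverge only for larger tables.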
- Form of the relationship between two nominal variables can be determined only by inspection of the cell entries, to see where cases cluster and where they are absent.
- Significance of the relationship is best determined by:
  - chi-square, X^2, which measures the difference between observed and expected cell frequencies; significance is tested by entering the chi-square table for the appropriate degrees of freedom
  - the significance of lambda, which is calculated by SPSS using a complex formula for lambda's standard error
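The chi-square test itself can be sketched in a few lines of Python (hypothetical data; the critical value 3.841 is the tabled chi-square for df = 1 at the .05 level):

```python
# Chi-square test of independence on a hypothetical 2x2 table.
table = [[25, 10],
         [5, 30]]

n = sum(map(sum, table))
row_tot = [sum(r) for r in table]
col_tot = [sum(c) for c in zip(*table)]

# Sum of (observed - expected)^2 / expected over all cells,
# where expected = row total * column total / n.
chi2 = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(len(table)) for j in range(len(table[0])))

df = (len(table) - 1) * (len(table[0]) - 1)   # (rows-1)(cols-1) = 1
critical = 3.841                              # chi-square table, df=1, alpha=.05
print(round(chi2, 2), df, chi2 > critical)  # 23.33 1 True
```

Since the computed chi-square exceeds the critical value, the null hypothesis of independence would be rejected for this table.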
Statistics for variables that are CONTINUOUS or DISCRETE ORDERABLE
- Suitable SPSS procedures:
  - Crosstabs (requesting the Pearson correlation coefficient)
  - Correlate
  - Scatterplot, under the Graph menu > Interactive
- Strength can be measured by the Pearson product-moment correlation, r, which can be calculated by several formulas.
  - Actually, the strength of a correlation is expressed by r^2, called the coefficient of determination.
  - It can be interpreted as the proportion of variance in the dependent variable that is "explained" by the independent variable.
  - Of course, correlation does not mean causation, so explanation is assessed theoretically rather than proved.
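A minimal Python sketch (hypothetical paired data) of r by the deviation-score formula, with r^2 as the proportion of variance explained:

```python
import math

# Pearson r on small hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))   # cross-products
sxx = sum((a - mx) ** 2 for a in x)                    # SS of x
syy = sum((b - my) ** 2 for b in y)                    # SS of y

r = sxy / math.sqrt(sxx * syy)
print(round(r, 3), round(r ** 2, 3))  # 0.775 0.6
```

Here 60% of the variance in y is "explained" by x, subject to the caveat above about causation.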
- Form can be measured by the regression equation for the raw data.
  - If the data are in standardized form, the b coefficient becomes a standardized beta coefficient and is equal to the correlation coefficient.
  - The intercept then becomes zero and drops out of the equation.
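This can be checked numerically; a sketch (hypothetical data) that standardizes both variables and fits the least-squares slope, which then matches r:

```python
import math

# With x and y converted to z-scores, the OLS slope equals r
# and the intercept vanishes. Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def zscores(v):
    m = sum(v) / len(v)
    sd = math.sqrt(sum((a - m) ** 2 for a in v) / len(v))  # population SD
    return [(a - m) / sd for a in v]

zx, zy = zscores(x), zscores(y)
# Least-squares slope; the z-score means are zero, so no centering needed.
slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
intercept = sum(zy) / len(zy) - slope * sum(zx) / len(zx)
print(round(slope, 3))  # 0.775, the same value as r for these data
```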
- Significance can be tested easily against the null hypothesis by calculating a t statistic.
  - The test is much more complicated when a nonzero r is hypothesized, but we did not take up that situation.
  - In that case, the observed and expected r's must be transformed into something called "Fisher's Z" and a different test used.
  - You do not need to know this procedure.
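For the simple case that was covered (null hypothesis of zero correlation), the t statistic is r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom; a sketch with hypothetical values (3.182 is the tabled two-tailed t for df = 3 at .05):

```python
import math

# t test of H0: rho = 0, given a sample r and n (hypothetical values).
r, n = 0.7746, 5
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
df = n - 2
critical = 3.182   # t table, df = 3, two-tailed alpha = .05
print(round(t, 3), abs(t) > critical)  # 2.121 False
```

With so few cases, even a fairly large r fails to reach significance, which is why sample size matters.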
Statistical methods used when the dependent variable is CONTINUOUS or ORDERED DISCRETE and the independent variable is NOMINAL
- Suitable SPSS procedures, under the Analyze menu, then Compare Means:
  - One-Sample T Test
  - Means
  - One-Way ANOVA
- Significance can be determined by:
  - the t-test, when the nominal variable is a dichotomy and the test becomes simply one for the difference between two means.
    - The appropriate form of the t-test depends on:
      - the independence of the two samples: for independent samples, use the Independent-Samples T Test; if the cases are "matched" and then tested, use the Paired-Samples T Test
      - the "equality" (or "homogeneity") of the variance of the dependent variable in each sample:
        - Levene's test for unequal variance directs you to consult either the line for equal variance, which uses a "pooled" variance estimate, or the line for unequal variance, which uses a "separate" estimate
        - usually, the two estimates will produce similar results regarding the null hypothesis
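A sketch of the pooled ("equal variance") form of the independent-samples t test, worked by hand on two hypothetical groups:

```python
import math

# Independent-samples t test, pooled-variance form, hypothetical groups.
g1 = [4, 5, 6, 7, 8]
g2 = [1, 2, 3, 4, 5]

def mean(v):
    return sum(v) / len(v)

def ss(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v)   # sum of squared deviations

n1, n2 = len(g1), len(g2)
pooled_var = (ss(g1) + ss(g2)) / (n1 + n2 - 2)     # "pooled" estimate
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))     # SE of the mean difference
t = (mean(g1) - mean(g2)) / se
print(round(t, 3))  # 3.0, with df = n1 + n2 - 2 = 8
```

The "separate" (unequal-variance) form would use each group's own variance in the standard error instead of the pooled estimate.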
  - the F-test, which follows from the analysis of variance as a generalization of the difference-of-means test (t-test) to k groups.
    - In fact, F = t^2 for 1 df between groups.
    - The F-test is the ratio of the mean sum of squares calculated between groups to that calculated within groups.
    - Put another way, it is the ratio of the mean between-groups SS to the mean within-groups SS.
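The one-way ANOVA can be sketched by hand; with the same two hypothetical groups as the t-test example, F comes out as t^2:

```python
# One-way ANOVA on k = 2 hypothetical groups; with two groups,
# F equals t-squared from the pooled-variance t test.
groups = [[4, 5, 6, 7, 8], [1, 2, 3, 4, 5]]

all_vals = [v for g in groups for v in g]
grand = sum(all_vals) / len(all_vals)            # grand mean

# Between-groups SS: group sizes times squared distance of
# each group mean from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
# Within-groups SS: squared deviations around each group's own mean.
ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)

df_b = len(groups) - 1
df_w = len(all_vals) - len(groups)
f = (ss_between / df_b) / (ss_within / df_w)     # ratio of mean squares
print(round(f, 3))  # 9.0, which is t^2 = 3.0^2
```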
- Strength can be measured by eta-squared, which is a measure of the explained variation (between-groups SS) divided by the total variation (total SS). Thus it is analogous to the product-moment correlation and is indeed equal to it in the special case of a dichotomous variable being correlated with an interval variable.
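Continuing the hypothetical two-group data from the examples above, eta-squared is just the between-groups share of the total sum of squares:

```python
# Eta-squared: between-groups SS over total SS (hypothetical groups).
groups = [[4, 5, 6, 7, 8], [1, 2, 3, 4, 5]]

all_vals = [v for g in groups for v in g]
grand = sum(all_vals) / len(all_vals)

ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ss_total = sum((v - grand) ** 2 for v in all_vals)

eta_sq = ss_between / ss_total   # proportion of variation explained
print(round(eta_sq, 3))  # 0.529
```

So about 53% of the variation in the dependent variable lies between the two groups rather than within them.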
- Form can be determined only by graphing the means of the dependent variable for each category and seeing which go up and which go down.
