Regression & ANOVA
Relationship of ANALYSIS OF VARIANCE to REGRESSION ANALYSIS
• Analysis of variance is suited for
• A continuous dependent variable
• A discrete independent variable
• Regression analysis is suited for
• A continuous dependent variable
• A series of continuous independent variables
• But these independent variables can be dummy variables
• So-called "dummy" variables take two values, 1 and 0: for example, 1 for SOUTH and 0 for NON-SOUTH.
• The coefficient for a dummy variable can be interpreted as the effect of the attribute when it is present, relative to the omitted (reference) category.
• What happens if the k categories of a discrete variable in analysis of variance are made into k-1 dummy independent variables and run through regression analysis?
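In symbols, the k-1 dummy setup can be sketched as below. The notation (Y, a, b_j, D_j, e) is an assumption made for illustration and does not appear in the notes.

```latex
Y = a + b_1 D_1 + b_2 D_2 + \cdots + b_{k-1} D_{k-1} + e,
\qquad
D_j =
\begin{cases}
1 & \text{if the case falls in category } j \\
0 & \text{otherwise}
\end{cases}
```

For a case in the omitted category every D_j is 0, so the predicted value is a; for a case in category j the predicted value is a + b_j.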
Analysis of variance in vote for Reagan in 1984 by REGION

- - - - - - - - - A N A L Y S I S - O F - V A R I A N C E - - - - - - - -

 VALUE  LABEL            MEAN    STD DEV   SUM OF SQ   CASES
   1    NORTHEAST      57.7778    5.6960    259.5556       9
   2    NORTH CENTRAL  60.0833    5.9918    394.9167      12
   3    SOUTH          58.2941   12.2666   2407.5294      17
   4    WEST           63.5385    6.6785    535.2308      13

 WITHIN GROUPS TOTAL   59.9608    8.7485   3597.2324      51

 SOURCE           SUM OF SQUARES   D.F.   MEAN SQUARE      F       SIG.
 BETWEEN GROUPS         256.6892      3       85.5631   1.1179    0.351
 WITHIN GROUPS         3597.2324     47       76.5369

 Eta = .2581   Eta2 = .0666
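The F ratio and Eta2 in this table can be rebuilt from the group means, sums of squares, and case counts alone. The following is a minimal Python sketch, not SPSS output; the variable names are assumptions, and because the inputs are the rounded figures printed above, the results agree with the table only to rounding.

```python
# Group summaries copied from the ANOVA table:
# (mean, within-group sum of squares, number of cases)
groups = {
    "NORTHEAST":     (57.7778, 259.5556, 9),
    "NORTH CENTRAL": (60.0833, 394.9167, 12),
    "SOUTH":         (58.2941, 2407.5294, 17),
    "WEST":          (63.5385, 535.2308, 13),
}

n_total = sum(n for _, _, n in groups.values())
grand_mean = sum(mean * n for mean, _, n in groups.values()) / n_total

# Between-groups SS: squared distance of each group mean from the
# grand mean, weighted by the number of cases in the group.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups.values())

# Within-groups SS: the per-group sums of squares added together.
ss_within = sum(ss for _, ss, _ in groups.values())

df_between = len(groups) - 1        # k - 1
df_within = n_total - len(groups)   # N - k

f_ratio = (ss_between / df_between) / (ss_within / df_within)
eta_squared = ss_between / (ss_between + ss_within)

print(f"grand mean  = {grand_mean:.4f}")
print(f"SS between  = {ss_between:.4f} (df {df_between})")
print(f"SS within   = {ss_within:.4f} (df {df_within})")
print(f"F           = {f_ratio:.4f}")
print(f"Eta squared = {eta_squared:.4f}")
```

Eta2 is simply the between-groups share of the total sum of squares, which is why it can be compared directly with R SQUARE from a regression.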
Applying Regression Analysis with Dummy Variables to REAGAN84

First, create three dummy "variables" (Northeast, South, West) and set them to 0. North Central gets no dummy, so it serves as the reference category.
COMPUTE NRTHEAST = 0.
COMPUTE SOUTH = 0.
COMPUTE WEST = 0.

Using the IF command, set each dummy variable equal to 1 if the state is in that region.
IF (REGION = 1) NRTHEAST = 1.
IF (REGION = 3) SOUTH = 1.
IF (REGION = 4) WEST = 1.

Run the regression using these dummy variables.
REGRESSION VARIABLES = REAGAN84 NRTHEAST SOUTH WEST
 /DEPENDENT = REAGAN84
 /METHOD = ENTER.

* * * * * * * * * * M U L T I P L E   R E G R E S S I O N * * * * * * * * *
EQUATION NUMBER 1 DEPENDENT VARIABLE.. REAGAN84 PCT VOTE FOR REAGAN,
1.. WEST
2.. NRTHEAST
3.. SOUTH ELEVEN STATES OF THE CONFEDERACY
 MULTIPLE R          0.25808
 R SQUARE            0.0666
 ADJUSTED R SQUARE   0.00703
 STANDARD ERROR      8.74853

 ANALYSIS OF VARIANCE
               DF   SUM OF SQUARES   MEAN SQUARE
 REGRESSION     3        256.68917      85.56306
 RESIDUAL      47        3597.2324      76.53686

 F = 1.1179   SIGNIF F = .3513

 ------------------ VARIABLES IN THE EQUATION ------------------
 VARIABLE            B       SE B       BETA        T     SIG.
 WEST          3.45513    3.50222    0.17322    0.987   0.3289
 NRTHEAST     -2.30556    3.85774   -0.10111   -0.598   0.5529
 SOUTH        -1.78922    3.29852   -0.09703   -0.542   0.5901
 (CONSTANT)   60.08333    2.52548              23.791   0
Computing the regression equations:
 if state is in the
 WEST        = 60.08333 + 3.45513 - 0 - 0 = 63.5385
 NrthEast    = 60.08333 + 0 - 2.30556 - 0 = 57.7778
 SOUTH       = 60.08333 + 0 - 0 - 1.78922 = 58.2941
 Nrth Centrl = 60.08333 + 0 - 0 - 0 = 60.0833
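The four substitutions above can be checked mechanically. This Python sketch copies the coefficients from the regression output and the means from the ANOVA table; the function name `predicted_mean` is hypothetical, introduced only for illustration.

```python
# Coefficients copied from the regression output.
# North Central has no dummy, so it is represented by the constant alone.
constant = 60.08333
b = {"WEST": 3.45513, "NRTHEAST": -2.30556, "SOUTH": -1.78922}

def predicted_mean(region):
    """Constant plus the coefficient of whichever dummy equals 1.

    A region with no dummy (North Central) adds nothing to the constant.
    """
    return constant + b.get(region, 0.0)

# Group means copied from the ANOVA table, for comparison.
anova_means = {
    "WEST": 63.5385,
    "NRTHEAST": 57.7778,
    "SOUTH": 58.2941,
    "NRTH CENTRL": 60.0833,
}

for region, mean in anova_means.items():
    print(f"{region:12s} predicted = {predicted_mean(region):.4f}   ANOVA mean = {mean:.4f}")
```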

These computed means are identical to the means produced in the top table from ANOVA.

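Thus, analysis of variance and regression analysis produce the same results when applied to exactly the same problem, whether region is viewed as a single DISCRETE variable or as k-1 DUMMY variables. The sums of squares, F (1.1179), and significance level match in the two runs, and R SQUARE equals Eta2 (.0666).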
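The same equivalence can be demonstrated on raw data. The state-level REAGAN84 values are not reproduced in these notes, so the sketch below uses a small HYPOTHETICAL data set; the helper `solve` and all variable names are likewise assumptions. It fits least squares with k-1 dummies in plain Python and compares the result with the group means.

```python
# HYPOTHETICAL data: (region code, score). These are NOT the REAGAN84 values.
data = [(1, 55.0), (1, 60.0), (1, 58.0),
        (2, 62.0), (2, 59.0),
        (3, 50.0), (3, 66.0), (3, 57.0),
        (4, 64.0), (4, 61.0)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Design matrix: a constant plus k-1 = 3 dummies; region 2 is the omitted category.
dummy_regions = [1, 3, 4]
X = [[1.0] + [1.0 if region == d else 0.0 for d in dummy_regions] for region, _ in data]
y = [score for _, score in data]

# Least squares via the normal equations (X'X) b = X'y.
p = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
coef = solve(xtx, xty)

# Group means computed directly, as the ANOVA table would report them.
means = {}
for g in sorted({region for region, _ in data}):
    scores = [s for region, s in data if region == g]
    means[g] = sum(scores) / len(scores)

# R squared from the regression and eta squared from the group means.
grand = sum(y) / len(y)
ss_total = sum((yi - grand) ** 2 for yi in y)
fitted = [sum(c * x for c, x in zip(coef, row)) for row in X]
r_squared = 1 - sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / ss_total
eta_squared = sum((means[region] - grand) ** 2 for region, _ in data) / ss_total

print("constant vs. mean of omitted region:", coef[0], means[2])
for d, bj in zip(dummy_regions, coef[1:]):
    print(f"region {d}: constant + b = {coef[0] + bj}, group mean = {means[d]}")
print("R squared:", r_squared, " eta squared:", eta_squared)
```

The constant recovers the mean of the omitted region, and each dummy coefficient is the difference between its region's mean and that reference mean.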
Using SPSS to create "dummy" variables
• "Dummy" variables are dichotomous renditions of the absence or presence of some qualitative attribute
• For example, "region," "sex," "race," "party," etc.
• These qualitative attributes are typically coded "1" to indicate the presence of the trait, and "0" to indicate its absence.
• Then the "dummy" variable can be used in multiple regression as an independent variable.
• It will function as a switch, turning on if the trait is present (= 1), and turning off if it is absent (= 0).
• SPSS syntax command procedures (you can do the equivalent using the Transform menu and then Compute):
• Choose a name for your variable and set all cases equal to 0
COMPUTE SOUTH = 0.
• Use the "IF" command to code the cases as you desire
IF (REGION = 3) SOUTH = 1.
• All cases that are not in Region 3 will be left = 0.
• SOUTH becomes available to use in any subsequent run.
• You can use this procedure with either of the cross-national data sets to treat any region as a dummy variable.
• Check your results by running Frequencies on all "dummy" variables you create.
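The COMPUTE/IF recoding and the Frequencies check described above could be mirrored outside SPSS roughly as follows. This is a sketch in Python, not SPSS syntax; the five cases and the `id` field are HYPOTHETICAL, and only the REGION coding (3 = SOUTH) comes from the notes.

```python
from collections import Counter

# HYPOTHETICAL cases; REGION uses the codes from the notes (3 = SOUTH).
cases = [
    {"id": "case_1", "REGION": 1},
    {"id": "case_2", "REGION": 3},
    {"id": "case_3", "REGION": 2},
    {"id": "case_4", "REGION": 3},
    {"id": "case_5", "REGION": 4},
]

for case in cases:
    case["SOUTH"] = 0              # COMPUTE SOUTH = 0
    if case["REGION"] == 3:        # IF (REGION = 3) SOUTH = 1
        case["SOUTH"] = 1

# Equivalent of running Frequencies on the new dummy variable.
freq = Counter(case["SOUTH"] for case in cases)

# The count of 1s should equal the number of cases in Region 3,
# and the dummy should take no values other than 0 and 1.
expected_ones = sum(1 for case in cases if case["REGION"] == 3)

print("frequencies of SOUTH:", dict(freq))
print("cases in Region 3:   ", expected_ones)
```

Setting every case to 0 first and then switching selected cases to 1 guarantees that no case is left without a value, which is the reason for the two-step order in the SPSS commands.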