VIII. MODELING RELATIONSHIPS AMONG MULTIPLE VARIABLES

NOVEMBER 15
MULTIPLE REGRESSION

SPSS Applications Guide, Ch. 12, "Simple and Multiple Linear Regression," only pp. 189-210 (up to "Optional Plots and Diagnostics")

This chapter in the SPSS Guide is pretty good, quite different from most other chapters. The first few pages review simple linear regression. Pay special attention to pages 192-193, which show how you can be misled by a regression equation if you don't plot your dependent and independent variables. You will have covered all concepts to page 199 (but you should restudy the std. error of the estimate on page 198).

Note that SPSS uses ANOVA (rather than the t-test) to test for significance of the multiple correlation, R, in regression analysis. While ANOVA and t are related, ANOVA allows for the several independent variables used in multiple regression. Note also that SPSS uses t to test for significance of individual variables in the equation. (Try to keep this distinction clear: F for the multiple R, but t for significance of individual variables.) Page 199 also introduces standardized coefficients (betas), to be distinguished from the "unstandardized" b coefficients that we already covered. The Guide also discusses residual statistics at length; we'll stop short of what SPSS offers.

Multiple regression is one of the most powerful and popular techniques of multivariate analysis in social and political science. It finds the best linear and additive combination of a set of independent variables for predicting a single dependent variable according to the "least squares" principle of best fit. The technique assumes that the data are interval in character, but it is frequently applied to ordinal data and even nominal independent variables -- if they are reduced to dichotomies (values of 0 and 1) and treated as "dummy" variables.
There are some problems associated with these departures from the classical measurement assumptions, but sensible departures seem worth the risks, for the technique is so revealing in its analysis. We will concentrate mainly on the capabilities of regression analysis, rather than on its limitations, which are treated in courses devoted to multivariate analysis.

Assignment: Follow the discussion by doing the SPSS procedures the Guide discusses for examples 1 & 2.

NOVEMBER 16
MULTIPLE REGRESSION AND THE REAGAN VOTE

SPSS Applications Guide, Ch. 12, "Simple and Multiple Linear Regression," only pp. 215-230.

The Guide continues with multiple regression by describing two different methods for selecting independent variables for inclusion in the model. Backward elimination enters all the available independent variables on the first step and then winnows out those that don't help with the explanation. Stepwise works in the opposite direction: it uses the strongest predictor variable first and then adds others that help the prediction. Both methods are, in essence, substitutes for clear theory. If your theory is clear and you want to test a model using variables derived from theory, then use the Enter method. Researchers seldom have such clear theoretical expectations in their minds, so they use whatever crutches they can. Concentrate on stepwise; my examples will use it.

NOVEMBER 19
USING REGRESSION IN VOTING RESEARCH

Research progress reports are due!

We continue our discussion of multiple regression, moving to the point where you can do a multiple regression analysis of your own.

Assignment: Using the states2000 file, choose five variables that you think might provide the best explanation of the states' vote for Bush in 2000. Use your head (i.e., theory) to pick the first five variables, and use the computer to winnow the five down to three (or fewer). This assignment will test your understanding of the SPSS regression procedure.
Use Stepwise as your method for choosing your variables.

NOVEMBER 20
MULTIPLE REGRESSION IN COMPARATIVE RESEARCH
R2 v. Eta2: MULTIPLE REGRESSION v. ANALYSIS OF VARIANCE

Alan Wells, "The Coup d'Etat in Theory and Practice: Independent Black Africa in the 1960s," American Journal of Sociology, 79 (1974), 871-887 (on our website).

Robert W. Jackman, "The Predictability of Coups d'Etat: A Model with African Data," American Political Science Review, 72 (Dec, 1978), 1262-1275 (on our website).

The Wells article uses a dummy variable in a minor way. I will give a more general discussion in class. Jackman's article is much more sophisticated than anything we have read up to now. Compare his analysis to that by Wells on the same topic. Which is more convincing and why? You will not understand everything in Jackman, but you should at least be familiar with most of the statistical methodology. (For those of you who are especially interested in this topic and who wish to explore it further, see the exchange of views, "Explaining African Coups d'Etat," American Political Science Review, 80 (March, 1986), 225-249.)

If you can decipher the Wells and Jackman articles now (and couldn't before the course began), you have come a very long way in a few short weeks. Congratulations, you should now be able to understand 75 percent of the quantitative articles that you are likely to encounter in mainstream political science and sociology journals. If you want to understand more, take a more advanced course in statistical analysis offered in political science or elsewhere in the social sciences.

Having seen that dummy variables can be used in regression analysis, you may have suspected that there is a more general relationship between regression analysis and analysis of variance. There is, and I will demonstrate their similarities in class.
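For those who want to see the relationship between regression and analysis of variance outside SPSS, here is a minimal Python sketch. It is not part of the course materials: the twelve cases, the region labels, and the vote figures are invented for illustration and do not come from the states file, and the function names are hypothetical. It regresses a vote percentage on 0/1 dummies for three of four regions and compares the resulting R2 with Eta2 computed from the group means.

```python
import numpy as np

# Invented toy data (NOT from the states file): 12 made-up cases, 4 regions.
region = np.array(["NE", "NE", "NE", "MW", "MW", "MW",
                   "S", "S", "S", "W", "W", "W"])
vote = np.array([55.0, 58.0, 52.0, 48.0, 50.0, 46.0,
                 42.0, 45.0, 44.0, 51.0, 49.0, 53.0])


def dummy_regression(y, groups, reference):
    """Least-squares fit of y on 0/1 dummies for every group except `reference`."""
    levels = [str(g) for g in np.unique(groups) if g != reference]
    # Design matrix: a column of ones (intercept) plus one dummy per level.
    columns = [np.ones(len(y))] + [(groups == g).astype(float) for g in levels]
    X = np.column_stack(columns)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs
    total_ss = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - (residuals @ residuals) / total_ss
    names = ["intercept"] + levels
    return {n: float(c) for n, c in zip(names, coefs)}, float(r_squared)


def eta_squared(y, groups):
    """Between-group sum of squares divided by total sum of squares."""
    grand_mean = y.mean()
    between_ss = sum(
        np.sum(groups == g) * (y[groups == g].mean() - grand_mean) ** 2
        for g in np.unique(groups)
    )
    return float(between_ss / np.sum((y - grand_mean) ** 2))


# "W" is the omitted (reference) region; the other three get dummies.
coefs, r2 = dummy_regression(vote, region, reference="W")
eta2 = eta_squared(vote, region)
group_means = {str(g): float(vote[region == g].mean()) for g in np.unique(region)}

print("coefficients:", coefs)
print("group means: ", group_means)
print("R squared:", r2, " Eta squared:", eta2)
```

In this setup the intercept is the mean of the omitted region, each dummy coefficient is the difference between that region's mean and the omitted region's mean, and R2 from the regression matches Eta2 from the comparison of group means.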
Assignment: Here is a special challenge: Use SPSS to create "dummy" variables (those which assume a value of either 0 or 1) for 3 of the 4 regions in the American states file. Use these dummy variables to predict Clinton's 1996 vote by each of the regions. This amounts to using a qualitative variable (region) in regression analysis. Then use means (or oneway) to analyze the vote for president with the single variable, region. Compare the two results for similarities and differences.

NOVEMBER 21
No optional session before Thanksgiving break