
The Meaning of R: It is the bivariate r between actual Yi and the Yi predicted by the regression equation

Predicting Female Life Expectancy using two variables in multiple regression

Yesterday, we saw the separate effects of two variables on female life expectancy. One graph showed that female literacy explained 67% of the variance. Another graph showed that the wealth of a society, measured by the logarithm of GDP per capita, explained 69% of the variance.

Suppose we used both variables, female literacy and society's wealth, to explain female life span? We can't simply add together their separate explanations of variation: 67% + 69% = 136%, and it makes no sense to explain more than 100% of the variance. The explained variances can't be added because female literacy and social wealth are themselves correlated at .632. That means the two variables share some of the variation that they explain.

We can use multiple regression analysis to separate their explanations. Here's the result of that multiple regression analysis, first the overall summary, with the R and R²:

The summary box shows that the combined effects of the two variables increased the variance explained to 80%.

The ANOVA box shows that the multiple correlation, R, is significant far beyond the .05 level, for two variables and 85 cases.

The last box reports separate t tests for the variables in the equation, which indicate that each is significant far beyond the .05 level. Here is the final regression equation, built from the information in that box:

Y = 26.229 + 8.738*Log GDP_CAP + .197*FemaleLiteracy

To reproduce the multiple R between the actual life span and the life span predicted by the above equation, we first compute the estimated value from the equation, using "Compute" under the Transform menu in SPSS 10. Then we use the new variable, Estimate, in a simple linear scatterplot against Female Life Expectancy:

Note that the R² from this scatterplot is exactly equal to the R² from the multiple regression analysis.
Thus, the R for a multiple regression equation is equal to the simple r computed between the original dependent variable and the estimated variable predicted by the regression equation.
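The same check can be sketched outside SPSS. The Python example below is only an illustration: it uses synthetic data, not the life expectancy data from the lecture, and the variable names (x1, x2, estimate) and the helper fit_ols are hypothetical. It fits a two-predictor regression with NumPy, computes the estimated values, and compares the multiple R with the simple r between the actual and estimated values. It also shows that the R² values from two correlated predictors do not simply add.

```python
import numpy as np

# Synthetic data: 85 cases, matching the lecture's sample size only in count.
rng = np.random.default_rng(0)
n = 85

# Two correlated predictors (hypothetical stand-ins for log GDP per capita
# and female literacy), and a dependent variable built from both plus noise.
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)
y = 2.0 + 1.5 * x1 + 1.0 * x2 + rng.normal(size=n)


def fit_ols(y, X):
    """Fit y on X plus an intercept by least squares; return (R^2, fitted values)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, fitted


# Separate (bivariate) fits and the combined (multiple) fit.
r2_x1, _ = fit_ols(y, x1)
r2_x2, _ = fit_ols(y, x2)
r2_both, estimate = fit_ols(y, np.column_stack([x1, x2]))

# Multiple R from the regression, and simple r between actual and estimated y.
multiple_R = np.sqrt(r2_both)
simple_r = np.corrcoef(y, estimate)[0, 1]

print("R^2 using x1 alone:       ", round(r2_x1, 3))
print("R^2 using x2 alone:       ", round(r2_x2, 3))
print("Sum of separate R^2:      ", round(r2_x1 + r2_x2, 3))
print("R^2 using both predictors:", round(r2_both, 3))
print("Multiple R:               ", round(multiple_R, 6))
print("Simple r(y, estimate):    ", round(simple_r, 6))
```

In this sketch the last two printed values should agree to rounding error, which is the point made above: the estimated variable plays the role of the "Estimate" variable created with Compute in SPSS.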