**Example from SPSS Users' Guide, Chapter 12, page 198**

- The SPSS *Users' Guide* describes regression analysis using a format that differs from the usual one.
- Most texts use Roman letters in the linear regression equation: $Y_i = a + bX_i + e_i$
  - The $e_i$ term expresses the positive and negative residuals for values of $Y_i$ that lie above and below the regression line.
  - Note that the sum of all positive and negative residuals equals 0; that is, $\sum e_i = 0$.
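The zero-sum property of the residuals can be checked numerically. The sketch below fits a least-squares line to a handful of made-up points (the `x` and `y` values and the helper `fit_line` are illustrative assumptions, not data or code from the *Users' Guide*) and adds up the residuals.

```python
# Illustrative data only; these are not the values analyzed in the Users' Guide.
x = [10.0, 25.0, 40.0, 55.0, 70.0, 85.0]
y = [50.0, 56.0, 58.0, 66.0, 67.0, 75.0]


def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


a, b = fit_line(x, y)

# Residual e_i = observed Y_i minus the value predicted by the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
residual_sum = sum(residuals)

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"sum of residuals = {residual_sum:.10f}")  # zero up to rounding error
```

Some residuals are positive and some negative, and they cancel exactly (apart from floating-point rounding) whenever the fitted line includes an intercept.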


- However, the *Users' Guide* [regrettably] employs a different notation, which appears on page 195: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  - The *Users' Guide* uses Greek letters because it views the regression coefficients as *estimates* of the population parameters.
  - This practice leads to confusion:
    - It differs from the practice of most (but not all) beginning texts that discuss regression.
    - It leaves no good term for what SPSS later refers to as **standardized** regression coefficients, which it calls **betas**.
    - Unfortunately, those **betas** are not the $\beta$s.


- You will have to live
with this practice in the
*Users' Guide*

The text then says:

- The estimates of the model coefficients $\beta_0$ (intercept) and $\beta_1$ (slope) are, respectively, 47.17 and 0.307. So the estimated model is:
  - female life expectancy = 47.17 + 0.307 x female literacy

- But note the disjunction between the text and the table:
  - In the table, 47.170 (the intercept or constant) lies under the B heading, which itself lies under "Unstandardized Coefficients."
  - Under that is the value .307, which is also under the B heading, although it is the $\beta_1$ coefficient.
  - In the next column, headed "Standardized Coefficients," is the **Beta** value of .819, which is not mentioned in the model.

- The point is that the SPSS output is badly labeled, and you need to understand these points:
  - Regardless of what SPSS says in its heading,
    - interpret the **constant** as the **intercept**, which you learned as **a**.
    - interpret the Unstandardized Coefficients under the B heading as the **b** coefficients.
  - Interpret the Standardized Coefficients, Beta, as the **b** coefficients for the same data after all the independent and dependent variables have been **standardized**, that is, transformed into z-scores.
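The relationship between the unstandardized **b** and the standardized Beta can be sketched in a few lines. The data and the helper names (`slope`, `z_scores`) below are illustrative assumptions, not values or code from the *Users' Guide*. The sketch computes Beta two ways: by refitting the line after converting both variables to z-scores, and by rescaling **b** with the two standard deviations.

```python
from statistics import mean, stdev

# Illustrative values only (not the data analyzed in the Users' Guide).
literacy = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
life_exp = [52.0, 57.0, 63.0, 64.0, 72.0, 76.0]


def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


b = slope(literacy, life_exp)                                 # unstandardized b
beta_from_z = slope(z_scores(literacy), z_scores(life_exp))   # slope on z-scores
beta_from_b = b * stdev(literacy) / stdev(life_exp)           # rescaled b

print(f"b = {b:.4f}")
print(f"Beta (refit on z-scores) = {beta_from_z:.4f}")
print(f"Beta (b * sd_x / sd_y)   = {beta_from_b:.4f}")

# With one predictor, Beta equals the correlation r, so Beta squared is R-Square.
# Using the figures reported in the SPSS output: Beta = .819.
print(round(0.819 ** 2, 2))
```

Both routes give the same Beta. With a single independent variable, Beta is also the Pearson correlation, which is why squaring the reported Beta of .819 reproduces the R-Square of about .67.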

- Here is the graph of the data that the *Users' Guide* really should have presented at this point.
- This model, which explains female life expectancy as a function of women's literacy, has this substantive interpretation:
  - Intercept = 47.17
    - In a society in which 0% of the women could read, the expected female life expectancy would be 47.17 years.
  - Slope = .31
    - Starting from an expected life span of 47.17 years in societies in which 0% of the women could read, each 1 percentage point increase in female literacy tends to increase female life expectancy by .31 years.
  - R-Square = .67
    - 67% of the variation in female life expectancy in the world's nations can be explained by female literacy.
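The interpretation of the intercept and slope can be made concrete by plugging literacy rates into the estimated model. This is a minimal sketch using the two coefficients quoted from the text; the function name `predicted_life_expectancy` is a hypothetical helper introduced only for illustration.

```python
# Coefficients quoted in the text: intercept 47.17, slope 0.307.
INTERCEPT = 47.17
SLOPE = 0.307


def predicted_life_expectancy(literacy_pct):
    """Predicted female life expectancy (years) for a female literacy rate in percent."""
    return INTERCEPT + SLOPE * literacy_pct


# At 0% literacy the prediction is just the intercept.
for pct in (0, 50, 100):
    print(pct, round(predicted_life_expectancy(pct), 2))

# One additional percentage point of literacy adds the slope.
one_point_gain = predicted_life_expectancy(51) - predicted_life_expectancy(50)
print(round(one_point_gain, 3))
```

The gain from one extra percentage point is the same wherever it is measured along the line, which is what a constant slope means.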

First, let's consider the means and standard deviations of the variables that we want to correlate:

- Here's the basic regression output:

- Here's the associated scatterplot:

- What's wrong with this plot?
  - The relationship is not linear, but you could not tell this from the table above.
  - GDP per capita is a highly skewed variable that needs to be transformed to produce a more nearly normal distribution.
  - Note that the slope is reported as 0.00. It is so small because one dollar of GDP per capita does not buy much life expectancy.
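The tiny slope is partly a matter of units, which a short sketch can show. The GDP and life-expectancy figures below are made up for illustration (they are not the data behind the SPSS output), and `slope` is a hypothetical helper. Measuring GDP in thousands of dollars instead of dollars multiplies the slope by 1,000 without changing the fit.

```python
# Made-up values for illustration; not the data behind the SPSS output.
gdp_dollars = [500.0, 1500.0, 4000.0, 9000.0, 16000.0, 22000.0]
life_exp = [52.0, 60.0, 67.0, 72.0, 76.0, 78.0]


def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Slope per single dollar of GDP per capita: shows as 0.00 at two decimals.
b_per_dollar = slope(gdp_dollars, life_exp)

# Same data with GDP expressed in thousands of dollars.
gdp_thousands = [g / 1000 for g in gdp_dollars]
b_per_thousand = slope(gdp_thousands, life_exp)

print(f"years per $1:     {b_per_dollar:.2f}")
print(f"years per $1,000: {b_per_thousand:.2f}")
```

Rescaling makes the coefficient readable, but it does nothing about the curvature in the scatterplot; that requires a transformation of the variable itself.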

- Note the improvement in the fit:
  - The percent of explained variation rises from 41% using untransformed GDP per capita to 69% using the log of GDP per capita.
  - The slope now has a value of 14.17, not 0.00 as in the previous model.
  - The slope can be interpreted as follows: for each 10-fold increase in GDP per capita, female life span increases by 14.17 years.
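The effect of the log transformation can be sketched with a small invented data set. The sketch assumes a base-10 logarithm, which is what the "10-fold" interpretation implies; the data and the helper `fit` are hypothetical and are not the values behind the 41% and 69% figures. It compares R-Square for raw and logged GDP and confirms that the logged slope is the predicted change for a 10-fold increase.

```python
import math

# Invented values for illustration; not the data behind the figures in the text.
gdp = [300.0, 1000.0, 3000.0, 10000.0, 30000.0]
life_exp = [50.0, 58.0, 64.0, 72.0, 78.0]


def fit(x, y):
    """Return (a, b, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    return a, b, (sxy * sxy) / (sxx * syy)


# Model 1: life expectancy on raw GDP per capita.
a_raw, b_raw, r2_raw = fit(gdp, life_exp)

# Model 2: life expectancy on log10 of GDP per capita (assumed base 10).
log_gdp = [math.log10(g) for g in gdp]
a_log, b_log, r2_log = fit(log_gdp, life_exp)

print(f"raw GDP:   slope = {b_raw:.2f}, R-Square = {r2_raw:.2f}")
print(f"log10 GDP: slope = {b_log:.2f}, R-Square = {r2_log:.2f}")


def predict_log(g):
    """Prediction from the log model for a GDP per capita of g dollars."""
    return a_log + b_log * math.log10(g)


# A 10-fold increase raises log10(GDP) by exactly 1, so the prediction rises by b_log.
tenfold_gain = predict_log(10000.0) - predict_log(1000.0)
print(f"gain for a 10-fold increase = {tenfold_gain:.2f}")
```

Because a 10-fold increase adds exactly 1 to the base-10 log, the gain is the slope regardless of the starting level: moving from 1,000 to 10,000 has the same predicted effect as moving from 3,000 to 30,000.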