Linear vs. Logistic Regression
 
  • Problems in moving from quantitative (continuous) data to qualitative (categorical) data
  • The common case of a dichotomous dependent variable--voting for candidates A or B.
    • Standard regression produces a linear probability model
      • Coefficients are read as changes in probability--each unit increase in a predictor raises or lowers the predicted probability of the outcome by the coefficient's value.
      • These estimates are unbiased but inefficient, and in practice unstable.
      • Predicted values can range nonsensically above 1.0 or below 0.
    • Alternative, nonlinear forms for probability exist.
  • Logistic transformations of proportions, p
    • A logistic probability unit--logit--is computed by taking the natural logarithm of the ratio of p_i to its complement 1 - p_i (the odds), that is, L_i = log_e(p_i / (1 - p_i)).
    • The resulting value, the logit, is symmetrically distributed around the central value of p=.50.
      • When p=.50, the value of the logit = 0.
      • But as p departs from .5 in either direction, the corresponding logit values depart from 0 (positive and negative) at an increasing rate.
      • Thus, as a dichotomous variable becomes skewed in either direction, the nonlinear function of the logistic transformation differs dramatically from the linear function.
      • Moreover, the logit transformation is undefined when p_i = 0 or 1, because the odds are then 0 or infinite.
    • Knoke and Bohrnstedt (1994) graph linear probability against logistic regression (p. 340):
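
This is not the published figure. As a rough numerical illustration of the outline's points, the sketch below computes logits for proportions on either side of .50 and fits a linear probability model to a small dichotomous variable. It assumes Python with NumPy; the function name `logit` and the data in `x` and `y` are hypothetical and do not come from the workshop or from Knoke and Bohrnstedt.

```python
import numpy as np


def logit(p):
    """Return ln(p / (1 - p)); raise if any p is not strictly between 0 and 1."""
    p = np.asarray(p, dtype=float)
    if np.any((p <= 0.0) | (p >= 1.0)):
        raise ValueError("logit is undefined unless 0 < p < 1")
    return np.log(p / (1.0 - p))


# Logits for proportions placed symmetrically around p = .50.
proportions = np.array([0.10, 0.30, 0.50, 0.70, 0.90])
logits = logit(proportions)
for p, L in zip(proportions, logits):
    print(f"p = {p:.2f}  logit = {L:+.3f}")

# Hypothetical data (not from the source): x is a predictor score and
# y = 1 means a vote for candidate A, y = 0 a vote for candidate B.
x = np.arange(-5, 6, dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# Linear probability model: ordinary least squares fit of y on x.
slope, intercept = np.polyfit(x, y, 1)
linear_pred = intercept + slope * x
print(f"linear predictions range from {linear_pred.min():.3f} "
      f"to {linear_pred.max():.3f}")
```

In this made-up sample the fitted line passes below 0 and above 1 at the extremes of `x`, while the logits are mirror images around p = .50 and grow faster the further p moves from .50. Calling `logit(0.0)` or `logit(1.0)` raises an error in this sketch, reflecting the undefined cases noted in the outline.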