OMGT 6683 - Handout 3
Heteroscedasticity

One of the primary assumptions of the regression model is homoscedasticity. This assumption states that the variance of the error term of the regression model is the same (constant) for all values of X (that is, for all observations). The least squares procedure therefore treats the expected variability of every observation as equal and assigns the same weight to each observation when minimizing the sum of squared errors (SSE).
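Stated in symbols, as a sketch for the one-predictor case: the notation below (coefficients \beta_0 and \beta_1, their estimates b_0 and b_1, error term \varepsilon_i, common variance \sigma^2) is assumed for illustration and is not taken from the handout.

```latex
% Model with a constant error variance (homoscedasticity)
y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad
\operatorname{Var}(\varepsilon_i \mid X_i) = \sigma^2 \quad \text{for all } i

% Least squares criterion: every squared residual carries the same weight
\mathrm{SSE} = \sum_{i=1}^{n} \bigl( y_i - b_0 - b_1 X_i \bigr)^2
```

Under heteroscedasticity the single \sigma^2 is replaced by a separate \sigma_i^2 for each observation, while the SSE criterion still weights all observations equally.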

If this assumption is violated, heteroscedasticity (unequal variance) is present. If the problem is not corrected, confidence intervals, prediction intervals, and hypothesis tests based on the regression results can be inaccurate. Heteroscedasticity does not affect the unbiasedness of the regression coefficients; however, it does affect their standard errors. The least squares estimates are no longer efficient: their sampling variability is larger than that of an estimator that accounts for the unequal variances, and the loss matters most when the sample size is small. In addition, the standard errors reported by the usual formulas are no longer reliable.
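One way to see the effect on standard errors is to compare, for a fixed set of X values, the actual sampling variance of the least squares slope with the value that the usual formula s^2 / Sxx reports on average. The sketch below is illustrative only: the design (50 equally spaced X values, error standard deviation proportional to X) and all variable names are assumptions, not part of the handout.

```python
# Hypothetical design: 50 equally spaced X values from 1 to 10,
# with error standard deviation assumed proportional to X.
n = 50
x = [1.0 + 9.0 * i / (n - 1) for i in range(n)]
sigma2 = [(0.5 * xi) ** 2 for xi in x]  # assumed error variances

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Actual sampling variance of the OLS slope when the variances differ:
# Var(b1) = sum((x_i - x_bar)^2 * sigma_i^2) / Sxx^2
true_var = sum((xi - x_bar) ** 2 * s2 for xi, s2 in zip(x, sigma2)) / sxx ** 2

# Expected value of the usual estimate s^2 / Sxx, where s^2 = SSE / (n - 2)
# and E[SSE] = sum((1 - h_i) * sigma_i^2), with h_i the leverage of point i.
leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
expected_s2 = sum((1.0 - h) * s2 for h, s2 in zip(leverage, sigma2)) / (n - 2)
usual_var = expected_s2 / sxx

# Ratio of the actual standard error to the one the usual formula targets.
ratio = (true_var / usual_var) ** 0.5
print(f"actual SE / usual SE = {ratio:.3f}")
```

In this particular design the usual formula understates the slope's standard error, because the largest variances fall at X values far from the mean. In other designs, for example when the variance is largest near the mean of X, the usual formula can overstate it instead.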
Heteroscedasticity commonly occurs when the response follows a distribution in which the variance is functionally related to the mean. Often, but not always, nonnormality also occurs. In these instances, identifying the distribution of the residuals allows the researcher to transform the data and correct the problem. Handout 4 gives three common examples of heteroscedasticity plots.
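As an illustration of a variance tied to the mean, the following sketch simulates a response whose standard deviation is proportional to its mean and then applies a log transformation. The multiplicative error model, the choice of the log transform, and every name in the code are assumptions made for this example; other mean-variance relationships call for other transformations.

```python
import math
import random
import statistics

def simulate_group(level, n, rng, sigma=0.3):
    """Draw n responses whose standard deviation is proportional to their mean."""
    # Multiplicative error: y = level * exp(e), with e ~ Normal(0, sigma).
    return [level * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
low = simulate_group(10.0, 2000, rng)    # group with a small mean
high = simulate_group(100.0, 2000, rng)  # group with a large mean

# On the original scale the spread grows with the mean.
raw_ratio = statistics.stdev(high) / statistics.stdev(low)

# After taking logs the spread is about the same in both groups.
log_low = [math.log(y) for y in low]
log_high = [math.log(y) for y in high]
log_ratio = statistics.stdev(log_high) / statistics.stdev(log_low)

print(f"SD ratio (high / low), original scale: {raw_ratio:.2f}")
print(f"SD ratio (high / low), log scale:      {log_ratio:.2f}")
```

A ratio far from 1 on the original scale and close to 1 after the transformation is what a successful variance-stabilizing transformation looks like in this setting.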

Heteroscedasticity is often observed in cross-sectional studies where, for example, the variation in consumption expenditures (C) grows larger as the level of income (Y) rises. Graphically, this situation looks like graph (c) on Handout 4. The underlying reason is that when data are obtained by sampling, the variances of the income subgroups in the population may differ. A common example is sampling across the high-income and low-income sections of a city. The spending behavior of the rich and the poor can differ, as shown by the large and small variances of the consumption variable (C). Among high-income people, spending patterns may vary greatly, reflecting the many alternative ways they can spend their income, while among low-income people most income is spent on daily necessities, resulting in a smaller variance.
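The consumption and income pattern can be mimicked with simulated data. The sketch below is a hypothetical illustration: the income range, the coefficients 5.0 and 0.8, and the assumption that the error standard deviation equals 10 percent of income are all invented for the example. It fits a least squares line and compares the residual spread in the lower and upper halves of the income distribution.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical cross-section: income Y (in thousands) and consumption C.
income = [rng.uniform(10.0, 100.0) for _ in range(1000)]
# Assumed model: the error standard deviation grows in proportion to income.
consumption = [5.0 + 0.8 * y + rng.gauss(0.0, 0.1 * y) for y in income]

# Ordinary least squares for one predictor (closed-form solution).
mean_y = statistics.mean(income)
mean_c = statistics.mean(consumption)
sxy = sum((y - mean_y) * (c - mean_c) for y, c in zip(income, consumption))
sxx = sum((y - mean_y) ** 2 for y in income)
slope = sxy / sxx
intercept = mean_c - slope * mean_y

residuals = [c - (intercept + slope * y) for y, c in zip(income, consumption)]

# Compare the residual spread below and above the median income.
cutoff = statistics.median(income)
low_res = [r for y, r in zip(income, residuals) if y <= cutoff]
high_res = [r for y, r in zip(income, residuals) if y > cutoff]
low_sd = statistics.stdev(low_res)
high_sd = statistics.stdev(high_res)

print(f"fitted slope: {slope:.3f}")
print(f"residual SD, lower-income half:  {low_sd:.2f}")
print(f"residual SD, higher-income half: {high_sd:.2f}")
```

The fitted slope stays close to the value used to generate the data, while the residuals fan out as income rises, which is the pattern described above.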

Sources:

(1) Applied Linear Regression Models, Neter, Wasserman, and Kutner, Irwin Publishing, 1983.

(2) Data Analysis Using Regression Models: The Business Perspective, Edward W. Frees, Prentice Hall, 1996.

(3) Regression Analysis: Statistical Modeling of a Response Variable, Freund and Wilson, Academic Press, 1998.