Word of the Month: Heteroscedasticity (and Homoscedasticity)


In linear regression residual analysis, heteroscedastic results mean that the variance of the errors is not consistent across the input values (see Graphs 1 and 2). A good linear regression model should show the opposite: a random scattering of residuals with no particular pattern. This is called homoscedasticity (see Graph 3).
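To make the distinction concrete, here is a minimal sketch on simulated data (not the data behind the graphs). It fits a straight line by ordinary least squares and compares the variance of the residuals in the upper half of the x range with the variance in the lower half. The helper names (`fit_line`, `residuals`, `spread_ratio`), the "bill total" framing, and all the numbers are assumptions made up for illustration, and the split-sample ratio is only a rough check in the spirit of a Goldfeld-Quandt test, not a formal test.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def residuals(xs, ys):
    """Observed y minus the fitted line, one residual per point."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, res):
    """Residual variance in the upper half of x divided by the lower half."""
    pairs = sorted(zip(xs, res))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(10, 100) for _ in range(400)]  # hypothetical bill totals

# Homoscedastic: the noise has the same spread at every x.
ys_homo = [0.15 * x + rng.gauss(0, 2.0) for x in xs]
# Heteroscedastic: the noise spread grows with x (a fan shape).
ys_hetero = [0.15 * x + rng.gauss(0, 0.05 * x) for x in xs]

ratio_homo = spread_ratio(xs, residuals(xs, ys_homo))
ratio_hetero = spread_ratio(xs, residuals(xs, ys_hetero))
print("homoscedastic spread ratio:  ", round(ratio_homo, 2))
print("heteroscedastic spread ratio:", round(ratio_hetero, 2))
```

A ratio close to 1 suggests the residual spread is about the same across the range, while a ratio well above 1 suggests the fan shape that heteroscedasticity produces.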

  • Graph 1: residual analysis scatterplot (heteroscedastic)
  • Graph 2: residual analysis scatterplot (heteroscedastic)
  • Graph 3: residual analysis scatterplot (homoscedastic)

If your residual analysis results look like Graphs 1 and 2, the model is not a good fit. To fix this, one could transform the data, or add a variable to the model that helps account for the relationship between the errors and the input values.
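As a sketch of the data-transform route, the example below simulates data whose noise is multiplicative, so the errors grow with the input. It then fits a line twice: once on the raw values and once after taking logarithms of both variables. The diagnostic used here, the correlation between x and the absolute residual, is just one simple choice among many, and the function names and simulated numbers are assumptions for illustration only. A log transform is not guaranteed to help with every data set; it suits this one because the noise was built to scale with the signal.

```python
import math
import random


def pearson(us, vs):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = math.sqrt(sum((u - mu) ** 2 for u in us))
    sv = math.sqrt(sum((v - mv) ** 2 for v in vs))
    return cov / (su * sv)


def abs_residual_corr(xs, ys):
    """Fit y = a + b*x by least squares, then correlate |residual| with x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    abs_res = [abs(y - (a + b * x)) for x, y in zip(xs, ys)]
    return pearson(xs, abs_res)


rng = random.Random(1)  # fixed seed so the run is repeatable
xs = [rng.uniform(10, 100) for _ in range(400)]
# Multiplicative noise: the size of the error scales with x.
ys = [0.15 * x * math.exp(rng.gauss(0, 0.3)) for x in xs]

corr_raw = abs_residual_corr(xs, ys)
corr_log = abs_residual_corr([math.log(x) for x in xs],
                             [math.log(y) for y in ys])
print("raw data, corr(x, |residual|):        ", round(corr_raw, 2))
print("log-log data, corr(log x, |residual|):", round(corr_log, 2))
```

On the raw scale the absolute residuals should climb with x, giving a clearly positive correlation. After the log transform the noise becomes additive with constant spread, so the correlation should sit near zero.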

For Graphs 1 and 2, such a variable could be the number of people at a table or the time of day: larger groups sometimes tip less because each person assumes everyone else will tip, and people can be more generous later in the day, after some vino in the evening!
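Here is a rough sketch of the add-a-variable idea, again on simulated data rather than the data behind the graphs. It assumes, purely for illustration, that evening diners tip at a higher rate than lunchtime diners. A model that ignores time of day leaves residuals that fan out as the bill grows; fitting a separate line for each time of day (one simple way of letting the extra variable into the model) removes the fan. The tip rates, the `evening` flag, and the helper names are all hypothetical.

```python
import random
import statistics


def ols_residuals(xs, ys):
    """Residuals from a least-squares fit of y = a + b*x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, res):
    """Residual variance in the upper half of x divided by the lower half."""
    pairs = sorted(zip(xs, res))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


rng = random.Random(2)  # fixed seed so the run is repeatable
bills = [rng.uniform(10, 100) for _ in range(400)]
evening = [rng.random() < 0.5 for _ in bills]  # hypothetical time-of-day flag
# Assumed tip rates: 12% at lunch, 22% in the evening, plus constant noise.
tips = [(0.22 if e else 0.12) * b + rng.gauss(0, 1.0)
        for b, e in zip(bills, evening)]

# Model 1: one line for everyone, time of day ignored.
ratio_pooled = spread_ratio(bills, ols_residuals(bills, tips))

# Model 2: time of day included by fitting one line per group.
grouped_bills, grouped_res = [], []
for flag in (False, True):
    gx = [b for b, e in zip(bills, evening) if e == flag]
    gy = [t for t, e in zip(tips, evening) if e == flag]
    grouped_bills += gx
    grouped_res += ols_residuals(gx, gy)
ratio_grouped = spread_ratio(grouped_bills, grouped_res)

print("spread ratio without time of day:", round(ratio_pooled, 2))
print("spread ratio with time of day:   ", round(ratio_grouped, 2))
```

Without the extra variable the ratio should come out well above 1, because the two groups drift apart as the bill grows. With it, the ratio should fall back toward 1.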

But remember, as the statistician George Box put it, “Essentially, all models are wrong, but some are useful.”

Now that’s what I call statistical bombasticity!
