Definitions

# Homoscedasticity

In statistics, a sequence or a vector of random variables is homoskedastic if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity. The alternative spelling homo- or heteroscedasticity is equally correct and is also used frequently.

As described by Joshua Isaac Walters, "the assumption of homoskedasticity simplifies mathematical and computational treatment and usually leads to adequate estimation results (e.g. in data mining) even if the assumption is not true." Serious violations in homoskedasticity (assuming a distribution of data is homoskedastic when in actuality it is heteroskedastic) result in underemphasizing the Pearson coefficient.

In a scatterplot of data, homoskedasticity looks like an oval (most x values are concentrated around the mean of y, with fewer and fewer x values as y becomes more extreme in either direction). If a scatterplot looks like any geometric shape other than an oval, the rules of homoskedasticity may have been violated.

## Assumptions of a regression model

In simple linear regression analysis, one assumption of the fitted model is that the standard deviations of the error terms are constant and do not depend on the x-value. Consequently, each probability distribution for y (response variable) has the same standard deviation regardless of the x-value (predictor). In short, this assumption is homoskedasticity.

## Testing

Residuals can be tested for homoskedasticity using the Breusch-Pagan test, which regresses square residuals to independent variables. The BP test is sensitive to normality so for general purpose the Koenkar-Basset or generalized Breusch-Pagan test statistic is used. For testing for groupwise heteroskedasticity, the Goldfeld-Quandt test is needed.

## Homoskedastic distributions

Two or more normal distributions, $N\left(mu_i,Sigma_i\right)$, are homoskedastic if they share a common covariance (or correlation) matrix, $Sigma_i = Sigma_j, forall i,j$. Homoskedastic distributions are especially useful to derive statistical pattern recognition and machine learning algorithms. One popular example is Fisher's linear discriminant analysis.