User:Se187r/sandbox

Cluster analysis and multidimensional scaling
"For some multivariate techniques such as multidimensional scaling and cluster analysis, the concept of distance between the units in the data is often of considerable interest and importance … When the variables in a multivariate data set are on different scales, it makes more sense to calculate the distances after some form of standardization."

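As a rough illustration of the point in the quotation, the sketch below computes Euclidean distances before and after z-score standardization. The data values, the variable meanings (height in metres, income in dollars), and the helper names `standardize` and `pairwise_euclidean` are assumptions made up for this example; they do not come from the quoted source.

```python
import numpy as np

def standardize(X):
    """Column-wise z-scores: subtract the mean, divide by the sample SD."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pairwise_euclidean(X):
    """Matrix of Euclidean distances between all pairs of rows."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical units: column 0 is height (m), column 1 is income ($).
data = np.array([[1.60, 30000.0],
                 [1.90, 30500.0],
                 [1.62, 60000.0]])

raw = pairwise_euclidean(data)                  # dominated by the income column
scaled = pairwise_euclidean(standardize(data))  # both columns contribute
```

On the raw scale the income column decides almost all of the distance, so units 0 and 1 look close even though their heights differ a great deal. After standardization each variable has unit standard deviation and contributes on a comparable footing.
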
Principal components analysis
In principal components analysis, "Variables measured on different scales or on a common scale with widely differing ranges are often standardized."
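
A minimal sketch of why this matters, assuming simulated data and a hypothetical helper `component_variances`: principal components of unstandardized variables come from the covariance matrix and tend to be dominated by whichever variable has the largest variance, whereas principal components of standardized variables come from the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two related variables on very different scales.
x1 = rng.normal(size=200)
x2 = 1000.0 * (0.5 * x1 + rng.normal(size=200))
X = np.column_stack([x1, x2])

def component_variances(X):
    """Principal component variances (eigenvalues of the covariance matrix)."""
    cov = np.cov(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

# Standardize each column to mean 0 and sample SD 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

raw_vars = component_variances(X)
std_vars = component_variances(Z)

# Share of total variance carried by the first component on the raw scale.
raw_share = raw_vars[0] / raw_vars.sum()

# Eigenvalues of the correlation matrix of the raw data, for comparison.
corr_vars = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
```

Here `std_vars` and `corr_vars` agree up to rounding, and on the raw scale the first component is essentially just the large-variance variable `x2`.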

Relative importance of variables in multiple regression: Standardized regression coefficients
Standardization of variables prior to multiple regression analysis is sometimes used as an aid to interpretation. Afifi et al. (p. 95) state the following.

"The standardized regression slope is the slope in the regression equation if X and Y are standardized… Standardization of X and Y is done by subtracting the respective means from each set of observations and dividing by the respective standard deviations… In multiple regression, where several X variables are used, the standardized regression coefficients quantify the relative contribution of each X variable."

However, Kutner et al. (p. 278) give the following caveat: "… one must be cautious about interpreting any regression coefficients, whether standardized or not. The reason is that when the predictor variables are correlated among themselves, … the regression coefficients are affected by the other predictor variables in the model … The magnitudes of the standardized regression coefficients are affected not only by the presence of correlations among the predictor variables but also by the spacings of the observations on each of these variables. Sometimes these spacings may be quite arbitrary. Hence, it is ordinarily not wise to interpret the magnitudes of standardized regression coefficients as reflecting the comparative importance of the predictor variables."
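
Both quotations can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are simulated, and the helpers `zscore`, `ols_slopes` and `standardized_slopes` are hypothetical names introduced here. It first checks that regressing standardized Y on standardized X gives the raw slopes multiplied by sd(X)/sd(Y), and then fits the same underlying relationship on two designs that differ only in how widely the first predictor is spaced.

```python
import numpy as np

def zscore(a):
    """Subtract the mean and divide by the sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def ols_slopes(X, y):
    """Least-squares slopes; an intercept is fitted and then dropped."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1:]

def standardized_slopes(X, y):
    """Slopes from regressing standardized y on standardized X."""
    return ols_slopes(zscore(X), zscore(y))

rng = np.random.default_rng(1)
n = 200
noise = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)

# Two hypothetical designs that differ only in the spacing of x1.
designs = {"narrow": np.linspace(0.0, 1.0, n),
           "wide": np.linspace(0.0, 10.0, n)}

results = {}
for name, x1 in designs.items():
    X = np.column_stack([x1, x2])
    y = 1.0 + 2.0 * x1 + 1.0 * x2 + noise  # same true raw slopes in both designs
    b = ols_slopes(X, y)
    beta = standardized_slopes(X, y)
    # Identity: beta_j = b_j * sd(x_j) / sd(y)
    rescaled = b * X.std(axis=0, ddof=1) / y.std(ddof=1)
    results[name] = {"raw": b, "standardized": beta, "rescaled": rescaled}
```

In both designs the `standardized` and `rescaled` entries coincide up to rounding. The true raw slope of `x1` is the same in both designs, yet its standardized coefficient is considerably larger in the wide design, simply because the observations on `x1` are more spread out. This is the dependence on spacing that Kutner et al. warn about.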