A Monte Carlo investigation of two distance measures between statistical populations and their application to cluster analysis
This paper presents a simulation study of a well-known hierarchical cluster analysis method applied to the classification of statistical populations; in particular, the problem of clustering univariate normal populations is studied. Two measures of the distance between statistical populations are considered: the Mahalanobis distance, which is defined for normally distributed populations under the assumption of equal covariance matrices, and the Kullback-Leibler divergence (the so-called generalized Mahalanobis distance), whose use extends to populations with arbitrary distributions. The simulation study concerns a set of 15 univariate normal populations whose variances are changed in successive steps. The aim is to study the robustness of the nearest-neighbour method to departures from the equal-variance assumption when the Mahalanobis distance formula is applied. The differences between the two cluster families obtained for the same set of populations, but with the two different distance matrices, are studied. The distance between the two final cluster sets is measured by means of the Marczewski-Steinhaus distance.
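The quantities named in the abstract have standard closed forms for univariate normal populations, and the following sketch (not taken from the paper itself) illustrates them: the univariate Mahalanobis distance under a common standard deviation, the Kullback-Leibler divergence between two univariate normals together with its symmetrized form (the J-divergence often called the generalized Mahalanobis distance), and the Marczewski-Steinhaus distance between two finite sets. The function names are illustrative, not from the source.

```python
import math

def mahalanobis_1d(mu1, mu2, sigma):
    """Univariate Mahalanobis distance, assuming a common std. dev. sigma."""
    return abs(mu1 - mu2) / sigma

def kl_divergence_1d(mu1, sigma1, mu2, sigma2):
    """KL divergence KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))
    between two univariate normal distributions (closed form)."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

def j_divergence_1d(mu1, sigma1, mu2, sigma2):
    """Symmetrized KL divergence (J-divergence); when sigma1 == sigma2
    it reduces to the squared univariate Mahalanobis distance."""
    return (kl_divergence_1d(mu1, sigma1, mu2, sigma2)
            + kl_divergence_1d(mu2, sigma2, mu1, sigma1))

def marczewski_steinhaus(A, B):
    """Marczewski-Steinhaus distance between finite sets A and B:
    |A symmetric-difference B| / |A union B| (i.e. 1 - Jaccard index)."""
    union = A | B
    if not union:
        return 0.0
    return len(A ^ B) / len(union)
```

When the variances coincide, `j_divergence_1d` equals `mahalanobis_1d` squared, which is exactly the situation in which the two distance matrices in the study would produce identical cluster families; the interesting cases arise once the variances are perturbed.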