Term: Cluster Analysis

Glossary Definition

Last Updated: 2011-02-22

Definition:

"A set of statistical methods used to group variables or observations in strongly interrelated subgroups" (Last, 2001).

In the agglomerative (bottom-up) form described here, the process starts with each person or object as its own cluster, then merges the items that are most similar and gradually relaxes the grouping criterion until one overall group is formed. Unlike many traditional statistical methods, cluster analysis does not itself calculate the ideal number of statistically different groups; it relies on the analyst, using both mathematical and context-specific knowledge, to decide when the clustering should stop. Once the number of groups is decided, cluster analysis provides a statistic (pseudo R²) that describes the mathematical "fit" of the overall model. Like the R² values produced by ordinary least squares regression, pseudo R² values are largest when the people within each group (cluster) are very similar and the differences between groups are large.
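The procedure described above can be illustrated with a minimal sketch. It is not taken from the cited source and rests on several assumptions: the merge rule is a Ward-style criterion (merge the pair of clusters that increases the within-cluster sum of squares the least), pseudo R² is taken as 1 minus the ratio of within-cluster to total sum of squares (one common definition), and the function names and the sample data are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sum_sq(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    return sum_sq(a + b) - sum_sq(a) - sum_sq(b)


def agglomerate(data, n_clusters):
    """Merge clusters bottom-up until n_clusters remain.

    Assumption: a Ward-style merge rule; other linkage rules are possible.
    """
    clusters = [[p] for p in data]  # each observation starts as its own cluster
    while len(clusters) > n_clusters:
        # Pick the pair (i < j) whose merge is cheapest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def pseudo_r2(clusters):
    """1 - (within-cluster SS / total SS); assumed definition of pseudo R²."""
    all_points = [p for c in clusters for p in c]
    within = sum(sum_sq(c) for c in clusters)
    return 1 - within / sum_sq(all_points)


# Hypothetical observations: two visibly separated groups of three.
observations = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.8), (4.9, 5.2),
]

if __name__ == "__main__":
    # The analyst, not the algorithm, chooses where to stop; printing the fit
    # for several candidate numbers of groups supports that decision.
    for k in range(1, len(observations) + 1):
        print(k, round(pseudo_r2(agglomerate(observations, k)), 4))
```

With this definition, pseudo R² is 0 when everything is in one group and 1 when every observation is its own cluster, so the value alone cannot choose the number of groups. What matters is how much fit is lost at each merge, weighed against context-specific knowledge.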

Related terms 

References 

Term used in