In the framework of preference rankings, the interest can lie in finding which predictors and which interactions are able to explain the observed preference structures. Recent years have seen a remarkable flowering of work on the use of decision trees for clustering preference vectors. Decision trees are useful and intuitive, but they are very unstable: small perturbations of the data can produce large changes in the resulting tree. For this reason, more stable procedures may be needed to cluster ranking data. In this work, following the idea of Bolton (2003), a Projection Pursuit (PP) clustering algorithm for preference data is proposed in order to extract useful information in a low-dimensional subspace, starting from a high-dimensional but mostly empty space.

Projection pursuit clustering is a synthesis of projection pursuit and nonhierarchical clustering methods that simultaneously attempts to cluster the data and to find a low-dimensional representation of this cluster structure. As introduced by Huber (1985), a PP algorithm consists of two components: an index function I(α) that measures the "usefulness" of a projection, and a search algorithm that varies the projection direction so as to find the optimal projections, given the index function I(α) and the data set X. In this work a properly specified projection index function for discrete data is defined: several distances are used to measure the discrepancy between the density of the projected data and the uninteresting uniform density. We also propose diagnostics for finding the optimal number of clusters in projection pursuit clustering. The whole methodology is illustrated and evaluated on one simulated and one real dataset.
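The two components named above, an index function I(α) and a search over directions α, can be illustrated with a minimal sketch. This is not the authors' index or search procedure: the abstract does not say which distances are used, so the total variation distance, the histogram binning, the plain random search, and the names `projection_index` and `random_search` are all assumptions made here for illustration only.

```python
import numpy as np


def projection_index(X, alpha, bins=10):
    """Hypothetical index I(alpha): total variation distance between the
    histogram of the projected data X @ alpha and a uniform histogram.
    The choice of distance and of binning is an assumption of this sketch."""
    alpha = np.asarray(alpha, dtype=float)
    alpha = alpha / np.linalg.norm(alpha)      # work with a unit direction
    z = X @ alpha                              # one-dimensional projection
    counts, _ = np.histogram(z, bins=bins)     # bins span [min(z), max(z)]
    p = counts / counts.sum()                  # empirical bin probabilities
    u = np.full(bins, 1.0 / bins)              # the "uninteresting" uniform
    return 0.5 * np.abs(p - u).sum()


def random_search(X, n_trials=500, bins=10, seed=0):
    """Hypothetical search step: draw random unit directions and keep the
    one with the largest index value. A real PP search would typically use
    a smarter optimizer; random search is used here only for brevity."""
    rng = np.random.default_rng(seed)
    best_alpha, best_value = None, -np.inf
    for _ in range(n_trials):
        alpha = rng.normal(size=X.shape[1])
        alpha /= np.linalg.norm(alpha)
        value = projection_index(X, alpha, bins)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value
```

In this sketch a larger index value means the projected data depart further from uniformity, which is the sense in which a direction is treated as "interesting"; clustering would then be carried out in the subspace spanned by the selected directions.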
|Number of pages||1|
|Publication status||Published - 2018|