Data heterogeneity within a linear regression framework often calls for a Clusterwise Linear Regression (CLR) procedure, which requires, among other things, estimating the appropriate number of clusters and the cluster membership of each unit. Approaches to estimating a CLR model are essentially based on either the Ordinary Least Squares (OLS) criterion or the likelihood criterion. In this paper, working within the OLS approach, we propose an estimation procedure that combines two components: an algorithm based on a threshold criterion for the coefficient of determination of each cluster, which identifies the appropriate number of clusters, and a modified version of Späth's algorithm, which estimates the cluster membership of each sample unit. A simulation study and an application to a real data set show that the procedure outperforms other algorithms commonly used in the literature.
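To make the two ingredients concrete, the following is a minimal Python sketch, not the authors' procedure. It assumes a generic alternating scheme (fit OLS in each cluster, then reassign every unit to the regression with the smallest squared residual) in place of Späth's exchange algorithm and its modification, which the abstract does not specify. The threshold rule shown here (take the smallest number of clusters for which every cluster's coefficient of determination reaches a threshold) is likewise an assumed reading of the abstract. The function names `fit_clr`, `cluster_r2`, and `select_k`, the threshold value 0.9, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_clr(X, y, k, n_iter=50, seed=0):
    """Alternate per-cluster OLS fits and smallest-residual reassignment."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])      # design matrix with intercept
    labels = rng.integers(0, k, size=n)       # random initial partition
    betas = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            mask = labels == j
            if mask.sum() >= A.shape[1]:      # refit only with enough points
                betas[j] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
        resid2 = (y[:, None] - A @ betas.T) ** 2   # squared residuals, shape (n, k)
        new_labels = resid2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                              # partition is stable
        labels = new_labels
    sse = resid2[np.arange(n), labels].sum()   # total within-cluster SSE
    return labels, betas, sse

def cluster_r2(X, y, labels, betas):
    """Coefficient of determination of each non-empty cluster."""
    A = np.column_stack([np.ones(len(y)), X])
    r2 = []
    for j in range(len(betas)):
        mask = labels == j
        if mask.sum() == 0:
            continue
        yj = y[mask]
        sst = ((yj - yj.mean()) ** 2).sum()
        sse = ((yj - A[mask] @ betas[j]) ** 2).sum()
        r2.append(1.0 - sse / sst if sst > 0 else 1.0)
    return r2

def select_k(X, y, threshold=0.9, k_max=5):
    """Assumed rule: smallest k whose clusters all reach the R^2 threshold."""
    for k in range(1, k_max + 1):
        labels, betas, _ = fit_clr(X, y, k)
        if min(cluster_r2(X, y, labels, betas)) >= threshold:
            return k, labels
    return k_max, labels                       # fall back to the largest k tried

# Synthetic example: two regression lines with small noise (illustrative only).
rng = np.random.default_rng(1)
x = rng.uniform(0, 5, size=60)
group = np.arange(60) % 2
y = np.where(group == 0, 2 * x + 1, -3 * x + 10) + rng.normal(0, 0.1, 60)
k_hat, labels_hat = select_k(x, y, threshold=0.9, k_max=5)
print(k_hat)
```

Because the alternating scheme can stop at a local optimum, a practical version would likely rerun `fit_clr` from several random starts and keep the partition with the smallest total sum of squared errors.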
|Number of pages|17|
|Journal|STATISTICA & APPLICAZIONI|
|Publication status|Published - 2009|
All Science Journal Classification (ASJC) codes
- Statistical and Nonlinear Physics
- Statistics and Probability
- Statistics, Probability and Uncertainty