TY - JOUR

T1 - Functional principal component analysis for multivariate multidimensional environmental data

AU - Di Salvo, Francesca

AU - Ruggieri, Mariantonietta

AU - Plaia, Antonella

PY - 2015

Y1 - 2015

N2 - Data with spatio-temporal structure arise in many contexts and have generated considerable interest in modelling, but the complexity of spatio-temporal models, together with the size of the datasets, makes the task challenging. Modelling is even more complex in the presence of multivariate data. Since some modelling problems are more natural to think through in functional terms, even if only a finite number of observations is available, treating the data as functional can be useful (Berrendero et al. in Comput Stat Data Anal 55:2619–2634, 2011). Although Ramsay and Silverman (Functional data analysis, 2nd edn. Springer, New York, 2005) also consider the case of multivariate functional data, they do not deal with functions whose domain has more than one dimension (only time is considered as a covariate). In estimating functional data through smoothing methods, generalized additive models (GAM) provide a suitable framework for incorporating space-time structures, while classical dimension reduction techniques for functional data lead to functional principal component analysis (FPCA). In previous work, Ruggieri et al. (J Appl Stat 40(4):795–807, 2013) extended temporal FPCA, that is, FPCA on data modelled as functions of one-dimensional time, to the multivariate context (more than one variable, in our case pollutants). In this paper the computational aspects of FPCA are extended to more than one dimension: space (long, lat) and/or space-time; moreover, multidimensional (spatial, spatio-temporal) FPCA is extended to the multivariate case. To generalize FPCA to data that are simultaneously multidimensional (spatio-temporal) and multivariate, we link GAM with the approach proposed by Ramsay and Silverman (2005). The paper describes all the computational details needed to implement this approach, and its effectiveness is shown on a multivariate spatio-temporal environmental dataset, whose structure is very common in the literature.

KW - Functional principal component analysis; Generalized additive models; Multivariate spatio-temporal data; P-splines; 2300; Statistics, Probability and Uncertainty; Statistics and Probability

UR - http://hdl.handle.net/10447/150014

M3 - Article

VL - 22

SP - 739

EP - 757

JO - Environmental and Ecological Statistics

JF - Environmental and Ecological Statistics

SN - 1352-8505

ER -