In many applied fields, such as genomics, different types of data are collected on the same system, and it is not uncommon for some of these datasets to be censored as a result of the measurement technologies used, as with data generated by polymerase chain reaction and flow cytometry. When the overall objective is network inference, possibly at different levels of a system, information coming from different sources and/or different steps of the analysis can be integrated into one model through conditional graphical models. In this paper, we develop a doubly penalized inferential procedure for a conditional Gaussian graphical model when data can be subject to censoring. The computational challenges of handling censored data in high dimensions are met by an efficient expectation-maximization algorithm, based on approximate calculations of the moments of truncated Gaussian distributions and on a suitably derived two-step procedure that alternates graphical lasso with a novel block-coordinate multivariate lasso approach. We evaluate the performance of this approach in an extensive simulation study and on gene expression data generated by RT-qPCR technologies, where we are able to integrate network inference, differential expression detection and data normalization into one model.
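To illustrate the kind of quantity the expectation step needs, the sketch below computes the exact mean and variance of a univariate Gaussian conditioned on exceeding a censoring threshold. This is only a one-dimensional illustration under assumed names (`right_censored_moments`, `std_normal_pdf`, `std_normal_sf` are hypothetical helpers), not the approximate multivariate calculation developed in the paper.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_sf(z):
    """Survival function 1 - Phi(z), computed with erfc for tail accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def right_censored_moments(mu, sigma, upper):
    """Return (E[X | X > upper], Var[X | X > upper]) for X ~ N(mu, sigma^2).

    Hypothetical helper: a value recorded only as "above `upper`" is
    replaced in an E-step by these conditional moments.
    """
    alpha = (upper - mu) / sigma
    # Inverse Mills ratio of the standardized threshold.
    hazard = std_normal_pdf(alpha) / std_normal_sf(alpha)
    mean = mu + sigma * hazard
    var = sigma ** 2 * (1.0 + alpha * hazard - hazard ** 2)
    return mean, var


# Example: a standard normal value known only to exceed 0.
mean, var = right_censored_moments(mu=0.0, sigma=1.0, upper=0.0)
```

For a standard normal censored at zero, the conditional mean equals the square root of 2/pi and the conditional variance equals 1 - 2/pi, which gives a quick sanity check. In the multivariate setting the analogous moments have no simple closed form, which is why approximate calculations are attractive there.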
|Number of pages||17|
|Journal||Statistics and Computing|
|Publication status||Published - 2020|