TY - JOUR
T1 - Exploratory Factor Analysis of Pathway Copy Number Data with an Application Towards the Integration with Gene Expression Data
AU - van Wieringen, W.N.
AU - van de Wiel, M.
PY - 2011
Y1 - 2011
N2 - Realizing that genes often operate together, studies into the molecular biology of cancer shift focus from individual genes to pathways. In order to understand the regulatory mechanisms of a pathway, one must study its genes at all molecular levels. To facilitate such study at the genomic level, we developed exploratory factor analysis for the characterization of the variability of a pathway's copy number data. A latent variable model that describes the call probability data of a pathway is introduced and fitted with an EM algorithm. In two breast cancer data sets, it is shown that the first two latent variables of GO nodes, which inherit a clear interpretation from the call probabilities, are often related to the proportion of aberrations and a contrast of the probabilities of a loss and of a gain. Linking the latent variables to the node's gene expression data suggests that they capture the "global" effect of genomic aberrations on these transcript levels. In all, the proposed method provides an possibly insightful characterization of pathway copy number data, which may be fruitfully exploited to study the interaction between the pathway's DNA copy number aberrations and data from other molecular levels like gene expression. © Copyright 2011, Mary Ann Liebert, Inc. 2011.
AB - Realizing that genes often operate together, studies into the molecular biology of cancer shift focus from individual genes to pathways. In order to understand the regulatory mechanisms of a pathway, one must study its genes at all molecular levels. To facilitate such study at the genomic level, we developed exploratory factor analysis for the characterization of the variability of a pathway's copy number data. A latent variable model that describes the call probability data of a pathway is introduced and fitted with an EM algorithm. In two breast cancer data sets, it is shown that the first two latent variables of GO nodes, which inherit a clear interpretation from the call probabilities, are often related to the proportion of aberrations and a contrast of the probabilities of a loss and of a gain. Linking the latent variables to the node's gene expression data suggests that they capture the "global" effect of genomic aberrations on these transcript levels. In all, the proposed method provides an possibly insightful characterization of pathway copy number data, which may be fruitfully exploited to study the interaction between the pathway's DNA copy number aberrations and data from other molecular levels like gene expression. © Copyright 2011, Mary Ann Liebert, Inc. 2011.
U2 - https://doi.org/10.1089/cmb.2009.0209
DO - https://doi.org/10.1089/cmb.2009.0209
M3 - Article
C2 - 21554018
SN - 1066-5277
VL - 18
SP - 729
EP - 741
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 5
ER -