Exploratory Factor Analysis of Pathway Copy Number Data with an Application Towards the Integration with Gene Expression Data

Research output: Contribution to journal › Article › Academic › peer-review

4 Citations (Scopus)


Recognizing that genes often operate together, studies of the molecular biology of cancer are shifting focus from individual genes to pathways. To understand the regulatory mechanisms of a pathway, one must study its genes at all molecular levels. To facilitate such study at the genomic level, we developed an exploratory factor analysis that characterizes the variability of a pathway's copy number data. A latent variable model that describes the call probability data of a pathway is introduced and fitted with an EM algorithm. In two breast cancer data sets, the first two latent variables of GO nodes, which inherit a clear interpretation from the call probabilities, are shown to be often related to the proportion of aberrations and to a contrast between the probabilities of a loss and of a gain. Linking the latent variables to the node's gene expression data suggests that they capture the "global" effect of genomic aberrations on these transcript levels. In all, the proposed method provides a potentially insightful characterization of pathway copy number data, which may be fruitfully exploited to study the interaction between the pathway's DNA copy number aberrations and data from other molecular levels, such as gene expression. © Copyright 2011, Mary Ann Liebert, Inc.
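To make the idea of fitting a latent variable model with an EM algorithm concrete, the following is a minimal sketch of EM for a standard Gaussian factor analysis model, x = Λz + ε with diagonal noise variances. It is not the article's model, which is tailored to call probability data; the function name `fit_factor_analysis_em`, its parameters, and the synthetic input are assumptions made purely for illustration.

```python
import numpy as np

def fit_factor_analysis_em(X, n_factors=2, n_iter=200, seed=0):
    """EM for a generic Gaussian factor model (illustrative, not the paper's model).

    X is an (n_samples, n_features) array; returns loadings (p x k),
    unique variances (p,), and posterior-mean factor scores (n x k).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each feature
    n, p = Xc.shape
    k = n_factors
    s_diag = (Xc ** 2).mean(axis=0)          # diagonal of the sample covariance
    lam = rng.normal(scale=0.1, size=(p, k)) # random initial loadings
    psi = np.maximum(s_diag, 1e-6)           # initial unique variances

    def posterior_weights(lam, psi):
        # beta = Lambda^T (Lambda Lambda^T + Psi)^{-1}, shape (k, p)
        sigma = lam @ lam.T + np.diag(psi)
        return np.linalg.solve(sigma, lam).T

    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        beta = posterior_weights(lam, psi)
        ez = Xc @ beta.T                                 # E[z | x], shape (n, k)
        ezz = n * (np.eye(k) - beta @ lam) + ez.T @ ez   # sum of E[z z^T | x]
        # M-step: update loadings and unique variances
        xz = Xc.T @ ez                                   # shape (p, k)
        lam = xz @ np.linalg.inv(ezz)
        psi = np.maximum(s_diag - np.sum(lam * xz, axis=1) / n, 1e-6)

    scores = Xc @ posterior_weights(lam, psi).T          # scores under final fit
    return lam, psi, scores

# Synthetic stand-in for a samples-by-genes matrix (hypothetical data).
rng = np.random.default_rng(1)
toy = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 8)) \
    + 0.3 * rng.normal(size=(100, 8))
loadings, unique_var, scores = fit_factor_analysis_em(toy, n_factors=2)
```

In this sketch the columns of `scores` play the role of the latent variables per sample; in the setting described above, such variables would then be related to quantities like the proportion of aberrations or to gene expression data.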
Original language: English
Pages (from-to): 729-741
Journal: Journal of Computational Biology
Issue number: 5
Publication status: Published - 2011
