Comparing transformation methods for DNA microarray data

Helene H. Thygesen; Aeilko H. Zwinderman

doi:https://doi.org/10.1186/1471-2105-5-77

Comparing transformation methods for DNA microarray data

Helene H. Thygesen, Aeilko H. Zwinderman

Epidemiology and Data Science (AMC)

Research output: Contribution to journal › Article › Academic › peer-review

9 Citations (Scopus)

Abstract

Background: When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtracting the reference signal, smoothing ( to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer. Results: We used the ratio between biological variance and measurement variance (which is an F-like statistic) as a quality measure for transformation methods, and we demonstrate a method for maximizing that variance ratio on real data. We explore a number of transformations issues, including Box-Cox transformation, baseline shift, partial subtraction of the log-reference signal and smoothing. It appears that the optimal choice of parameters for the transformation methods depends on the data. Further, the behavior of the variance ratio, under the null hypothesis of zero biological variance, appears to depend on the choice of parameters. Conclusions: The use of replicates in microarray experiments is important. Adjustment for the null-hypothesis behavior of the variance ratio is critical to the selection of transformation method

Original language	English
Pages (from-to)	77
Journal	BMC Bioinformatics
Volume	5
DOIs	https://doi.org/10.1186/1471-2105-5-77
Publication status	Published - 2004

Access to Document

https://doi.org/10.1186/1471-2105-5-77

Cite this

@article{2ede78806e6047839458b2dcbf0966e8,

title = "Comparing transformation methods for DNA microarray data",

abstract = "Background: When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtracting the reference signal, smoothing ( to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer. Results: We used the ratio between biological variance and measurement variance (which is an F-like statistic) as a quality measure for transformation methods, and we demonstrate a method for maximizing that variance ratio on real data. We explore a number of transformations issues, including Box-Cox transformation, baseline shift, partial subtraction of the log-reference signal and smoothing. It appears that the optimal choice of parameters for the transformation methods depends on the data. Further, the behavior of the variance ratio, under the null hypothesis of zero biological variance, appears to depend on the choice of parameters. Conclusions: The use of replicates in microarray experiments is important. Adjustment for the null-hypothesis behavior of the variance ratio is critical to the selection of transformation method",

author = "Thygesen, {Helene H.} and Zwinderman, {Aeilko H.}",

year = "2004",

doi = "https://doi.org/10.1186/1471-2105-5-77",

language = "English",

volume = "5",

pages = "77",

journal = "BMC Bioinformatics",

issn = "1471-2105",

publisher = "BioMed Central",

}

TY - JOUR

T1 - Comparing transformation methods for DNA microarray data

AU - Thygesen, Helene H.

AU - Zwinderman, Aeilko H.

PY - 2004

Y1 - 2004

N2 - Background: When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtracting the reference signal, smoothing ( to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer. Results: We used the ratio between biological variance and measurement variance (which is an F-like statistic) as a quality measure for transformation methods, and we demonstrate a method for maximizing that variance ratio on real data. We explore a number of transformations issues, including Box-Cox transformation, baseline shift, partial subtraction of the log-reference signal and smoothing. It appears that the optimal choice of parameters for the transformation methods depends on the data. Further, the behavior of the variance ratio, under the null hypothesis of zero biological variance, appears to depend on the choice of parameters. Conclusions: The use of replicates in microarray experiments is important. Adjustment for the null-hypothesis behavior of the variance ratio is critical to the selection of transformation method

AB - Background: When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtracting the reference signal, smoothing ( to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer. Results: We used the ratio between biological variance and measurement variance (which is an F-like statistic) as a quality measure for transformation methods, and we demonstrate a method for maximizing that variance ratio on real data. We explore a number of transformations issues, including Box-Cox transformation, baseline shift, partial subtraction of the log-reference signal and smoothing. It appears that the optimal choice of parameters for the transformation methods depends on the data. Further, the behavior of the variance ratio, under the null hypothesis of zero biological variance, appears to depend on the choice of parameters. Conclusions: The use of replicates in microarray experiments is important. Adjustment for the null-hypothesis behavior of the variance ratio is critical to the selection of transformation method

U2 - https://doi.org/10.1186/1471-2105-5-77

DO - https://doi.org/10.1186/1471-2105-5-77

M3 - Article

C2 - 15202953

SN - 1471-2105

VL - 5

SP - 77

JO - BMC Bioinformatics

JF - BMC Bioinformatics

ER -