Omnibus testing and gene filtration in microarray data analysis

Hongying Dai, Richard Charnigo

Research output: Contribution to journal, Article

9 Citations (Scopus)

Abstract

When thousands of tests are performed simultaneously to detect differentially expressed genes in microarray analysis, the number of Type I errors can be immense if a multiplicity adjustment is not made. However, due to the large scale, traditional adjustment methods require very stringent significance levels for individual tests, which yield low power for detecting alterations. In this work, we describe how two omnibus tests can be used in conjunction with a gene filtration process to circumvent difficulties due to the large scale of testing. These two omnibus tests, the D-test and the modified likelihood ratio test (MLRT), can be used to investigate whether a collection of P-values has arisen from the Uniform(0,1) distribution or whether the Uniform(0,1) distribution contaminated by another Beta distribution is more appropriate. In the former case, attention can be directed to a smaller part of the genome; in the latter event, parameter estimates for the contamination model provide a frame of reference for multiple comparisons. Unlike the likelihood ratio test (LRT), both the D-test and MLRT enjoy simple limiting distributions under the null hypothesis of no contamination, so critical values can be obtained from standard tables. Simulation studies demonstrate that the D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test. A case study illustrates omnibus testing and filtration.
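The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not give the D-test or MLRT statistics, so neither is implemented here. Instead, the sketch draws P-values from a Uniform(0,1) distribution contaminated by a Beta distribution and applies the Kolmogorov-Smirnov test, which the abstract names as a comparator, with an asymptotic P-value. The function names, the mixing proportion `gamma`, the Beta parameters, the number of genes, and the 0.05 family-wise level are all illustrative assumptions.

```python
import math
import random


def sample_p_values(n, gamma, alpha, beta, rng):
    """Draw n P-values from (1 - gamma) * Uniform(0,1) + gamma * Beta(alpha, beta).

    gamma = 0 gives the null case of no contamination.
    """
    return [
        rng.betavariate(alpha, beta) if rng.random() < gamma else rng.random()
        for _ in range(n)
    ]


def ks_uniform(p_values):
    """One-sample Kolmogorov-Smirnov test against Uniform(0,1).

    Returns (D, approximate P-value). The P-value uses the asymptotic
    Kolmogorov series, so it is only a large-sample approximation.
    """
    xs = sorted(p_values)
    n = len(xs)
    # Largest gap between the empirical CDF and the Uniform(0,1) CDF F(x) = x.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The truncated series is unreliable for tiny lam; the tail is ~1 there.
        return d, 1.0
    tail = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(1.0, max(0.0, tail))


rng = random.Random(2008)  # fixed seed so the sketch is repeatable
m = 5000                   # hypothetical number of genes (one test per gene)

# Per-test level a Bonferroni adjustment would demand for a 0.05 family-wise level.
print("Bonferroni per-test level:", 0.05 / m)

null_p = sample_p_values(m, 0.0, 0.5, 3.0, rng)   # no contamination
mixed_p = sample_p_values(m, 0.3, 0.5, 3.0, rng)  # 30% from Beta(0.5, 3)

print("null  (D, P):", ks_uniform(null_p))
print("mixed (D, P):", ks_uniform(mixed_p))
```

An omnibus test of this kind asks a single question about the whole collection of P-values, so it does not pay the per-test penalty that a Bonferroni-style adjustment imposes on each of the `m` individual tests.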

Original language: English (US)
Pages (from-to): 31-47
Number of pages: 17
Journal: Journal of Applied Statistics
Volume: 35
Issue number: 1
DOI: 10.1080/02664760701683528
State: Published - Jan 1 2008

Fingerprint

Microarray Data Analysis
Filtration
Modified Likelihood
Likelihood Ratio Test
Gene
Testing
Omnibus Test
Contamination
Adjustment
Kolmogorov-Smirnov Test
Microarray Analysis
Multiple Comparisons
Beta distribution
Significance level
Type I error
Limiting Distribution
Null hypothesis
Tables
Critical value
Multiplicity

Keywords

  • Beta contamination model
  • D-test
  • MLRT
  • MMLEs
  • Multiple comparisons
  • P-values

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Omnibus testing and gene filtration in microarray data analysis. / Dai, Hongying; Charnigo, Richard.

In: Journal of Applied Statistics, Vol. 35, No. 1, 01.01.2008, p. 31-47.

@article{a74f032e7c0143c99fb5819c472166f2,
title = "Omnibus testing and gene filtration in microarray data analysis",
abstract = "When thousands of tests are performed simultaneously to detect differentially expressed genes in microarray analysis, the number of Type I errors can be immense if a multiplicity adjustment is not made. However, due to the large scale, traditional adjustment methods require very stringent significance levels for individual tests, which yield low power for detecting alterations. In this work, we describe how two omnibus tests can be used in conjunction with a gene filtration process to circumvent difficulties due to the large scale of testing. These two omnibus tests, the D-test and the modified likelihood ratio test (MLRT), can be used to investigate whether a collection of P-values has arisen from the Uniform(0,1) distribution or whether the Uniform(0,1) distribution contaminated by another Beta distribution is more appropriate. In the former case, attention can be directed to a smaller part of the genome; in the latter event, parameter estimates for the contamination model provide a frame of reference for multiple comparisons. Unlike the likelihood ratio test (LRT), both the D-test and MLRT enjoy simple limiting distributions under the null hypothesis of no contamination, so critical values can be obtained from standard tables. Simulation studies demonstrate that the D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test. A case study illustrates omnibus testing and filtration.",
keywords = "Beta contamination model, D-test, MLRT, MMLEs, Multiple comparisons, P-values",
author = "Hongying Dai and Richard Charnigo",
year = "2008",
month = "1",
day = "1",
doi = "10.1080/02664760701683528",
language = "English (US)",
volume = "35",
pages = "31--47",
journal = "Journal of Applied Statistics",
issn = "0266-4763",
publisher = "Routledge",
number = "1",

}

TY - JOUR

T1 - Omnibus testing and gene filtration in microarray data analysis

AU - Dai, Hongying

AU - Charnigo, Richard

PY - 2008/1/1

Y1 - 2008/1/1

N2 - When thousands of tests are performed simultaneously to detect differentially expressed genes in microarray analysis, the number of Type I errors can be immense if a multiplicity adjustment is not made. However, due to the large scale, traditional adjustment methods require very stringent significance levels for individual tests, which yield low power for detecting alterations. In this work, we describe how two omnibus tests can be used in conjunction with a gene filtration process to circumvent difficulties due to the large scale of testing. These two omnibus tests, the D-test and the modified likelihood ratio test (MLRT), can be used to investigate whether a collection of P-values has arisen from the Uniform(0,1) distribution or whether the Uniform(0,1) distribution contaminated by another Beta distribution is more appropriate. In the former case, attention can be directed to a smaller part of the genome; in the latter event, parameter estimates for the contamination model provide a frame of reference for multiple comparisons. Unlike the likelihood ratio test (LRT), both the D-test and MLRT enjoy simple limiting distributions under the null hypothesis of no contamination, so critical values can be obtained from standard tables. Simulation studies demonstrate that the D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test. A case study illustrates omnibus testing and filtration.

KW - Beta contamination model

KW - D-test

KW - MLRT

KW - MMLEs

KW - Multiple comparisons

KW - P-values

UR - http://www.scopus.com/inward/record.url?scp=38049098083&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38049098083&partnerID=8YFLogxK

U2 - 10.1080/02664760701683528

DO - 10.1080/02664760701683528

M3 - Article

AN - SCOPUS:38049098083

VL - 35

SP - 31

EP - 47

JO - Journal of Applied Statistics

JF - Journal of Applied Statistics

SN - 0266-4763

IS - 1

ER -