On correcting the overestimation of the permutation-based false discovery rate estimator

Shuo Jiao, Shunpu Zhang

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

Motivation: Recent attempts to account for multiple testing in the analysis of microarray data have focused on controlling the false discovery rate (FDR), which is defined as the expected percentage of the number of false positive genes among the claimed significant genes. As a consequence, the accuracy of the FDR estimators will be important for correctly controlling FDR. Xie et al. found that the standard permutation method of estimating FDR is biased and proposed to delete the predicted differentially expressed (DE) genes in the estimation of FDR for one-sample comparison. However, we notice that the formula of the FDR used in their paper is incorrect. This makes the comparison results reported in their paper unconvincing. Other problems with their method include the biased estimation of FDR caused by over- or under-deletion of DE genes in the estimation of FDR and by the implicit use of an unreasonable estimator of the true proportion of equivalently expressed (EE) genes. Due to the great importance of accurate FDR estimation in microarray data analysis, it is necessary to point out such problems and propose improved methods.

Results: Our results confirm that the standard permutation method overestimates the FDR. With the correct FDR formula, we show the method of Xie et al. always gives biased estimation of FDR: it overestimates when the number of claimed significant genes is small, and underestimates when the number of claimed significant genes is large. To overcome these problems, we propose two modifications. The simulation results show that our estimator gives more accurate estimation.
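For orientation, the kind of plain permutation estimator that the abstract calls the "standard permutation method" can be sketched as follows. This is a minimal illustration under stated assumptions, not the formula used by the authors or by Xie et al., which the abstract does not give. It assumes a one-sample design in which each gene has a list of log-ratios, uses an absolute one-sample t-statistic, and builds the null distribution by random sign flips. The names `t_stat`, `permutation_fdr`, `threshold` and `n_perm`, and the simulated data, are all hypothetical. Because null hits are counted over every gene, DE genes included, the sketch implicitly treats the proportion of EE genes as 1; this is one of the concerns the abstract raises about estimators of this kind.

```python
import random


def t_stat(values):
    """One-sample t-statistic for the null hypothesis that the mean is 0."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean / (var / n) ** 0.5


def permutation_fdr(data, threshold, n_perm=200, seed=0):
    """Plain permutation FDR estimate (illustrative assumption, see lead-in).

    data      -- list of genes, each a list of log-ratios (one per sample)
    threshold -- a gene is claimed significant when |t| exceeds this value
    Returns (estimated FDR, number of claimed significant genes).
    """
    rng = random.Random(seed)
    n_sig = sum(abs(t_stat(gene)) > threshold for gene in data)
    if n_sig == 0:
        return 0.0, 0

    n_samples = len(data[0])
    null_hits = 0
    for _ in range(n_perm):
        # One random sign pattern per permutation, shared by all genes.
        signs = [rng.choice((-1, 1)) for _ in range(n_samples)]
        for gene in data:  # every gene contributes, DE genes included
            flipped = [s * x for s, x in zip(signs, gene)]
            if abs(t_stat(flipped)) > threshold:
                null_hits += 1

    expected_false = null_hits / n_perm  # average null hits per permutation
    return min(1.0, expected_false / n_sig), n_sig


# Simulated data (hypothetical): 200 genes, the first 40 are DE with mean 2.
sim = random.Random(42)
n_genes, n_de, n_samples = 200, 40, 10
data = [
    [sim.gauss(2.0 if i < n_de else 0.0, 1.0) for _ in range(n_samples)]
    for i in range(n_genes)
]
estimate, n_sig = permutation_fdr(data, threshold=3.0, n_perm=100)
print(f"claimed significant: {n_sig}, estimated FDR: {estimate:.3f}")
```

In a simulation like this the true DE genes are known, so the estimate could be compared against the realized proportion of EE genes among the claimed significant ones; the abstract reports that the standard estimator tends to come out too high in such comparisons.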

Original language: English (US)
Pages (from-to): 1655-1661
Number of pages: 7
Journal: Bioinformatics
Volume: 24
Issue number: 15
DOI: 10.1093/bioinformatics/btn310
State: Published - Aug 1 2008

Fingerprint

Permutation
Genes
Estimator
Gene
Microarray Analysis
Microarrays
Biased Estimation
Deletion
False
Microarray Data Analysis
Multiple Testing
Comparison Result
Microarray Data
False Positive
Testing
Biased
Percentage
Proportion
Necessary

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

On correcting the overestimation of the permutation-based false discovery rate estimator. / Jiao, Shuo; Zhang, Shunpu.

In: Bioinformatics, Vol. 24, No. 15, 01.08.2008, p. 1655-1661.

Jiao, Shuo ; Zhang, Shunpu. / On correcting the overestimation of the permutation-based false discovery rate estimator. In: Bioinformatics. 2008 ; Vol. 24, No. 15. pp. 1655-1661.
@article{475ee0be5edc4bd2b6b424a4dd8fe23d,
title = "On correcting the overestimation of the permutation-based false discovery rate estimator",
author = "Shuo Jiao and Shunpu Zhang",
year = "2008",
month = "8",
day = "1",
doi = "10.1093/bioinformatics/btn310",
language = "English (US)",
volume = "24",
pages = "1655--1661",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "15",

}

TY - JOUR

T1 - On correcting the overestimation of the permutation-based false discovery rate estimator

AU - Jiao, Shuo

AU - Zhang, Shunpu

PY - 2008/8/1

Y1 - 2008/8/1

UR - http://www.scopus.com/inward/record.url?scp=48249109894&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=48249109894&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btn310

DO - 10.1093/bioinformatics/btn310

M3 - Article

VL - 24

SP - 1655

EP - 1661

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 15

ER -