Differential expression analysis in RNA-Seq by a naive Bayes classifier with local normalization

Yongchao Dou, Xiaomei Guo, Lingling Yuan, David R. Holding, Chi Zhang

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

To improve the applicability of RNA-seq technology, a large number of RNA-seq data analysis methods and correction algorithms have been developed. Although these new methods and algorithms have steadily improved transcriptome analysis, greater prediction accuracy is needed to better guide experimental designs with computational results. In this study, a new tool for the identification of differentially expressed genes with RNA-seq data, named GExposer, was developed. This tool introduces a local normalization algorithm to reduce the bias of nonrandomly positioned read depth. The naive Bayes classifier is employed to integrate fold change, transcript length, and GC content to identify differentially expressed genes. Results on several independent tests show that GExposer has better performance than other methods. The combination of the local normalization algorithm and naive Bayes classifier with three attributes can achieve better results; both false positive rates and false negative rates are reduced. However, only a small portion of genes is affected by the local normalization and GC content correction.
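The classification step described in the abstract can be illustrated with a minimal sketch. This is not GExposer's implementation: the Gaussian likelihood, the feature scaling (absolute log2 fold change, log10 transcript length, GC fraction), the function names, and the toy data are all assumptions made for illustration only. The abstract states only that a naive Bayes classifier integrates fold change, transcript length, and GC content.

```python
import math

def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        stats = []
        for j in range(len(members[0])):
            col = [r[j] for r in members]
            mean = sum(col) / n
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in col) / n + 1e-9
            stats.append((mean, var))
        model[cls] = (n / len(rows), stats)
    return model

def log_posterior(model, row):
    """Unnormalized log posterior per class; features are treated as independent."""
    scores = {}
    for cls, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        scores[cls] = score
    return scores

def predict(model, row):
    """Return the class with the highest log posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)

# Made-up toy genes (NOT data from the paper):
# (abs log2 fold change, log10 transcript length, GC fraction)
train_rows = [
    (3.1, 3.4, 0.52), (2.7, 3.1, 0.48), (3.5, 3.6, 0.55), (2.9, 3.3, 0.50),
    (0.2, 3.2, 0.47), (0.4, 3.5, 0.53), (0.1, 3.0, 0.49), (0.3, 3.4, 0.51),
]
train_labels = ["DE", "DE", "DE", "DE", "nonDE", "nonDE", "nonDE", "nonDE"]

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, (2.8, 3.3, 0.50)))
print(predict(model, (0.25, 3.3, 0.50)))
```

In this toy setting the fold-change feature dominates the decision because the two classes have nearly identical length and GC distributions. With real data, length and GC content would contribute only to the extent that their distributions differ between differentially and non-differentially expressed genes.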

Original language: English (US)
Article number: 789516
Journal: BioMed Research International
Volume: 2015
DOI: 10.1155/2015/789516
State: Published - Jan 1 2015

Fingerprint

Classifiers
RNA
Genes
Base Composition
Gene Expression Profiling
Design of experiments
Research Design
Technology

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)

Cite this

Differential expression analysis in RNA-Seq by a naive Bayes classifier with local normalization. / Dou, Yongchao; Guo, Xiaomei; Yuan, Lingling; Holding, David R.; Zhang, Chi.

In: BioMed Research International, Vol. 2015, 789516, 01.01.2015.

@article{4cbf91d0056844ae9b72139782f62bac,
title = "Differential expression analysis in RNA-Seq by a naive Bayes classifier with local normalization",
author = "Yongchao Dou and Xiaomei Guo and Lingling Yuan and Holding, {David R.} and Chi Zhang",
year = "2015",
month = "1",
day = "1",
doi = "10.1155/2015/789516",
language = "English (US)",
volume = "2015",
journal = "BioMed Research International",
issn = "2314-6133",
publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Differential expression analysis in RNA-Seq by a naive Bayes classifier with local normalization

AU - Dou, Yongchao

AU - Guo, Xiaomei

AU - Yuan, Lingling

AU - Holding, David R.

AU - Zhang, Chi

PY - 2015/1/1

Y1 - 2015/1/1

UR - http://www.scopus.com/inward/record.url?scp=84939818220&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84939818220&partnerID=8YFLogxK

U2 - 10.1155/2015/789516

DO - 10.1155/2015/789516

M3 - Article

C2 - 26339642

AN - SCOPUS:84939818220

VL - 2015

JO - BioMed Research International

JF - BioMed Research International

SN - 2314-6133

M1 - 789516

ER -