Bayesian pathway analysis of cancer microarray data

Melike Korucuoglu, Senol Isci, Arzucan Ozgur, Hasan H Otu

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

High Throughput Biological Data (HTBD) requires detailed analysis methods and, from a life science perspective, the results make most sense when interpreted within the context of biological pathways. Bayesian Networks (BNs) capture both linear and nonlinear interactions and handle stochastic events in a probabilistic framework that accounts for noise, making them viable candidates for HTBD analysis. We recently proposed an approach, called Bayesian Pathway Analysis (BPA), for analyzing HTBD using BNs, in which known biological pathways are modeled as BNs and the pathways that best explain the given HTBD are identified. BPA uses fold change information to obtain an input matrix that is used to score each pathway modeled as a BN. Scoring uses the Bayesian-Dirichlet Equivalent method, and significance is assessed by randomization via bootstrapping of the columns of the input matrix. In this study, we improve the BPA system by optimizing the steps involved in "Data Preprocessing and Discretization", "Scoring", "Significance Assessment", and "Software and Web Application". We tested the improved system on synthetic data sets and achieved over 98% accuracy in identifying the active pathways. The overall approach was applied to real cancer microarray data sets to investigate the pathways that are commonly active in different cancer types. We compared our findings on the real data sets with those of a related approach, Signaling Pathway Impact Analysis (SPIA).
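
The three steps named in the abstract (discretizing fold changes, scoring a pathway modeled as a BN, and assessing significance by randomizing columns) can be illustrated with a minimal sketch. It is not the BPA implementation. Everything specific here is an assumption made for illustration: the three-state discretization with a two-fold threshold, the BDeu variant of the Bayesian-Dirichlet Equivalent score with an equivalent sample size of 1, the matrix orientation (rows are observations, columns are genes), and the function names `discretize`, `bdeu_score`, and `pathway_p_value`.

```python
import math
import random
from collections import Counter


def discretize(log2_fc, threshold=1.0):
    """Map a log2 fold change to 0 (down), 1 (unchanged) or 2 (up).

    The threshold of 1.0 (a two-fold change) is an illustrative assumption.
    """
    if log2_fc <= -threshold:
        return 0
    if log2_fc >= threshold:
        return 2
    return 1


def bdeu_score(rows, parents, arity=3, ess=1.0):
    """Log BDeu score of a fixed DAG over discrete data.

    rows    -- list of observations; each observation lists one state per node
    parents -- dict mapping a node's column index to a tuple of parent columns
    """
    score = 0.0
    for node, pa in parents.items():
        q = arity ** len(pa)           # number of parent configurations
        a_ij = ess / q                 # prior mass per parent configuration
        a_ijk = ess / (q * arity)      # prior mass per (configuration, state)
        n_ij = Counter(tuple(r[p] for p in pa) for r in rows)
        n_ijk = Counter((tuple(r[p] for p in pa), r[node]) for r in rows)
        # Unobserved configurations contribute zero, so only observed ones are summed.
        for count in n_ij.values():
            score += math.lgamma(a_ij) - math.lgamma(a_ij + count)
        for count in n_ijk.values():
            score += math.lgamma(a_ijk + count) - math.lgamma(a_ijk)
    return score


def pathway_p_value(matrix, genes, parents, n_rand=200, seed=0):
    """Empirical p-value from re-scoring the pathway on randomly drawn columns.

    matrix -- discretized data, rows are observations and columns are genes
    genes  -- columns of `matrix` assigned to pathway nodes 0, 1, 2, ...
    """
    rng = random.Random(seed)

    def score(cols):
        return bdeu_score([[row[c] for c in cols] for row in matrix], parents)

    observed = score(genes)
    n_cols = len(matrix[0])
    # Columns are drawn with replacement, so a column may be picked twice.
    hits = sum(
        score([rng.randrange(n_cols) for _ in genes]) >= observed
        for _ in range(n_rand)
    )
    return observed, (hits + 1) / (n_rand + 1)
```

For a two-gene pathway in which gene 0 regulates gene 1, the call would be `pathway_p_value(matrix, [0, 1], {0: (), 1: (0,)})`. A higher (less negative) score means the data fit the pathway structure better, and the add-one correction keeps the empirical p-value strictly above zero.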

Original language: English (US)
Article number: e102803
Journal: PLoS ONE
Volume: 9
Issue number: 7
DOI: 10.1371/journal.pone.0102803
State: Published - Jul 18 2014

Fingerprint

Bayes Theorem
Bayesian networks
Microarrays
Throughput
Neoplasms
data analysis
Biological Science Disciplines
Random Allocation
Noise
Software
methodology
Datasets

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)

Cite this

Bayesian pathway analysis of cancer microarray data. / Korucuoglu, Melike; Isci, Senol; Ozgur, Arzucan; Otu, Hasan H.

In: PLoS ONE, Vol. 9, No. 7, e102803, 18.07.2014.

@article{936e1b683f1840d9aae32db81ce46159,
title = "Bayesian pathway analysis of cancer microarray data",
author = "Melike Korucuoglu and Senol Isci and Arzucan Ozgur and Otu, {Hasan H}",
year = "2014",
month = "7",
day = "18",
doi = "10.1371/journal.pone.0102803",
language = "English (US)",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "7"
}
