Method of predicting Splice Sites based on signal interactions

Alexander Churbanov, Igor B. Rogozin, Jitender S Deogun, Hesham H Ali

Research output: Contribution to journal › Review article

21 Citations (Scopus)

Abstract

Background: Predicting and properly ranking canonical splice sites (SSs) is a challenging problem for the bioinformatics and machine learning communities. Any progress in SS recognition will lead to a better understanding of the splicing mechanism. We introduce several new approaches to combining a priori knowledge for improved SS detection. First, we design a new Bayesian SS sensor based on oligonucleotide counting. To further enhance prediction quality, we apply our new de novo motif detection tool, MHMMotif, to intronic ends and exons. We combine the elements found with sensor information using a Naive Bayesian Network, as implemented in our new tool, SpliceScan.

Results: According to our tests, the Bayesian sensor outperforms the contemporary Maximum Entropy sensor for 5′ SS detection. We report a number of putative Exonic (ESE) and Intronic (ISE) Splicing Enhancers found by the MHMMotif tool. T-test statistics on mouse/rat intronic alignments indicate that the detected elements are, on average, more conserved than other oligos, which supports our assumption of their functional importance. The tool has been shown to outperform the SpliceView, GeneSplicer, NNSplice, Genio and NetUTR tools on the test set of human genes. SpliceScan outperforms all contemporary ab initio gene structural prediction tools on the set of 5′ UTR gene fragments.

Conclusion: The designed methods have many attractive properties compared to existing approaches. The Bayesian sensor, the MHMMotif program and the SpliceScan tool are freely available on our web site.
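The abstract does not spell out how the Bayesian sensor is built, so the following is only a minimal sketch of one way an oligonucleotide-counting classifier could work, not the authors' implementation. It counts position-specific k-mers in aligned true and decoy windows, smooths the counts with a pseudocount, and sums per-position log-likelihood ratios under a naive independence assumption. The function names, the window length, `K = 2`, and the toy sequences are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log

K = 2  # oligonucleotide length; an assumption, not a value from the paper


def kmers(window, k):
    """Return (position, k-mer) features for a fixed-length window."""
    return [(i, window[i:i + k]) for i in range(len(window) - k + 1)]


def train(windows, k, pseudocount=1.0):
    """Estimate log P(k-mer at position | class) from equally long windows."""
    counts = Counter(f for w in windows for f in kmers(w, k))
    # Each window contributes exactly one k-mer per position, so the
    # smoothed probabilities at every position sum to 1 over all 4**k k-mers.
    denom = len(windows) + pseudocount * 4 ** k

    def log_prob(feature):
        return log((counts[feature] + pseudocount) / denom)

    return log_prob


def log_odds(window, true_model, decoy_model, k, log_prior_odds=0.0):
    """Naive Bayes log-odds that the window is a true site rather than a decoy."""
    return log_prior_odds + sum(
        true_model(f) - decoy_model(f) for f in kmers(window, k)
    )


# Made-up toy windows around a GT dinucleotide (not real training data).
TRUE_SITES = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA"]
DECOYS = ["CAGGTCCTT", "TTGGTCTCA", "ACGGTTTCC"]

true_model = train(TRUE_SITES, K)
decoy_model = train(DECOYS, K)

if __name__ == "__main__":
    for candidate in ("AAGGTAAGT", "TTGGTTTCC"):
        print(candidate, round(log_odds(candidate, true_model, decoy_model, K), 3))
```

A positive score favours the true-site class and a negative score favours the decoy class. Additional evidence, such as the presence of enhancer motifs, could in principle be added as further log-ratio terms in the same sum, which is the sense in which a naive Bayesian combination treats signals as independent.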

Original language: English (US)
Article number: 10
Journal: Biology Direct
Volume: 1
DOI: 10.1186/1745-6150-1-10
State: Published - Apr 3 2006

Fingerprint

sensors (equipment)
Interaction
Genes
RNA Splice Sites
5' Untranslated Regions
Entropy
Computational Biology
Oligonucleotides
Exons
methodology
bioinformatics
prediction
structural genes
artificial intelligence

ASJC Scopus subject areas

  • Immunology
  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)
  • Applied Mathematics

Cite this

Method of predicting Splice Sites based on signal interactions. / Churbanov, Alexander; Rogozin, Igor B.; Deogun, Jitender S; Ali, Hesham H.

In: Biology Direct, Vol. 1, 10, 03.04.2006.

@article{f7eeeeeb8d094e0a9b260c6ab83629a8,
title = "Method of predicting Splice Sites based on signal interactions",
abstract = "Background: Predicting and properly ranking canonical splice sites (SSs) is a challenging problem for the bioinformatics and machine learning communities. Any progress in SS recognition will lead to a better understanding of the splicing mechanism. We introduce several new approaches to combining a priori knowledge for improved SS detection. First, we design a new Bayesian SS sensor based on oligonucleotide counting. To further enhance prediction quality, we apply our new de novo motif detection tool, MHMMotif, to intronic ends and exons. We combine the elements found with sensor information using a Naive Bayesian Network, as implemented in our new tool, SpliceScan. Results: According to our tests, the Bayesian sensor outperforms the contemporary Maximum Entropy sensor for 5′ SS detection. We report a number of putative Exonic (ESE) and Intronic (ISE) Splicing Enhancers found by the MHMMotif tool. T-test statistics on mouse/rat intronic alignments indicate that the detected elements are, on average, more conserved than other oligos, which supports our assumption of their functional importance. The tool has been shown to outperform the SpliceView, GeneSplicer, NNSplice, Genio and NetUTR tools on the test set of human genes. SpliceScan outperforms all contemporary ab initio gene structural prediction tools on the set of 5′ UTR gene fragments. Conclusion: The designed methods have many attractive properties compared to existing approaches. The Bayesian sensor, the MHMMotif program and the SpliceScan tool are freely available on our web site.",
author = "Churbanov, Alexander and Rogozin, {Igor B.} and Deogun, {Jitender S} and Ali, {Hesham H}",
year = "2006",
month = "4",
day = "3",
doi = "10.1186/1745-6150-1-10",
language = "English (US)",
volume = "1",
journal = "Biology Direct",
issn = "1745-6150",
publisher = "BioMed Central",
}

TY - JOUR
T1 - Method of predicting Splice Sites based on signal interactions
AU - Churbanov, Alexander
AU - Rogozin, Igor B.
AU - Deogun, Jitender S
AU - Ali, Hesham H
PY - 2006/4/3
Y1 - 2006/4/3
AB - Background: Predicting and properly ranking canonical splice sites (SSs) is a challenging problem for the bioinformatics and machine learning communities. Any progress in SS recognition will lead to a better understanding of the splicing mechanism. We introduce several new approaches to combining a priori knowledge for improved SS detection. First, we design a new Bayesian SS sensor based on oligonucleotide counting. To further enhance prediction quality, we apply our new de novo motif detection tool, MHMMotif, to intronic ends and exons. We combine the elements found with sensor information using a Naive Bayesian Network, as implemented in our new tool, SpliceScan. Results: According to our tests, the Bayesian sensor outperforms the contemporary Maximum Entropy sensor for 5′ SS detection. We report a number of putative Exonic (ESE) and Intronic (ISE) Splicing Enhancers found by the MHMMotif tool. T-test statistics on mouse/rat intronic alignments indicate that the detected elements are, on average, more conserved than other oligos, which supports our assumption of their functional importance. The tool has been shown to outperform the SpliceView, GeneSplicer, NNSplice, Genio and NetUTR tools on the test set of human genes. SpliceScan outperforms all contemporary ab initio gene structural prediction tools on the set of 5′ UTR gene fragments. Conclusion: The designed methods have many attractive properties compared to existing approaches. The Bayesian sensor, the MHMMotif program and the SpliceScan tool are freely available on our web site.
UR - http://www.scopus.com/inward/record.url?scp=33749989756&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749989756&partnerID=8YFLogxK
U2 - 10.1186/1745-6150-1-10
DO - 10.1186/1745-6150-1-10
M3 - Review article
VL - 1
JO - Biology Direct
JF - Biology Direct
SN - 1745-6150
M1 - 10
ER -