A genome signature based on Markov modeling

Jian Li, Khalid Sayood

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a "genome signature" for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at both species and strain levels. Based on the model, a simple assumption-free distance measure is proposed for constructing phylogeny trees. The approach avoids problems with word frequency approaches such as balancing word length and window size. The method is shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Application of the model to phylogenetic analysis is presented.

Original language: English (US)
Title of host publication: 2005 IEEE International Conference on Electro Information Technology
State: Published - Dec 1 2005
Event: 2005 IEEE International Conference on Electro Information Technology - Lincoln, NE, United States
Duration: May 22 2005 - May 25 2005

Publication series

Name: 2005 IEEE International Conference on Electro Information Technology
Volume: 2005

Conference

Conference: 2005 IEEE International Conference on Electro Information Technology
Country: United States
City: Lincoln, NE
Period: 5/22/05 - 5/25/05

Fingerprint

Genes

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Li, J., & Sayood, K. (2005). A genome signature based on Markov modeling. In 2005 IEEE International Conference on Electro Information Technology [1627006] (2005 IEEE International Conference on Electro Information Technology; Vol. 2005).

A genome signature based on Markov modeling. / Li, Jian; Sayood, Khalid.

2005 IEEE International Conference on Electro Information Technology. 2005. 1627006 (2005 IEEE International Conference on Electro Information Technology; Vol. 2005).

Li, J & Sayood, K 2005, A genome signature based on Markov modeling. in 2005 IEEE International Conference on Electro Information Technology., 1627006, 2005 IEEE International Conference on Electro Information Technology, vol. 2005, 2005 IEEE International Conference on Electro Information Technology, Lincoln, NE, United States, 5/22/05.
Li J, Sayood K. A genome signature based on Markov modeling. In 2005 IEEE International Conference on Electro Information Technology. 2005. 1627006. (2005 IEEE International Conference on Electro Information Technology).
Li, Jian ; Sayood, Khalid. / A genome signature based on Markov modeling. 2005 IEEE International Conference on Electro Information Technology. 2005. (2005 IEEE International Conference on Electro Information Technology).
@inproceedings{25c348f914904bc08b993097cc7057ce,
title = "A genome signature based on Markov modeling",
abstract = "We propose a {"}genome signature{"} for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at both species and strain levels. Based on the model, a simple assumption-free distance measure is proposed for constructing phylogeny trees. The approach avoids problems with word frequency approaches such as balancing word length and window size. The method is shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Application of the model to phylogenetic analysis is presented.",
author = "Jian Li and Khalid Sayood",
year = "2005",
month = "12",
day = "1",
language = "English (US)",
isbn = "0780392329",
series = "2005 IEEE International Conference on Electro Information Technology",
booktitle = "2005 IEEE International Conference on Electro Information Technology",

}

TY - GEN

T1 - A genome signature based on Markov modeling

AU - Li, Jian

AU - Sayood, Khalid

PY - 2005/12/1

Y1 - 2005/12/1

N2 - We propose a "genome signature" for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at both species and strain levels. Based on the model, a simple assumption-free distance measure is proposed for constructing phylogeny trees. The approach avoids problems with word frequency approaches such as balancing word length and window size. The method is shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Application of the model to phylogenetic analysis is presented.

UR - http://www.scopus.com/inward/record.url?scp=33947132099&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33947132099&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0780392329

SN - 9780780392328

T3 - 2005 IEEE International Conference on Electro Information Technology

BT - 2005 IEEE International Conference on Electro Information Technology

ER -