A genome signature based on Markov modeling

Jian Li, Khalid Sayood

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

We propose a "genome signature" for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at genus, species and strain levels. Based on the model, a simple distance measure is proposed for constructing phylogeny trees. Unlike other genome signatures based on word frequency with problems balancing word length and window size, the method has been shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Applications of the model to phylogenetic analysis and sequence fragment identification are presented.
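The abstract does not spell out how the triplet Markov model or the distance measure is defined, so the following is only a minimal sketch under stated assumptions. It assumes a first-order Markov chain whose states are the 64 non-overlapping nucleotide triplets read in a single frame, and it uses a mean row-wise L1 difference between transition matrices as a stand-in distance. The names `transition_matrix`, `signature_distance`, and the `pseudocount` parameter are hypothetical and do not come from the paper.

```python
from itertools import product

# All 64 nucleotide triplets over the unambiguous DNA alphabet.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]
_VALID = set(TRIPLETS)


def transition_matrix(seq, pseudocount=1.0):
    """Estimate P(next triplet | current triplet) from one sequence.

    Assumption: triplets are non-overlapping and read in frame 0.
    Assumption: additive smoothing with pseudocount > 0 keeps every
    row well defined even when a triplet never occurs.
    """
    seq = seq.upper()
    # Cut the sequence into consecutive non-overlapping triplets.
    words = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = {a: {b: pseudocount for b in TRIPLETS} for a in TRIPLETS}
    for prev, cur in zip(words, words[1:]):
        # Skip transitions that involve ambiguous bases such as N.
        if prev in _VALID and cur in _VALID:
            counts[prev][cur] += 1
    probs = {}
    for a in TRIPLETS:
        row_total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / row_total for b in TRIPLETS}
    return probs


def signature_distance(p, q):
    """Mean over rows of the L1 difference between two transition matrices.

    This is an illustrative stand-in, not the distance proposed in the paper.
    """
    total = sum(abs(p[a][b] - q[a][b]) for a in TRIPLETS for b in TRIPLETS)
    return total / len(TRIPLETS)


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    sig_a = transition_matrix("ATGGCG" * 50)
    sig_b = transition_matrix("ATGAAA" * 50)
    print(signature_distance(sig_a, sig_b))
```

A pairwise matrix of such distances could then be passed to any standard tree-building routine. Whether the paper uses this frame convention, smoothing, or distance is not stated in the abstract.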

Original language: English (US)
Title of host publication: Proceedings of the 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005
Pages: 2832-2835
Number of pages: 4
State: Published - Dec 1 2005
Event: 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005 - Shanghai, China
Duration: Sep 1 2005 to Sep 4 2005

Publication series

Name: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings
Volume: 7 VOLS
ISSN (Print): 0589-1019

Conference

Conference: 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005
Country: China
City: Shanghai
Period: 9/1/05 to 9/4/05

Fingerprint

Genes
Identification (control systems)

ASJC Scopus subject areas

  • Signal Processing
  • Biomedical Engineering
  • Computer Vision and Pattern Recognition
  • Health Informatics

Cite this

Li, J., & Sayood, K. (2005). A genome signature based on Markov modeling. In Proceedings of the 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005 (pp. 2832-2835). [1617063] (Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings; Vol. 7 VOLS).

@inproceedings{3d3c22763e5042bbac974daa1ceefd8d,
title = "A genome signature based on Markov modeling",
abstract = "We propose a {"}genome signature{"} for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at genus, species and strain levels. Based on the model, a simple distance measure is proposed for constructing phylogeny trees. Unlike other genome signatures based on word frequency with problems balancing word length and window size, the method has been shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Applications of the model to phylogenetic analysis and sequence fragment identification are presented.",
author = "Jian Li and Khalid Sayood",
year = "2005",
month = "12",
day = "1",
language = "English (US)",
isbn = "0780387406",
series = "Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings",
pages = "2832--2835",
booktitle = "Proceedings of the 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005",
}

TY - GEN

T1 - A genome signature based on Markov modeling

AU - Li, Jian

AU - Sayood, Khalid

PY - 2005/12/1

Y1 - 2005/12/1

N2 - We propose a "genome signature" for bacterial genomes based on a triplets Markov model. Without the alignment or data preprocessing required by traditional analysis methods, the model is shown to efficiently capture identifying genomic information at genus, species and strain levels. Based on the model, a simple distance measure is proposed for constructing phylogeny trees. Unlike other genome signatures based on word frequency with problems balancing word length and window size, the method has been shown to work successfully with both bacterial whole genome data and individual eukaryotic genes. Applications of the model to phylogenetic analysis and sequence fragment identification are presented.

UR - http://www.scopus.com/inward/record.url?scp=33846926591&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33846926591&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:33846926591

SN - 0780387406

SN - 9780780387409

T3 - Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings

SP - 2832

EP - 2835

BT - Proceedings of the 2005 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005

ER -