A parallel non-alignment based approach to efficient sequence comparison using longest common subsequences

S. Bhowmick, M. Shafiullah, H. Rai, D. Bastola

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Biological sequence comparison programs have revolutionized the practice of biochemistry, and molecular and evolutionary biology. Pairwise comparison of genomic sequences is a popular method of choice for analyzing genetic sequence data. However, the quality of results from most sequence comparison methods is significantly affected by small perturbations in the data; furthermore, there is a dearth of computational tools to compare sequences beyond a certain length. In this paper, we describe a parallel algorithm for comparing genetic sequences using an alignment-free method based on computing the Longest Common Subsequence (LCS) between genetic sequences. We validate the quality of our results by comparing the phylogenetic trees obtained from ClustalW and LCS. We also show through complexity analysis of the isoefficiency and by empirical measurement of the running time that our algorithm is very scalable.
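For readers unfamiliar with the LCS measure named in the abstract, the following is a minimal serial sketch in Python of the textbook dynamic-programming computation of LCS length. It is not the parallel algorithm described in the paper. The function names `lcs_length` and `lcs_distance`, and the normalization used to turn the LCS length into a dissimilarity score, are assumptions made for illustration only.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (textbook DP)."""
    if len(a) < len(b):
        a, b = b, a  # make b the shorter sequence so each DP row is small
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0]  # column 0: LCS with an empty prefix of b is 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)  # extend the common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip one character
        prev = curr
    return prev[-1]


def lcs_distance(a: str, b: str) -> float:
    """Hypothetical dissimilarity: 1 - LCS length / length of the shorter input.

    This normalization is an assumption, not taken from the paper.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0 if len(a) == len(b) else 1.0
    return 1.0 - lcs_length(a, b) / shorter
```

For example, `lcs_length("AGGTAB", "GXTXAYB")` returns 4 (the common subsequence is `GTAB`). A matrix of pairwise scores such as `lcs_distance` could in principle feed a tree-building method, but how the paper derives its phylogenetic trees from LCS values is not stated in this record.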

Original language: English (US)
Article number: 012012
Journal: Journal of Physics: Conference Series
Volume: 256
Issue number: 1
DOI: 10.1088/1742-6596/256/1/012012
State: Published - Jan 1 2010

Fingerprint

molecular biology
biochemistry
biology
alignment
perturbation

ASJC Scopus subject areas

  • Physics and Astronomy(all)

Cite this

A parallel non-alignment based approach to efficient sequence comparison using longest common subsequences. / Bhowmick, S.; Shafiullah, M.; Rai, H.; Bastola, D.

In: Journal of Physics: Conference Series, Vol. 256, No. 1, 012012, 01.01.2010.

@article{fd81aa27578d4759a70e58b87c741a38,
title = "A parallel non-alignment based approach to efficient sequence comparison using longest common subsequences",
abstract = "Biological sequence comparison programs have revolutionized the practice of biochemistry, and molecular and evolutionary biology. Pairwise comparison of genomic sequences is a popular method of choice for analyzing genetic sequence data. However, the quality of results from most sequence comparison methods is significantly affected by small perturbations in the data; furthermore, there is a dearth of computational tools to compare sequences beyond a certain length. In this paper, we describe a parallel algorithm for comparing genetic sequences using an alignment-free method based on computing the Longest Common Subsequence (LCS) between genetic sequences. We validate the quality of our results by comparing the phylogenetic trees obtained from ClustalW and LCS. We also show through complexity analysis of the isoefficiency and by empirical measurement of the running time that our algorithm is very scalable.",
author = "S. Bhowmick and M. Shafiullah and H. Rai and D. Bastola",
year = "2010",
month = "1",
day = "1",
doi = "10.1088/1742-6596/256/1/012012",
language = "English (US)",
volume = "256",
journal = "Journal of Physics: Conference Series",
issn = "1742-6588",
publisher = "IOP Publishing Ltd.",
number = "1"
}

TY  - JOUR
T1  - A parallel non-alignment based approach to efficient sequence comparison using longest common subsequences
AU  - Bhowmick, S.
AU  - Shafiullah, M.
AU  - Rai, H.
AU  - Bastola, D.
PY  - 2010/1/1
Y1  - 2010/1/1
N2  - Biological sequence comparison programs have revolutionized the practice of biochemistry, and molecular and evolutionary biology. Pairwise comparison of genomic sequences is a popular method of choice for analyzing genetic sequence data. However, the quality of results from most sequence comparison methods is significantly affected by small perturbations in the data; furthermore, there is a dearth of computational tools to compare sequences beyond a certain length. In this paper, we describe a parallel algorithm for comparing genetic sequences using an alignment-free method based on computing the Longest Common Subsequence (LCS) between genetic sequences. We validate the quality of our results by comparing the phylogenetic trees obtained from ClustalW and LCS. We also show through complexity analysis of the isoefficiency and by empirical measurement of the running time that our algorithm is very scalable.
AB  - Biological sequence comparison programs have revolutionized the practice of biochemistry, and molecular and evolutionary biology. Pairwise comparison of genomic sequences is a popular method of choice for analyzing genetic sequence data. However, the quality of results from most sequence comparison methods is significantly affected by small perturbations in the data; furthermore, there is a dearth of computational tools to compare sequences beyond a certain length. In this paper, we describe a parallel algorithm for comparing genetic sequences using an alignment-free method based on computing the Longest Common Subsequence (LCS) between genetic sequences. We validate the quality of our results by comparing the phylogenetic trees obtained from ClustalW and LCS. We also show through complexity analysis of the isoefficiency and by empirical measurement of the running time that our algorithm is very scalable.
UR  - http://www.scopus.com/inward/record.url?scp=79952397362&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79952397362&partnerID=8YFLogxK
U2  - 10.1088/1742-6596/256/1/012012
DO  - 10.1088/1742-6596/256/1/012012
M3  - Article
AN  - SCOPUS:79952397362
VL  - 256
JO  - Journal of Physics: Conference Series
JF  - Journal of Physics: Conference Series
SN  - 1742-6588
IS  - 1
M1  - 012012
ER  -