A divide-and-conquer approach to fragment assembly

Research output: Contribution to journal, Article

17 Citations (Scopus)

Abstract

Motivation: One of the major problems in DNA sequencing is assembling the fragments obtained by shotgun sequencing. Most existing fragment assembly techniques follow the overlap-layout-consensus approach. This framework requires extensive computation in each phase and becomes inefficient with increasing number of fragments.

Results: We propose a new algorithm which solves the overlap, layout, and consensus phases simultaneously. The fragments are clustered with respect to their Average Mutual Information (AMI) profiles using the k-means algorithm. This removes the unnecessary burden of considering the collection of fragments as a whole. Instead, the orientation and overlap detection are solved efficiently, within the clusters. The algorithm has successfully reconstructed both artificial and real data.
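
The clustering step named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The lag range, the base-2 logarithm, the plain Lloyd-style k-means, and the names `ami_profile`, `sq_dist`, and `kmeans` are all assumptions made for illustration, and the example fragments are made up. The sketch only shows the general idea of turning each fragment into a fixed-length vector of mutual-information values and grouping those vectors; it does not attempt orientation or overlap detection.

```python
import math
import random
from collections import Counter


def ami_profile(seq, max_lag=8):
    """Mutual information (in bits) between bases k positions apart, k = 1..max_lag."""
    seq = seq.upper()
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    base_freq = Counter(seq)  # single-base counts over the whole fragment
    profile = []
    for k in range(1, max_lag + 1):
        pairs = Counter(zip(seq, seq[k:]))  # counts of (base at i, base at i + k)
        total = n - k
        mi = 0.0
        for (x, y), count in pairs.items():
            p_xy = count / total
            p_x = base_freq[x] / n
            p_y = base_freq[y] / n
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
        profile.append(mi)
    return profile


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Hypothetical fragments, for illustration only.
    fragments = [
        "ACGTACGTACGTACGTACGT",
        "CGTACGTACGTACGTACGTA",
        "AAAAAAAAAATTTTTTTTTT",
        "AAAAAAAAATTTTTTTTTTT",
    ]
    profiles = [ami_profile(f, max_lag=4) for f in fragments]
    print(kmeans(profiles, k=2))
```

Because each cluster is then handled separately, later pairwise comparisons only need to run among fragments that share a label rather than across the whole collection, which is the saving the abstract describes.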

Original language: English (US)
Pages (from-to): 22-29
Number of pages: 8
Journal: Bioinformatics
Volume: 19
Issue number: 1
DOI: 10.1093/bioinformatics/19.1.22
State: Published - Jan 1 2003

Fingerprint

Divide and conquer
Fragment
Overlap
Layout
DNA Sequence Analysis
DNA
DNA Sequencing
K-means Algorithm
Mutual Information
Sequencing

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

A divide-and-conquer approach to fragment assembly. / Otu, Hasan H.; Sayood, Khalid.

In: Bioinformatics, Vol. 19, No. 1, 01.01.2003, p. 22-29.

@article{3c446682a93c4ab1b88f8baad48b1417,
title = "A divide-and-conquer approach to fragment assembly",
abstract = "Motivation: One of the major problems in DNA sequencing is assembling the fragments obtained by shotgun sequencing. Most existing fragment assembly techniques follow the overlap-layout-consensus approach. This framework requires extensive computation in each phase and becomes inefficient with increasing number of fragments. Results: We propose a new algorithm which solves the overlap, layout, and consensus phases simultaneously. The fragments are clustered with respect to their Average Mutual Information (AMI) profiles using the k-means algorithm. This removes the unnecessary burden of considering the collection of fragments as a whole. Instead, the orientation and overlap detection are solved efficiently, within the clusters. The algorithm has successfully reconstructed both artificial and real data.",
author = "Otu, Hasan H. and Sayood, Khalid",
year = "2003",
month = "1",
day = "1",
doi = "10.1093/bioinformatics/19.1.22",
language = "English (US)",
volume = "19",
pages = "22--29",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "1",
}

TY  - JOUR
T1  - A divide-and-conquer approach to fragment assembly
AU  - Otu, Hasan H.
AU  - Sayood, Khalid
PY  - 2003/1/1
Y1  - 2003/1/1
N2  - Motivation: One of the major problems in DNA sequencing is assembling the fragments obtained by shotgun sequencing. Most existing fragment assembly techniques follow the overlap-layout-consensus approach. This framework requires extensive computation in each phase and becomes inefficient with increasing number of fragments. Results: We propose a new algorithm which solves the overlap, layout, and consensus phases simultaneously. The fragments are clustered with respect to their Average Mutual Information (AMI) profiles using the k-means algorithm. This removes the unnecessary burden of considering the collection of fragments as a whole. Instead, the orientation and overlap detection are solved efficiently, within the clusters. The algorithm has successfully reconstructed both artificial and real data.
AB  - Motivation: One of the major problems in DNA sequencing is assembling the fragments obtained by shotgun sequencing. Most existing fragment assembly techniques follow the overlap-layout-consensus approach. This framework requires extensive computation in each phase and becomes inefficient with increasing number of fragments. Results: We propose a new algorithm which solves the overlap, layout, and consensus phases simultaneously. The fragments are clustered with respect to their Average Mutual Information (AMI) profiles using the k-means algorithm. This removes the unnecessary burden of considering the collection of fragments as a whole. Instead, the orientation and overlap detection are solved efficiently, within the clusters. The algorithm has successfully reconstructed both artificial and real data.
UR  - http://www.scopus.com/inward/record.url?scp=0037248694&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0037248694&partnerID=8YFLogxK
U2  - 10.1093/bioinformatics/19.1.22
DO  - 10.1093/bioinformatics/19.1.22
M3  - Article
C2  - 12499289
AN  - SCOPUS:0037248694
VL  - 19
SP  - 22
EP  - 29
JO  - Bioinformatics
JF  - Bioinformatics
SN  - 1367-4803
IS  - 1
ER  -