Focus: A new multilayer graph model for short read analysis and extraction of biologically relevant features

Julia Warnke, Hesham H Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

With the increasing number of applications in which a group of organisms associated with a common environment is sequenced, there is an urgent need for a new model for representing the sequenced short reads in a way that takes the nature of these organisms into consideration. In addition to facilitating the assembly process, such models should allow for easy extraction of other useful biological information from the short reads, including conserved regions among the input genomes, sequence motifs, and other information critical to the recognition and/or classification of the organisms. We present Focus, a new multilayer graph model for short read analysis and extraction of biologically relevant features. The proposed model can be viewed as a data-mining tool that takes advantage of the multilayer graph representation of the reads to extract useful information about the associated genomes/organisms. Although Focus is not primarily an assembly tool, we assessed it using known assemblers, with excellent results. We also applied Focus in a case study on an HIV read dataset and successfully extracted biologically relevant graph features.

Original language: English (US)
Title of host publication: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 489-498
Number of pages: 10
ISBN (Electronic): 9781450328944
DOI: https://doi.org/10.1145/2649387.2649434
State: Published - Sep 20 2014
Event: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States
Duration: Sep 20 2014 to Sep 23 2014

Publication series

Name: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Conference

Conference: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014
Country: United States
City: Newport Beach
Period: 9/20/14 to 9/23/14

Fingerprint

Data Mining
Genomics
Multilayers
HIV
Genome
Genes
Datasets

Keywords

  • Data-mining
  • Graph modeling
  • Metagenomics
  • Next generation sequencing

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Software
  • Biomedical Engineering

Cite this

Warnke, J., & Ali, H. H. (2014). Focus: A new multilayer graph model for short read analysis and extraction of biologically relevant features. In ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 489-498). (ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics). Association for Computing Machinery, Inc. https://doi.org/10.1145/2649387.2649434

@inproceedings{27bc4305c148426db4997b4ccdcda177,
title = "Focus: A new multilayer graph model for short read analysis and extraction of biologically relevant features",
abstract = "With the increasing number of applications in which a group of organisms associated with a common environment is sequenced, there is an urgent need for a new model for representing the sequenced short reads in a way that takes the nature of these organisms into consideration. In addition to facilitating the assembly process, such models should allow for easy extraction of other useful biological information from the short reads, including conserved regions among the input genomes, sequence motifs, and other information critical to the recognition and/or classification of the organisms. We present Focus, a new multilayer graph model for short read analysis and extraction of biologically relevant features. The proposed model can be viewed as a data-mining tool that takes advantage of the multilayer graph representation of the reads to extract useful information about the associated genomes/organisms. Although Focus is not primarily an assembly tool, we assessed it using known assemblers, with excellent results. We also applied Focus in a case study on an HIV read dataset and successfully extracted biologically relevant graph features.",
keywords = "Data-mining, Graph modeling, Metagenomics, Next generation sequencing",
author = "Julia Warnke and Ali, {Hesham H}",
year = "2014",
month = "9",
day = "20",
doi = "10.1145/2649387.2649434",
language = "English (US)",
series = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",
publisher = "Association for Computing Machinery, Inc",
pages = "489--498",
booktitle = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",

}

TY - GEN

T1 - Focus

T2 - A new multilayer graph model for short read analysis and extraction of biologically relevant features

AU - Warnke, Julia

AU - Ali, Hesham H

PY - 2014/9/20

Y1 - 2014/9/20

AB - With the increasing number of applications in which a group of organisms associated with a common environment is sequenced, there is an urgent need for a new model for representing the sequenced short reads in a way that takes the nature of these organisms into consideration. In addition to facilitating the assembly process, such models should allow for easy extraction of other useful biological information from the short reads, including conserved regions among the input genomes, sequence motifs, and other information critical to the recognition and/or classification of the organisms. We present Focus, a new multilayer graph model for short read analysis and extraction of biologically relevant features. The proposed model can be viewed as a data-mining tool that takes advantage of the multilayer graph representation of the reads to extract useful information about the associated genomes/organisms. Although Focus is not primarily an assembly tool, we assessed it using known assemblers, with excellent results. We also applied Focus in a case study on an HIV read dataset and successfully extracted biologically relevant graph features.

KW - Data-mining

KW - Graph modeling

KW - Metagenomics

KW - Next generation sequencing

UR - http://www.scopus.com/inward/record.url?scp=84920747722&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84920747722&partnerID=8YFLogxK

U2 - 10.1145/2649387.2649434

DO - 10.1145/2649387.2649434

M3 - Conference contribution

AN - SCOPUS:84920747722

T3 - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

SP - 489

EP - 498

BT - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

PB - Association for Computing Machinery, Inc

ER -