The development of parallel adaptive sampling algorithms for analyzing biological networks

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The availability of biological data at massive scale continues to present unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques, and efficient parallel computational methods to implement them, will be crucial in extracting useful knowledge from this raw, unprocessed data, for example in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise in biological correlation networks, thereby making it possible to find biologically relevant clusters in the filtered network. We show that selecting the appropriate filter is crucial for maintaining the key structures of the original networks and for uncovering new ones after noisy relationships are removed. We also conduct one of the first comparisons of two important sensitivity criteria: perturbation due to the vertex numbering of the network and perturbation due to data distribution. We demonstrate that our chordal-graph-based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.
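The abstract does not spell out the sampling procedure, so the following is only a minimal sequential sketch of one way a chordal-subgraph filter can work, not the authors' parallel implementation. It grows the subgraph one vertex at a time and joins each new vertex only to a set of already-placed neighbours that form a clique, which keeps the result chordal; it does not guarantee a maximum chordal subgraph. The function name `chordal_subgraph`, the adjacency-dictionary input, and the `order` parameter are assumptions made for illustration.

```python
def chordal_subgraph(adj, order=None):
    """Return a set of edges (frozensets) forming a chordal subgraph of `adj`.

    adj   : dict mapping each vertex to the set of its neighbours (undirected).
    order : optional vertex ordering used to break ties (hypothetical parameter,
            included to show how vertex numbering can change the result).
    """
    order = list(order) if order is not None else sorted(adj)
    rank = {v: i for i, v in enumerate(order)}

    # candidates[v]: already-placed neighbours of v that form a clique
    # in the subgraph built so far.
    candidates = {v: set() for v in order}
    remaining = set(order)
    kept = set()

    while remaining:
        # Pick the vertex with the largest candidate clique; ties go to
        # the vertex that comes first in `order`.
        v = min(remaining, key=lambda u: (-len(candidates[u]), rank[u]))
        remaining.remove(v)

        # Join v to its candidate clique. Its earlier neighbours are a
        # clique, so no chordless cycle can be created.
        for c in candidates[v]:
            kept.add(frozenset((v, c)))

        # v may join the candidate set of an unplaced neighbour u only if
        # everything already in candidates[u] is adjacent to v in the subgraph.
        for u in adj[v] & remaining:
            if candidates[u] <= candidates[v]:
                candidates[u].add(v)

    return kept


if __name__ == "__main__":
    # A chordless 4-cycle: one edge has to be dropped.
    cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
    for ordering in (None, ["c", "d", "a", "b"]):
        edges = chordal_subgraph(cycle, order=ordering)
        print(ordering, sorted(tuple(sorted(e)) for e in edges))
```

Running the sketch on the 4-cycle with two different orderings drops a different edge each time, which is a small-scale version of the sensitivity to vertex numbering that the abstract says the paper evaluates.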

Original language: English (US)
Title of host publication: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Pages: 725-734
Number of pages: 10
DOIs: https://doi.org/10.1109/IPDPSW.2012.90
State: Published - Oct 18 2012
Event: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012 - Shanghai, China
Duration: May 21 2012 to May 25 2012

Publication series

Name: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012

Conference

Conference: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Country: China
City: Shanghai
Period: 5/21/12 to 5/25/12

Fingerprint

Sampling
Bioinformatics
Computational methods
Data mining
Genes
Availability

Keywords

  • chordal graphs
  • cluster overlap
  • correlation networks
  • edge enrichment
  • ordering

ASJC Scopus subject areas

  • Software

Cite this

Cooper, K. M., Duraisamy, K., Bhowmick, S., & Ali, H. H. (2012). The development of parallel adaptive sampling algorithms for analyzing biological networks. In Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012 (pp. 725-734). [6270712] (Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012). https://doi.org/10.1109/IPDPSW.2012.90

@inproceedings{7ad7fb68f55141169456204a31075341,
title = "The development of parallel adaptive sampling algorithms for analyzing biological networks",
abstract = "The availability of biological data at massive scale continues to present unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques, and efficient parallel computational methods to implement them, will be crucial in extracting useful knowledge from this raw, unprocessed data, for example in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise in biological correlation networks, thereby making it possible to find biologically relevant clusters in the filtered network. We show that selecting the appropriate filter is crucial for maintaining the key structures of the original networks and for uncovering new ones after noisy relationships are removed. We also conduct one of the first comparisons of two important sensitivity criteria: perturbation due to the vertex numbering of the network and perturbation due to data distribution. We demonstrate that our chordal-graph-based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.",
keywords = "chordal graphs, cluster overlap, correlation networks, edge enrichment, ordering",
author = "Cooper, {Kathryn M} and Kanimathi Duraisamy and Sanjukta Bhowmick and Ali, {Hesham H}",
year = "2012",
month = "10",
day = "18",
doi = "10.1109/IPDPSW.2012.90",
language = "English (US)",
isbn = "9780769546766",
series = "Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012",
pages = "725--734",
booktitle = "Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012",

}

TY - GEN

T1 - The development of parallel adaptive sampling algorithms for analyzing biological networks

AU - Cooper, Kathryn M

AU - Duraisamy, Kanimathi

AU - Bhowmick, Sanjukta

AU - Ali, Hesham H

PY - 2012/10/18

Y1 - 2012/10/18

AB - The availability of biological data at massive scale continues to present unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques, and efficient parallel computational methods to implement them, will be crucial in extracting useful knowledge from this raw, unprocessed data, for example in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise in biological correlation networks, thereby making it possible to find biologically relevant clusters in the filtered network. We show that selecting the appropriate filter is crucial for maintaining the key structures of the original networks and for uncovering new ones after noisy relationships are removed. We also conduct one of the first comparisons of two important sensitivity criteria: perturbation due to the vertex numbering of the network and perturbation due to data distribution. We demonstrate that our chordal-graph-based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.

KW - chordal graphs

KW - cluster overlap

KW - correlation networks

KW - edge enrichment

KW - ordering

UR - http://www.scopus.com/inward/record.url?scp=84867415020&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867415020&partnerID=8YFLogxK

U2 - 10.1109/IPDPSW.2012.90

DO - 10.1109/IPDPSW.2012.90

M3 - Conference contribution

SN - 9780769546766

T3 - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012

SP - 725

EP - 734

BT - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012

ER -