On the tradeoff between speedup and energy consumption in high performance computing - A bioinformatics case study

Sachin Pawaskar, Hesham H Ali

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

High performance computing has been very useful to researchers in bioinformatics, medicine, and related fields. The bioinformatics domain is rich in applications that require extracting useful information from very large and continuously growing sequence databases. Automated techniques such as DNA sequencers, DNA microarrays, and others are continually growing the datasets stored in large public databases such as GenBank and the Protein Data Bank. Most methods used for analyzing genetic and protein data have been found to be extremely computationally intensive, which motivates the use of powerful computers or systems with high throughput characteristics. In this paper, we provide a case study of one such bioinformatics application, BLAT, running in a high performance computing environment. We use sequences gathered from researchers and parallelize the runs to study the performance characteristics under three different query and data partitioning models. This research highlights the need to develop a parallel model carefully, with energy awareness in mind: the model should be based on an understanding of the application and designed to work well for the specific application and domain. We found that the BLAT program is highly parallelizable and that a high degree of speedup is achievable. The experiments suggest that the speedup depends on the model used for query and database segmentation.
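The tradeoff named in the title can be illustrated with a minimal sketch. It is not the authors' measurement method: the energy model (nodes × average node power × runtime), the `Run` type, and all numbers below are assumptions chosen only for illustration, and none of them are results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Run:
    nodes: int            # number of compute nodes used (hypothetical)
    runtime_s: float      # wall-clock time of the whole job, in seconds
    node_power_w: float   # assumed average power draw per node, in watts


def speedup(baseline: Run, run: Run) -> float:
    """Classic speedup: baseline runtime divided by parallel runtime."""
    return baseline.runtime_s / run.runtime_s


def efficiency(baseline: Run, run: Run) -> float:
    """Speedup divided by the factor by which the node count grew."""
    return speedup(baseline, run) / (run.nodes / baseline.nodes)


def energy_j(run: Run) -> float:
    """Simplified energy estimate in joules: nodes * watts * seconds."""
    return run.nodes * run.node_power_w * run.runtime_s


# Placeholder numbers, not taken from the paper.
serial = Run(nodes=1, runtime_s=1000.0, node_power_w=200.0)
parallel = Run(nodes=8, runtime_s=200.0, node_power_w=200.0)

print(speedup(serial, parallel))              # 5.0
print(efficiency(serial, parallel))           # 0.625
print(energy_j(serial), energy_j(parallel))   # 200000.0 320000.0
```

With these placeholder values the parallel run finishes five times faster but, under the assumed model, uses more total energy than the single-node run, because the speedup is smaller than the increase in node count.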

Original language: English (US)
Title of host publication: Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010
Pages: 218-225
Number of pages: 8
State: Published - Jul 20 2010
Event: 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010 - Innsbruck, Austria
Duration: Feb 16 2010 to Feb 18 2010

Publication series

Name: Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010

Conference

Conference: 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010
Country: Austria
City: Innsbruck
Period: 2/16/10 to 2/18/10

Fingerprint

Bioinformatics
Energy utilization
DNA
Proteins
Microarrays
Throughput
Experiments

Keywords

  • Bioinformatics
  • Energy awareness
  • High performance computing
  • Parallel processing
  • Sequence comparisons

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Software

Cite this

Pawaskar, S., & Ali, H. H. (2010). On the tradeoff between speedup and energy consumption in high performance computing - A bioinformatics case study. In Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010 (pp. 218-225). (Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010).

@inproceedings{face7e8e100345aeb4041d934c27ac0d,
title = "On the tradeoff between speedup and energy consumption in high performance computing - A bioinformatics case study",
keywords = "Bioinformatics, Energy awareness, High performance computing, Parallel processing, Sequence comparisons",
author = "Sachin Pawaskar and Ali, {Hesham H}",
year = "2010",
month = "7",
day = "20",
language = "English (US)",
isbn = "9780889868205",
series = "Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010",
pages = "218--225",
booktitle = "Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010",

}

TY - GEN

T1 - On the tradeoff between speedup and energy consumption in high performance computing - A bioinformatics case study

AU - Pawaskar, Sachin

AU - Ali, Hesham H

PY - 2010/7/20

Y1 - 2010/7/20

KW - Bioinformatics

KW - Energy awareness

KW - High performance computing

KW - Parallel processing

KW - Sequence comparisons

UR - http://www.scopus.com/inward/record.url?scp=77954606465&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954606465&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:77954606465

SN - 9780889868205

T3 - Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010

SP - 218

EP - 225

BT - Proceedings of the 9th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2010

ER -