Real-time scheduling in MapReduce clusters

Chen He, Ying Lu, David R Swanson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

MapReduce is widely used as a Big Data processing platform, and as its popularity grows, its scheduling becomes increasingly important. In particular, because many MapReduce applications require real-time data processing, scheduling real-time applications in MapReduce environments has become a significant problem. In this paper, we present a novel real-time scheduler for MapReduce that overcomes the deficiencies of an existing scheduler: it avoids accepting jobs that would lead to deadline misses and improves cluster utilization. We implement our scheduler in Hadoop, and experimental results show that it provides deadline guarantees for accepted jobs and achieves good cluster utilization.

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
Publisher: IEEE Computer Society
Pages: 1536-1544
Number of pages: 9
ISBN (Print): 9780769550886
DOI: https://doi.org/10.1109/HPCC.and.EUC.2013.216
State: Published - Jan 1 2014
Event: 15th IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 11th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2013 - Zhangjiajie, Hunan, China
Duration: Nov 13 2013 to Nov 15 2013

Publication series

Name: Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013

Conference

Conference: 15th IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 11th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2013
Country: China
City: Zhangjiajie, Hunan
Period: 11/13/13 to 11/15/13

Fingerprint

Scheduling
Big data

Keywords

  • MapReduce
  • cluster utilization
  • real-time scheduling

ASJC Scopus subject areas

  • Software

Cite this

He, C., Lu, Y., & Swanson, D. R. (2014). Real-time scheduling in MapReduce clusters. In Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013 (pp. 1536-1544). [6832098] (Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013). IEEE Computer Society. https://doi.org/10.1109/HPCC.and.EUC.2013.216

@inproceedings{228a440843214e708b4bb4092dac40b3,
title = "Real-time scheduling in MapReduce clusters",
abstract = "MapReduce has been widely used as a Big Data processing platform. As it gets popular, its scheduling becomes increasingly important. In particular, since many MapReduce applications require real-time data processing, scheduling real time applications in MapReduce environments has become a significant problem. In this paper, we create a novel real-time scheduler for MapReduce, which overcomes the deficiencies of an existing scheduler. It avoids accepting jobs that will lead to deadline misses and improves the cluster utilization. We implement our scheduler in Hadoop system and experimental results show that our scheduler provides deadline guarantees for accepted jobs and achieves good cluster utilization.",
keywords = "MapReduce, cluster utilization, real-time scheduling",
author = "Chen He and Ying Lu and Swanson, {David R}",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/HPCC.and.EUC.2013.216",
language = "English (US)",
isbn = "9780769550886",
series = "Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013",
publisher = "IEEE Computer Society",
pages = "1536--1544",
booktitle = "Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013",

}

TY - GEN
T1 - Real-time scheduling in MapReduce clusters
AU - He, Chen
AU - Lu, Ying
AU - Swanson, David R
PY - 2014/1/1
Y1 - 2014/1/1
N2 - MapReduce has been widely used as a Big Data processing platform. As it gets popular, its scheduling becomes increasingly important. In particular, since many MapReduce applications require real-time data processing, scheduling real time applications in MapReduce environments has become a significant problem. In this paper, we create a novel real-time scheduler for MapReduce, which overcomes the deficiencies of an existing scheduler. It avoids accepting jobs that will lead to deadline misses and improves the cluster utilization. We implement our scheduler in Hadoop system and experimental results show that our scheduler provides deadline guarantees for accepted jobs and achieves good cluster utilization.
AB - MapReduce has been widely used as a Big Data processing platform. As it gets popular, its scheduling becomes increasingly important. In particular, since many MapReduce applications require real-time data processing, scheduling real time applications in MapReduce environments has become a significant problem. In this paper, we create a novel real-time scheduler for MapReduce, which overcomes the deficiencies of an existing scheduler. It avoids accepting jobs that will lead to deadline misses and improves the cluster utilization. We implement our scheduler in Hadoop system and experimental results show that our scheduler provides deadline guarantees for accepted jobs and achieves good cluster utilization.
KW - MapReduce
KW - cluster utilization
KW - real-time scheduling
UR - http://www.scopus.com/inward/record.url?scp=84903973687&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903973687&partnerID=8YFLogxK
U2 - 10.1109/HPCC.and.EUC.2013.216
DO - 10.1109/HPCC.and.EUC.2013.216
M3 - Conference contribution
SN - 9780769550886
T3 - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
SP - 1536
EP - 1544
BT - Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013
PB - IEEE Computer Society
ER -