Matchmaking: A new MapReduce scheduling technique

Chen He, Ying Lu, David Swanson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

66 Citations (Scopus)

Abstract

MapReduce is a powerful platform for large-scale data processing. To achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission by enhancing data locality (placing tasks on nodes that contain their input data). This paper develops a new MapReduce scheduling technique to enhance map tasks' data locality. We have integrated this technique into Hadoop's default FIFO scheduler and the Hadoop fair scheduler. To evaluate our technique, we compare MapReduce scheduling algorithms with and without it, and also against an existing data locality enhancement technique (i.e., the delay algorithm developed by Facebook). Experimental results show that our technique often leads to the highest data locality rate and the lowest response time for map tasks. Furthermore, unlike the delay algorithm, it does not require an intricate parameter tuning process.
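
As a rough illustration of what the abstract means by data locality and data locality rate, the sketch below assigns pending map tasks to nodes with free slots, preferring a task whose input block is stored on the requesting node. It is not the matchmaking algorithm from the paper, whose mechanics the abstract does not describe, and it is not Hadoop's scheduler API. All names (`pick_map_task`, `simulate`, `locality_rate`, the node and task labels) are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

Assignment = Tuple[str, str, bool]  # (node, task, ran_on_node_holding_input)


def pick_map_task(
    node: str, pending: List[str], block_locations: Dict[str, Set[str]]
) -> Tuple[Optional[str], bool]:
    """Choose a map task for a node with a free slot (illustrative only).

    Prefers the first pending task whose input block is stored on `node`;
    otherwise falls back to the head of the queue, a non-local assignment.
    """
    for task in pending:
        if node in block_locations.get(task, set()):
            return task, True
    if pending:
        return pending[0], False
    return None, False


def simulate(
    free_slots: List[str], pending: List[str], block_locations: Dict[str, Set[str]]
) -> List[Assignment]:
    """Serve free slots in order and record whether each assignment was local."""
    queue = list(pending)
    assignments: List[Assignment] = []
    for node in free_slots:
        task, is_local = pick_map_task(node, queue, block_locations)
        if task is None:
            break  # nothing left to schedule
        queue.remove(task)
        assignments.append((node, task, is_local))
    return assignments


def locality_rate(assignments: List[Assignment]) -> float:
    """Fraction of assigned map tasks that ran on a node holding their input."""
    if not assignments:
        return 0.0
    return sum(1 for _, _, is_local in assignments if is_local) / len(assignments)


# Hypothetical cluster state: which nodes store each task's input block.
block_locations = {"t1": {"n1", "n2"}, "t2": {"n3"}, "t3": {"n1"}}
assignments = simulate(["n3", "n1", "n2"], ["t1", "t2", "t3"], block_locations)
rate = locality_rate(assignments)
```

In this toy run, `n3` and `n1` each receive a task whose input they store, while `n2` has to take `t3` without holding its input, so two of the three assignments are local. A scheduler that always falls back immediately, as this sketch does, trades locality for utilization; how long to wait before falling back is the kind of parameter the abstract says the delay algorithm needs tuned.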

Original language: English (US)
Title of host publication: Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011
Pages: 40-47
Number of pages: 8
DOIs: https://doi.org/10.1109/CloudCom.2011.16
State: Published - Dec 1 2011
Event: 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011 - Athens, Greece
Duration: Nov 29 2011 to Dec 1 2011

Publication series

Name: Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011

Conference

Conference: 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011
Country: Greece
City: Athens
Period: 11/29/11 to 12/1/11

Fingerprint

Scheduling
Scheduling algorithms
Data communication systems
Tuning

Keywords

  • Data locality
  • Hadoop
  • MapReduce
  • Scheduling technique

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications

Cite this

He, C., Lu, Y., & Swanson, D. (2011). Matchmaking: A new MapReduce scheduling technique. In Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011 (pp. 40-47). [6133125] (Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011). https://doi.org/10.1109/CloudCom.2011.16

Matchmaking: A new MapReduce scheduling technique. / He, Chen; Lu, Ying; Swanson, David.

Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011. 2011. p. 40-47 6133125 (Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011).


He, C, Lu, Y & Swanson, D 2011, Matchmaking: A new MapReduce scheduling technique. in Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, 6133125, Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, pp. 40-47, 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, Athens, Greece, 11/29/11. https://doi.org/10.1109/CloudCom.2011.16
He C, Lu Y, Swanson D. Matchmaking: A new MapReduce scheduling technique. In Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011. 2011. p. 40-47. 6133125. (Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011). https://doi.org/10.1109/CloudCom.2011.16
He, Chen; Lu, Ying; Swanson, David. / Matchmaking: A new MapReduce scheduling technique. Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011. 2011. pp. 40-47 (Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011).
@inproceedings{05eb4655e126418bacee74b35fba8e0f,
title = "Matchmaking: A new MapReduce scheduling technique",
abstract = "MapReduce is a powerful platform for large-scale data processing. To achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission by enhancing the data locality (placing tasks on nodes that contain their input data). This paper develops a new MapReduce scheduling technique to enhance map task's data locality. We have integrated this technique into Hadoop default FIFO scheduler and Hadoop fair scheduler. To evaluate our technique, we compare not only MapReduce scheduling algorithms with and without our technique but also with an existing data locality enhancement technique (i.e., the delay algorithm developed by Facebook). Experimental results show that our technique often leads to the highest data locality rate and the lowest response time for map tasks. Furthermore, unlike the delay algorithm, it does not require an intricate parameter tuning process.",
keywords = "Data locality, Hadoop, MapReduce, Scheduling technique",
author = "Chen He and Ying Lu and David Swanson",
year = "2011",
month = "12",
day = "1",
doi = "10.1109/CloudCom.2011.16",
language = "English (US)",
isbn = "9780769546223",
series = "Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011",
pages = "40--47",
booktitle = "Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011",

}

TY - GEN

T1 - Matchmaking

T2 - A new MapReduce scheduling technique

AU - He, Chen

AU - Lu, Ying

AU - Swanson, David

PY - 2011/12/1

Y1 - 2011/12/1

N2 - MapReduce is a powerful platform for large-scale data processing. To achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission by enhancing the data locality (placing tasks on nodes that contain their input data). This paper develops a new MapReduce scheduling technique to enhance map task's data locality. We have integrated this technique into Hadoop default FIFO scheduler and Hadoop fair scheduler. To evaluate our technique, we compare not only MapReduce scheduling algorithms with and without our technique but also with an existing data locality enhancement technique (i.e., the delay algorithm developed by Facebook). Experimental results show that our technique often leads to the highest data locality rate and the lowest response time for map tasks. Furthermore, unlike the delay algorithm, it does not require an intricate parameter tuning process.


KW - Data locality

KW - Hadoop

KW - MapReduce

KW - Scheduling technique

UR - http://www.scopus.com/inward/record.url?scp=84863180724&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863180724&partnerID=8YFLogxK

U2 - 10.1109/CloudCom.2011.16

DO - 10.1109/CloudCom.2011.16

M3 - Conference contribution

AN - SCOPUS:84863180724

SN - 9780769546223

T3 - Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011

SP - 40

EP - 47

BT - Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011

ER -