Implications of data placement strategy to Big Data technologies based on shared-nothing architecture for geosciences

Kwo Sen Kuo, Amidu Oloso, Khoa Doan, Thomas L. Clune, Hongfeng Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

It is found that data placement on the networked nodes of a cluster based on the shared-nothing architecture (SNA) should align in the physical (i.e. spatiotemporal) space for most geoscience Big Data analysis systems in order to minimize data movements and thus achieve optimal performance and efficiency. This is due to the fact that data analysis in geosciences predominantly requires spatiotemporal coincidence. If individual datasets are considered separately in their placement on the cluster nodes, these systems often have to move data between nodes when an analysis involves two or more datasets. In this paper, we first report our discoveries from a data placement alignment experiment with two Big Data technologies, SciDB and Spark+HDFS, and then elucidate some of the far-reaching implications of this discovery.
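The placement argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' experiment and uses neither SciDB nor Spark; the grid resolution, node count, node-assignment formula, and sample records are all hypothetical. It places two small datasets on nodes either independently (round robin in each dataset's own record order) or aligned (node chosen from a shared spatiotemporal cell), then counts the coincident record pairs that end up on different nodes and would therefore require data movement.

```python
NUM_NODES = 4  # hypothetical cluster size


def cell_of(lat, lon, hour, deg=10.0, hours=6):
    """Map a sample to a coarse spatiotemporal cell (hypothetical gridding)."""
    return (int((lat + 90) // deg), int((lon + 180) // deg), int(hour // hours))


def aligned_node(cell):
    """Aligned placement: the node depends only on the cell, never on the dataset."""
    i, j, k = cell
    return (i * 73 + j * 37 + k) % NUM_NODES


def place(records, strategy):
    """Return the node assigned to each record under the given strategy."""
    if strategy == "aligned":
        return [aligned_node(cell_of(*r)) for r in records]
    # Independent placement: round robin in this dataset's own record order.
    return [idx % NUM_NODES for idx in range(len(records))]


def cross_node_pairs(a, b, strategy):
    """Count coincident (same-cell) record pairs that live on different nodes."""
    nodes_a, nodes_b = place(a, strategy), place(b, strategy)
    moves = 0
    for rec_a, node_a in zip(a, nodes_a):
        for rec_b, node_b in zip(b, nodes_b):
            if cell_of(*rec_a) == cell_of(*rec_b) and node_a != node_b:
                moves += 1
    return moves


# Two made-up datasets of (lat, lon, hour) samples covering the same four
# cells, but stored in different record orders.
dataset_a = [(-45.0, 10.0, 0), (5.0, 100.0, 6), (35.0, -120.0, 12), (60.0, 30.0, 18)]
dataset_b = [(62.0, 33.0, 20), (-41.0, 12.0, 3), (38.0, -118.0, 13), (2.0, 104.0, 7)]

if __name__ == "__main__":
    for strategy in ("independent", "aligned"):
        print(strategy, cross_node_pairs(dataset_a, dataset_b, strategy))
```

Under aligned placement the count is zero by construction, because any two records in the same cell map to the same node regardless of which dataset they come from. Independent placement gives no such guarantee, which is the source of the inter-node movement the abstract describes.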

Original language: English (US)
Title of host publication: 2016 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7605-7607
Number of pages: 3
ISBN (Electronic): 9781509033324
DOI: https://doi.org/10.1109/IGARSS.2016.7730983
State: Published - Nov 1 2016
Event: 36th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016 - Beijing, China
Duration: Jul 10 2016 to Jul 15 2016

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)
Volume: 2016-November

Other

Other: 36th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016
Country: China
City: Beijing
Period: 7/10/16 to 7/15/16

Keywords

  • Big Data
  • data placement
  • geoscience
  • shared-nothing architecture

ASJC Scopus subject areas

  • Computer Science Applications
  • Earth and Planetary Sciences (all)

Cite this

Kuo, K. S., Oloso, A., Doan, K., Clune, T. L., & Yu, H. (2016). Implications of data placement strategy to Big Data technologies based on shared-nothing architecture for geosciences. In 2016 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016 - Proceedings (pp. 7605-7607). [7730983] (International Geoscience and Remote Sensing Symposium (IGARSS); Vol. 2016-November). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IGARSS.2016.7730983

@inproceedings{99a9ab2caa5e4677996b937b20738009,
title = "Implications of data placement strategy to {Big Data} technologies based on shared-nothing architecture for geosciences",
abstract = "It is found that data placement on the networked nodes of a cluster based on the shared-nothing architecture (SNA) should align in the physical (i.e. spatiotemporal) space for most geoscience Big Data analysis systems in order to minimize data movements and thus achieve optimal performance and efficiency. This is due to the fact that data analysis in geosciences predominantly requires spatiotemporal coincidence. If individual datasets are considered separately in their placement on the cluster nodes, these systems often have to move data between nodes when an analysis involves two or more datasets. In this paper, we first report our discoveries from a data placement alignment experiment with two Big Data technologies, SciDB and Spark+HDFS, and then elucidate some of the far-reaching implications of this discovery.",
keywords = "Big Data, data placement, geoscience, shared-nothing architecture",
author = "Kuo, {Kwo Sen} and Amidu Oloso and Khoa Doan and Clune, {Thomas L.} and Hongfeng Yu",
year = "2016",
month = "11",
day = "1",
doi = "10.1109/IGARSS.2016.7730983",
language = "English (US)",
series = "International Geoscience and Remote Sensing Symposium (IGARSS)",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "7605--7607",
booktitle = "2016 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016 - Proceedings",
isbn = "9781509033324",
}

TY  - GEN
T1  - Implications of data placement strategy to Big Data technologies based on shared-nothing architecture for geosciences
AU  - Kuo, Kwo Sen
AU  - Oloso, Amidu
AU  - Doan, Khoa
AU  - Clune, Thomas L.
AU  - Yu, Hongfeng
PY  - 2016/11/1
Y1  - 2016/11/1
AB  - It is found that data placement on the networked nodes of a cluster based on the shared-nothing architecture (SNA) should align in the physical (i.e. spatiotemporal) space for most geoscience Big Data analysis systems in order to minimize data movements and thus achieve optimal performance and efficiency. This is due to the fact that data analysis in geosciences predominantly requires spatiotemporal coincidence. If individual datasets are considered separately in their placement on the cluster nodes, these systems often have to move data between nodes when an analysis involves two or more datasets. In this paper, we first report our discoveries from a data placement alignment experiment with two Big Data technologies, SciDB and Spark+HDFS, and then elucidate some of the far-reaching implications of this discovery.
KW  - Big Data
KW  - data placement
KW  - geoscience
KW  - shared-nothing architecture
UR  - http://www.scopus.com/inward/record.url?scp=85007470548&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85007470548&partnerID=8YFLogxK
U2  - 10.1109/IGARSS.2016.7730983
DO  - 10.1109/IGARSS.2016.7730983
M3  - Conference contribution
AN  - SCOPUS:85007470548
T3  - International Geoscience and Remote Sensing Symposium (IGARSS)
SP  - 7605
EP  - 7607
BT  - 2016 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016 - Proceedings
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  - 