Holistic analysis of multi-source, multi-feature data: Modeling and computation challenges

Abhishek Santra, Sanjukta Bhowmick

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

As a result of our increased ability to collect data from different sources, many real-world datasets are increasingly becoming multi-featured, and these features can also be of different types. Examples of such multi-feature data include different modes of interaction among people (Facebook, Twitter, LinkedIn, ...) or traffic accidents associated with diverse factors (speed, light conditions, weather, ...). Efficiently modeling and analyzing these complex datasets to obtain actionable knowledge presents several challenges. Traditional approaches, such as using single-layer networks (or monoplexes), may not be sufficient or appropriate for modeling and computational scalability. Recently, multiplexes have been proposed for the elegant handling of such data. In this position paper, we elaborate on different types of multiplexes (homogeneous, heterogeneous, and hybrid) for modeling different types of data. The benefits of this modeling in terms of ease, understanding, and usage are highlighted. However, this model brings with it a new set of challenges for its analysis. The bulk of the paper discusses these challenges and the advantages of using this approach. With the right tools, both computation and storage can be reduced in addition to accommodating scalability.

Original language: English (US)
Title of host publication: Big Data Analytics - 5th International Conference, BDA 2017, Proceedings
Editors: Ashish Sureka, Sharma Chakravarthy, P. Krishna Reddy, Subhash Bhalla
Publisher: Springer Verlag
Pages: 59-68
Number of pages: 10
ISBN (Print): 9783319724126
DOIs: https://doi.org/10.1007/978-3-319-72413-3_4
State: Published - Jan 1 2017
Event: 5th International Conference on Big Data Analytics, BDA 2017 - Hyderabad, India
Duration: Dec 12 2017 → Dec 15 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10721 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Big Data Analytics, BDA 2017
Country: India
City: Hyderabad
Period: 12/12/17 → 12/15/17

Fingerprint

Feature Modeling
Data Modeling
Data structures
Scalability
Highway accidents
Network layers
Modeling
Accidents
Weather
Traffic
Sufficient
Interaction

Keywords

  • Aggregation functions
  • Big data analytics
  • Graph analysis and query processing
  • Lossless composability
  • Multi-source, disparate data
  • Multiplex

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Santra, A., & Bhowmick, S. (2017). Holistic analysis of multi-source, multi-feature data: Modeling and computation challenges. In A. Sureka, S. Chakravarthy, P. K. Reddy, & S. Bhalla (Eds.), Big Data Analytics - 5th International Conference, BDA 2017, Proceedings (pp. 59-68). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10721 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-72413-3_4

Holistic analysis of multi-source, multi-feature data : Modeling and computation challenges. / Santra, Abhishek; Bhowmick, Sanjukta.

Big Data Analytics - 5th International Conference, BDA 2017, Proceedings. ed. / Ashish Sureka; Sharma Chakravarthy; P. Krishna Reddy; Subhash Bhalla. Springer Verlag, 2017. p. 59-68 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10721 LNCS).

Santra, A & Bhowmick, S 2017, Holistic analysis of multi-source, multi-feature data: Modeling and computation challenges. in A Sureka, S Chakravarthy, PK Reddy & S Bhalla (eds), Big Data Analytics - 5th International Conference, BDA 2017, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10721 LNCS, Springer Verlag, pp. 59-68, 5th International Conference on Big Data Analytics, BDA 2017, Hyderabad, India, 12/12/17. https://doi.org/10.1007/978-3-319-72413-3_4
Santra A, Bhowmick S. Holistic analysis of multi-source, multi-feature data: Modeling and computation challenges. In Sureka A, Chakravarthy S, Reddy PK, Bhalla S, editors, Big Data Analytics - 5th International Conference, BDA 2017, Proceedings. Springer Verlag. 2017. p. 59-68. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-319-72413-3_4
Santra, Abhishek ; Bhowmick, Sanjukta. / Holistic analysis of multi-source, multi-feature data : Modeling and computation challenges. Big Data Analytics - 5th International Conference, BDA 2017, Proceedings. editor / Ashish Sureka ; Sharma Chakravarthy ; P. Krishna Reddy ; Subhash Bhalla. Springer Verlag, 2017. pp. 59-68 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{ebd18f6f33e247d48b3f38ac31f97000,
title = "Holistic analysis of multi-source, multi-feature data: Modeling and computation challenges",
abstract = "As a result of our increased ability to collect data from different sources, many real-world datasets are increasingly becoming multi-featured and these features can also be of different types. Examples of such multi-feature data include different modes of interactions among people (Facebook, Twitter, LinkedIn,..) or traffic accidents associated with diverse factors (speed, light conditions, weather,..). Efficiently modeling and analyzing these complex datasets to obtain actionable knowledge presents several challenges. Traditional approaches, such as using single layer networks (or monoplexes) may not be sufficient or appropriate for modeling and computation scalability. Recently, multiplexes have been proposed for the elegant handling of such data. In this position paper, we elaborate on different types of multiplexes (homogeneous, heterogeneous and hybrid) for modeling different types of data. The benefits of this modeling in terms of ease, understanding, and usage are highlighted. However, this model brings with it a new set of challenges for its analysis. The bulk of the paper discusses these challenges and the advantages of using this approach. With the right tools, both computation and storage can be reduced in addition to accommodating scalability.",
keywords = "Aggregation functions, Big data analytics, Graph analysis and query processing, Lossless composability, Multi-source, disparate data, Multiplex",
author = "Abhishek Santra and Sanjukta Bhowmick",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-72413-3_4",
language = "English (US)",
isbn = "9783319724126",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "59--68",
editor = "Ashish Sureka and Sharma Chakravarthy and Reddy, {P. Krishna} and Subhash Bhalla",
booktitle = "Big Data Analytics - 5th International Conference, BDA 2017, Proceedings",
}

TY - GEN

T1 - Holistic analysis of multi-source, multi-feature data

T2 - Modeling and computation challenges

AU - Santra, Abhishek

AU - Bhowmick, Sanjukta

PY - 2017/1/1

Y1 - 2017/1/1

AB - As a result of our increased ability to collect data from different sources, many real-world datasets are increasingly becoming multi-featured and these features can also be of different types. Examples of such multi-feature data include different modes of interactions among people (Facebook, Twitter, LinkedIn,..) or traffic accidents associated with diverse factors (speed, light conditions, weather,..). Efficiently modeling and analyzing these complex datasets to obtain actionable knowledge presents several challenges. Traditional approaches, such as using single layer networks (or monoplexes) may not be sufficient or appropriate for modeling and computation scalability. Recently, multiplexes have been proposed for the elegant handling of such data. In this position paper, we elaborate on different types of multiplexes (homogeneous, heterogeneous and hybrid) for modeling different types of data. The benefits of this modeling in terms of ease, understanding, and usage are highlighted. However, this model brings with it a new set of challenges for its analysis. The bulk of the paper discusses these challenges and the advantages of using this approach. With the right tools, both computation and storage can be reduced in addition to accommodating scalability.

KW - Aggregation functions

KW - Big data analytics

KW - Graph analysis and query processing

KW - Lossless composability

KW - Multi-source, disparate data

KW - Multiplex

UR - http://www.scopus.com/inward/record.url?scp=85038213112&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85038213112&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-72413-3_4

DO - 10.1007/978-3-319-72413-3_4

M3 - Conference contribution

AN - SCOPUS:85038213112

SN - 9783319724126

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 59

EP - 68

BT - Big Data Analytics - 5th International Conference, BDA 2017, Proceedings

A2 - Sureka, Ashish

A2 - Chakravarthy, Sharma

A2 - Reddy, P. Krishna

A2 - Bhalla, Subhash

PB - Springer Verlag

ER -