A distributed infomap algorithm for scalable and high-quality community detection

Jianping Zeng, Hongfeng Yu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Community detection is essential to various graph analysis applications. Infomap is a graph clustering algorithm capable of achieving high-quality communities. However, it remains a very challenging problem to effectively apply Infomap on large graphs. By analyzing communication and workload patterns of Infomap and leveraging a distributed delegate partitioning and distribution method, we develop a new heuristic strategy to carefully coordinate the community constitution from the vertices of a graph in a distributed environment, and achieve the convergence of the distributed clustering algorithm. We have implemented our optimized algorithm using MPI (Message Passing Interface), which can be easily employed or extended to massively distributed computing systems. We analyze the correctness of our algorithm, and conduct an intensive experimental study to investigate the communication and computation cost of our distributed algorithm, which has not been shown in previous work. The results demonstrate the scalability and the correctness of our distributed Infomap algorithm with large-scale real-world datasets.
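
For readers unfamiliar with the objective behind Infomap, the following is a minimal sequential sketch of the two-level map equation (Rosvall and Bergstrom) that Infomap seeks to minimize. It is not the authors' distributed MPI implementation. It assumes an undirected, unweighted graph given as an edge list, and the names `plogp` and `map_equation` are hypothetical helpers chosen for illustration.

```python
import math
from collections import defaultdict


def plogp(p):
    """Return p * log2(p), with the convention 0 * log2(0) = 0."""
    return p * math.log2(p) if p > 0 else 0.0


def map_equation(edges, membership):
    """Two-level map equation L(M), in bits, for an undirected, unweighted graph.

    edges:      list of (u, v) pairs
    membership: dict mapping each vertex to its module label
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = 2.0 * len(edges)  # sum of all degrees

    # On an undirected graph the random walker's stationary visit rate
    # of a vertex is degree / (2 * number of edges).
    visit = {v: d / total for v, d in degree.items()}

    module_visit = defaultdict(float)  # total visit rate inside each module
    for v, p in visit.items():
        module_visit[membership[v]] += p

    exit_rate = defaultdict(float)  # rate at which the walker leaves each module
    for u, v in edges:
        if membership[u] != membership[v]:
            exit_rate[membership[u]] += 1.0 / total
            exit_rate[membership[v]] += 1.0 / total

    q_total = sum(exit_rate.values())
    return (
        plogp(q_total)
        - 2.0 * sum(plogp(q) for q in exit_rate.values())
        - sum(plogp(p) for p in visit.values())
        + sum(plogp(exit_rate[m] + module_visit[m]) for m in module_visit)
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {v: "a" for v in range(6)}
print(map_equation(edges, two_modules), map_equation(edges, one_module))
```

On this toy graph, splitting the two triangles into separate modules yields a shorter description length than placing all vertices in one module, which is the kind of improvement a community move is accepted for. A distributed version would additionally need to keep the per-module visit and exit rates consistent across processes, which is the coordination problem the abstract refers to.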

Original language: English (US)
Title of host publication: Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018
Publisher: Association for Computing Machinery
ISBN (Print): 9781450365109
DOI: https://doi.org/10.1145/3225058.3225137
State: Published - Aug 13 2018
Event: 47th International Conference on Parallel Processing, ICPP 2018 - Eugene, United States
Duration: Aug 14 2018 to Aug 16 2018

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 47th International Conference on Parallel Processing, ICPP 2018
Country: United States
City: Eugene
Period: 8/14/18 to 8/16/18

Fingerprint

Parallel algorithms
Clustering algorithms
Communication
Message passing
Distributed computer systems
Scalability
Costs

Keywords

  • Accuracy
  • Community detection
  • Infomap
  • Large graphs
  • Scalability

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Zeng, J., & Yu, H. (2018). A distributed infomap algorithm for scalable and high-quality community detection. In Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018 [a4] (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3225058.3225137

@inproceedings{ac7a02ddb2e84c9a8b786fbf32ab83ba,
title = "A distributed infomap algorithm for scalable and high-quality community detection",
abstract = "Community detection is essential to various graph analysis applications. Infomap is a graph clustering algorithm capable of achieving high-quality communities. However, it remains a very challenging problem to effectively apply Infomap on large graphs. By analyzing communication and workload patterns of Infomap and leveraging a distributed delegate partitioning and distribution method, we develop a new heuristic strategy to carefully coordinate the community constitution from the vertices of a graph in a distributed environment, and achieve the convergence of the distributed clustering algorithm. We have implemented our optimized algorithm using MPI (Message Passing Interface), which can be easily employed or extended to massively distributed computing systems. We analyze the correctness of our algorithm, and conduct an intensive experimental study to investigate the communication and computation cost of our distributed algorithm, which has not been shown in previous work. The results demonstrate the scalability and the correctness of our distributed Infomap algorithm with large-scale real-world datasets.",
keywords = "Accuracy, Community detection, Infomap, Large graphs, Scalability",
author = "Jianping Zeng and Hongfeng Yu",
year = "2018",
month = "8",
day = "13",
doi = "10.1145/3225058.3225137",
language = "English (US)",
isbn = "9781450365109",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
booktitle = "Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018",

}

TY - GEN

T1 - A distributed infomap algorithm for scalable and high-quality community detection

AU - Zeng, Jianping

AU - Yu, Hongfeng

PY - 2018/8/13

Y1 - 2018/8/13

AB - Community detection is essential to various graph analysis applications. Infomap is a graph clustering algorithm capable of achieving high-quality communities. However, it remains a very challenging problem to effectively apply Infomap on large graphs. By analyzing communication and workload patterns of Infomap and leveraging a distributed delegate partitioning and distribution method, we develop a new heuristic strategy to carefully coordinate the community constitution from the vertices of a graph in a distributed environment, and achieve the convergence of the distributed clustering algorithm. We have implemented our optimized algorithm using MPI (Message Passing Interface), which can be easily employed or extended to massively distributed computing systems. We analyze the correctness of our algorithm, and conduct an intensive experimental study to investigate the communication and computation cost of our distributed algorithm, which has not been shown in previous work. The results demonstrate the scalability and the correctness of our distributed Infomap algorithm with large-scale real-world datasets.

KW - Accuracy

KW - Community detection

KW - Infomap

KW - Large graphs

KW - Scalability

UR - http://www.scopus.com/inward/record.url?scp=85054803601&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054803601&partnerID=8YFLogxK

U2 - 10.1145/3225058.3225137

DO - 10.1145/3225058.3225137

M3 - Conference contribution

SN - 9781450365109

T3 - ACM International Conference Proceeding Series

BT - Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018

PB - Association for Computing Machinery

ER -