Seed selection algorithm through K-means on optimal number of clusters

Kuntal Chowdhury, Debasis Chaudhuri, Arup Kumar Pal, Ashok Samal

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

Clustering is an important unsupervised learning technique in data mining that groups similar features. The growing point of a cluster is known as a seed. Selecting an appropriate seed for each cluster is an important step in any seed-based clustering technique. The performance of seed-based algorithms depends on the initial cluster center selection and on the optimal number of clusters in an unknown data set. Cluster quality and the optimal number of clusters are important issues in cluster analysis. In this paper, the proposed seed point selection algorithm is applied to 3-band image data and 2D discrete data. The algorithm selects seed points by maximizing the joint probability of pixel intensities subject to a distance restriction criterion. The optimal number of clusters is decided on the basis of a combination of seven different cluster validity indices. We compare the results of our proposed seed selection algorithm, run through K-means clustering on the optimal number of clusters, with those of other classical seed selection algorithms applied through K-means clustering in terms of seed generation time (SGT), cluster building time (CBT), segmentation entropy, and the number of K-means iterations (NOT_K-means). We also compare the CPU time and number of iterations of our proposed seed selection method with those of other clustering algorithms.
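The abstract gives only the outline of the seed selection idea, so the following Python sketch is an illustration under stated assumptions, not the authors' algorithm. It assumes the joint probability of a pixel is the empirical frequency of its full intensity vector, that the distance restriction is a minimum Euclidean distance between chosen seeds, and that the seeds are then passed to plain Lloyd's K-means. The names `select_seeds`, `kmeans`, and `min_dist` are hypothetical.

```python
import numpy as np


def select_seeds(points, k, min_dist):
    """Greedy seed picker (assumed reading of the abstract): take the most
    probable distinct points first, skipping any candidate closer than
    min_dist to a seed that was already chosen."""
    points = np.asarray(points)
    # Empirical joint probability of each distinct point (e.g. an RGB triple).
    values, counts = np.unique(points, axis=0, return_counts=True)
    probs = counts / counts.sum()
    seeds = []
    # Visit candidates from most to least probable; stable sort fixes tie order.
    for idx in np.argsort(-probs, kind="stable"):
        cand = values[idx].astype(float)
        if all(np.linalg.norm(cand - s) >= min_dist for s in seeds):
            seeds.append(cand)
        if len(seeds) == k:
            break
    return np.array(seeds)


def kmeans(points, seeds, max_iter=100):
    """Plain Lloyd's K-means started from the given seeds."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(seeds, dtype=float).copy()
    for n_iter in range(1, max_iter + 1):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels, n_iter
```

In this sketch `select_seeds` can return fewer than `k` seeds when `min_dist` is too large for the data, so a caller would need to check the number of rows returned before running `kmeans`. The iteration count returned by `kmeans` corresponds loosely to the number-of-iterations measure mentioned in the abstract.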

Original language: English (US)
Pages (from-to): 18617-18651
Number of pages: 35
Journal: Multimedia Tools and Applications
Volume: 78
Issue number: 13
DOI: 10.1007/s11042-018-7100-4
State: Published - Jul 15 2019

Fingerprint

Seed
Unsupervised learning
Cluster analysis
Clustering algorithms
Program processors
Data mining
Entropy
Pixels

Keywords

  • Cluster building time
  • Cluster validity indices
  • Clustering
  • Joint probability
  • K-means
  • Seed generation time
  • Seed point
  • Segmentation entropy

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Seed selection algorithm through K-means on optimal number of clusters. / Chowdhury, Kuntal; Chaudhuri, Debasis; Pal, Arup Kumar; Samal, Ashok.

In: Multimedia Tools and Applications, Vol. 78, No. 13, 15.07.2019, p. 18617-18651.

Chowdhury, Kuntal ; Chaudhuri, Debasis ; Pal, Arup Kumar ; Samal, Ashok. / Seed selection algorithm through K-means on optimal number of clusters. In: Multimedia Tools and Applications. 2019 ; Vol. 78, No. 13. pp. 18617-18651.
@article{2cd81b23301a409d890f0fa6ed2377c6,
title = "Seed selection algorithm through K-means on optimal number of clusters",
keywords = "Cluster building time, Cluster validity indices, Clustering, Joint probability, K-means, Seed generation time, Seed point, Segmentation entropy",
author = "Kuntal Chowdhury and Debasis Chaudhuri and Pal, {Arup Kumar} and Ashok Samal",
year = "2019",
month = "7",
day = "15",
doi = "10.1007/s11042-018-7100-4",
language = "English (US)",
volume = "78",
pages = "18617--18651",
journal = "Multimedia Tools and Applications",
issn = "1380-7501",
publisher = "Springer Netherlands",
number = "13",

}

TY - JOUR

T1 - Seed selection algorithm through K-means on optimal number of clusters

AU - Chowdhury, Kuntal

AU - Chaudhuri, Debasis

AU - Pal, Arup Kumar

AU - Samal, Ashok

PY - 2019/7/15

Y1 - 2019/7/15

KW - Cluster building time

KW - Cluster validity indices

KW - Clustering

KW - Joint probability

KW - K-means

KW - Seed generation time

KW - Seed point

KW - Segmentation entropy

UR - http://www.scopus.com/inward/record.url?scp=85066150147&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066150147&partnerID=8YFLogxK

U2 - 10.1007/s11042-018-7100-4

DO - 10.1007/s11042-018-7100-4

M3 - Article

AN - SCOPUS:85066150147

VL - 78

SP - 18617

EP - 18651

JO - Multimedia Tools and Applications

JF - Multimedia Tools and Applications

SN - 1380-7501

IS - 13

ER -