A new clustering strategy with stochastic merging and removing based on kernel functions

Huimin Geng, Hesham H Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, when a bottom-up algorithm assigns two elements to one cluster, they cannot subsequently be separated. Likewise, when a top-down algorithm separates two elements, they cannot be rejoined. This greedy property may lead to premature convergence and, consequently, to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work [1]. SMPC, a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic one, which adds two major advantages. First, in deciding which pair of clusters to merge, the influences of all clusters are quantified by probabilities, which are estimated by kernel functions based on the clusters' relative distances. Second, clustering can be undone to improve performance: when the algorithm detects elements that have low probabilities inside their cluster, it moves them outside. Test results on colon cancer gene-expression data show that SMPC performs better than both deterministic MPC and hierarchical clustering.
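
The first idea in the abstract, choosing the merging pair by probability rather than by a fixed nearest-pair rule, can be illustrated with a minimal sketch. This is not the authors' SMPC implementation. The Gaussian kernel, the use of cluster centroids, the `bandwidth` parameter, and the function names `merge_probabilities` and `sample_merge_pair` are all assumptions made for illustration; the abstract does not specify which kernel or distance is used. The removal step is not covered here.

```python
import math
import random


def merge_probabilities(centroids, bandwidth=1.0):
    """Return {(i, j): probability} for every pair of clusters.

    Assumption: each cluster is summarized by its centroid, and a Gaussian
    kernel turns the squared centroid distance into a merge weight.
    """
    weights = {}
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            d2 = sum((a - b) ** 2 for a, b in zip(centroids[i], centroids[j]))
            weights[(i, j)] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    total = sum(weights.values())
    if total == 0.0:
        # All kernel weights underflowed; fall back to a uniform choice.
        return {pair: 1.0 / len(weights) for pair in weights}
    # Normalize so the weights form a probability distribution over pairs.
    return {pair: w / total for pair, w in weights.items()}


def sample_merge_pair(probs, rng):
    """Draw one cluster pair at random, weighted by its merge probability."""
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=1)[0]


# Hypothetical toy data: two nearby clusters and one distant cluster.
centroids = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
probs = merge_probabilities(centroids)
pair = sample_merge_pair(probs, random.Random(0))
```

In this sketch the closest pair receives the highest probability but is not guaranteed to be chosen, which is the difference between a stochastic merge and a deterministic one. A larger `bandwidth` flattens the distribution and makes distant merges more likely.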

Original language: English (US)
Title of host publication: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
Pages: 41-42
Number of pages: 2
DOIs: https://doi.org/10.1109/CSBW.2005.10
State: Published - Dec 1 2005
Event: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts - Stanford, CA, United States
Duration: Aug 8 2005 → Aug 11 2005

Publication series

Name: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts

Conference

Conference: 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
Country: United States
City: Stanford, CA
Period: 8/8/05 → 8/11/05

Fingerprint

Message passing
Merging
Clustering algorithms
Random processes
Gene expression
Fusion reactions

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Geng, H., & Ali, H. H. (2005). A new clustering strategy with stochastic merging and removing based on kernel functions. In 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts (pp. 41-42). [1540532] (2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts). https://doi.org/10.1109/CSBW.2005.10

A new clustering strategy with stochastic merging and removing based on kernel functions. / Geng, Huimin; Ali, Hesham H.

2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts. 2005. p. 41-42 1540532 (2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts).

Geng, H & Ali, HH 2005, A new clustering strategy with stochastic merging and removing based on kernel functions. in 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts., 1540532, 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts, pp. 41-42, 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts, Stanford, CA, United States, 8/8/05. https://doi.org/10.1109/CSBW.2005.10
Geng H, Ali HH. A new clustering strategy with stochastic merging and removing based on kernel functions. In 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts. 2005. p. 41-42. 1540532. (2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts). https://doi.org/10.1109/CSBW.2005.10
Geng, Huimin ; Ali, Hesham H. / A new clustering strategy with stochastic merging and removing based on kernel functions. 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts. 2005. pp. 41-42 (2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts).
@inproceedings{59d365aedfd149d3b7b80502c323e4de,
title = "A new clustering strategy with stochastic merging and removing based on kernel functions",
abstract = "With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, when two elements in a bottom-up algorithm are assigned to one cluster, they cannot subsequently be separated. Also, when a top-down algorithm separates two elements, they can't be rejoined. Such greedy property may lead to premature convergence and consequently lead to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work [1]. SMPC, as a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic process, adding two major advantages. First, in deciding the merging cluster pair, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Secondly, clustering can be undone to improve the clustering performance when the algorithm detects elements which don't have good probabilities inside the cluster and moves them outside. The test results on colon cancer gene-expression data show that SMPC performs better than the deterministic MPC or hierarchical clustering method.",
author = "Geng, Huimin and Ali, Hesham H.",
year = "2005",
month = "12",
day = "1",
doi = "10.1109/CSBW.2005.10",
language = "English (US)",
isbn = "0769524427",
series = "2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts",
pages = "41--42",
booktitle = "2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts",

}

TY - GEN

T1 - A new clustering strategy with stochastic merging and removing based on kernel functions

AU - Geng, Huimin

AU - Ali, Hesham H

PY - 2005/12/1

Y1 - 2005/12/1

AB - With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, when two elements in a bottom-up algorithm are assigned to one cluster, they cannot subsequently be separated. Also, when a top-down algorithm separates two elements, they can't be rejoined. Such greedy property may lead to premature convergence and consequently lead to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work [1]. SMPC, as a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic process, adding two major advantages. First, in deciding the merging cluster pair, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Secondly, clustering can be undone to improve the clustering performance when the algorithm detects elements which don't have good probabilities inside the cluster and moves them outside. The test results on colon cancer gene-expression data show that SMPC performs better than the deterministic MPC or hierarchical clustering method.

UR - http://www.scopus.com/inward/record.url?scp=33749059189&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33749059189&partnerID=8YFLogxK

U2 - 10.1109/CSBW.2005.10

DO - 10.1109/CSBW.2005.10

M3 - Conference contribution

AN - SCOPUS:33749059189

SN - 0769524427

SN - 9780769524429

T3 - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts

SP - 41

EP - 42

BT - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts

ER -