Feature selection and effective classifiers

Jitender S Deogun, Suresh K. Choubey, Vijay V. Raghavan, Hayri Sever

Research output: Contribution to journal › Article

39 Citations (Scopus)

Abstract

In this article, we develop and analyze four algorithms for feature selection in the context of rough set methodology. The initial state and the feasibility criterion of all these algorithms are the same: they start with a given feature set and progressively remove features, while controlling the amount of degradation in classification quality. The algorithms differ, however, in the heuristics used for pruning the search space of features. Our experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. Our experiments demonstrate that a θ-reduct of a given feature set can be found efficiently. Although we have adopted upper classifiers in our investigations, the algorithms presented can be used with any method of deriving a classifier in which the quality of classification is a monotonically decreasing function of the size of the feature set. We compare the performance of upper classifiers with that of lower classifiers and find that upper classifiers perform better than lower classifiers for a duodenal ulcer data set. This should generally be true when there is a small number of elements in the boundary region. An upper classifier has some important features that make it suitable for data mining applications. In particular, we show that upper classifiers can be summarized at a desired level of abstraction by using extended decision tables. We also point out that an upper classifier results in an inconsistent decision algorithm, which can be interpreted deterministically or non-deterministically to obtain a consistent decision algorithm.
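The backward-elimination scheme the abstract describes can be sketched as follows. This is a hypothetical, minimal Python illustration, not the paper's implementation: it uses the rough-set dependency degree as the classification-quality measure and a single fixed removal order in place of the four pruning heuristics the paper compares, with `theta` bounding the tolerated degradation relative to the full feature set.

```python
def dependency(table, features, decision):
    """Rough-set dependency degree gamma(B, d): the fraction of rows whose
    values on the feature subset B determine the decision unambiguously."""
    blocks = {}
    for row in table:
        key = tuple(row[f] for f in features)
        blocks.setdefault(key, []).append(row[decision])
    consistent = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return consistent / len(table)

def theta_reduct(table, features, decision, theta):
    """Greedy backward elimination: drop a feature whenever the quality of
    classification stays within a factor theta of the full-set baseline.
    The removal order here is a placeholder for the paper's heuristics."""
    baseline = dependency(table, features, decision)
    current = list(features)
    for f in sorted(features):
        trial = [g for g in current if g != f]
        if trial and dependency(table, trial, decision) >= theta * baseline:
            current = trial
    return current
```

Because the dependency degree can only decrease (or stay equal) as features are removed, it satisfies the monotonicity condition the abstract states; any other quality measure with that property could be substituted.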

Original language: English (US)
Pages (from-to): 423-434
Number of pages: 12
Journal: Journal of the American Society for Information Science
Volume: 49
Issue number: 5
State: Published - Dec 1 1998


ASJC Scopus subject areas

  • Engineering(all)

Cite this

Deogun, J. S., Choubey, S. K., Raghavan, V. V., & Sever, H. (1998). Feature selection and effective classifiers. Journal of the American Society for Information Science, 49(5), 423-434.

