Selection of predictor variables for pneumonia using neural networks and genetic algorithms

Paul S. Heckerling, B. S. Gerber, Thomas Gerald Tape, Robert Swift Wigton

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Background: Artificial neural networks (ANN) can be used to select sets of predictor variables that incorporate nonlinear interactions between variables. We used a genetic algorithm, with selection based on maximizing network accuracy and minimizing network input-layer cardinality, to evolve parsimonious sets of variables for predicting community-acquired pneumonia among patients with respiratory complaints. Methods: ANN were trained on data from 1044 patients in a training cohort, and were applied to 116 patients in a testing cohort. Chromosomes with binary genes representing input-layer variables were operated on by crossover recombination, mutation, and probabilistic selection based on a fitness function incorporating both network accuracy and input-layer cardinality. Results: The genetic algorithm evolved best 10-variable sets that discriminated pneumonia in the training cohort (ROC areas, 0.838 for selection based on average cross entropy (ENT); 0.954 for selection based on ROC area (ROC)), and in the testing cohort (ROC areas, 0.847 for ENT selection; 0.963 for ROC selection), with no significant differences between cohorts. Best variable sets based on the genetic algorithm using ROC selection discriminated pneumonia more accurately than variable sets based on stepwise neural networks (ROC areas, 0.954 versus 0.879, p = 0.030), or stepwise logistic regression (ROC areas, 0.954 versus 0.830, p = 0.000). Variable sets of lower cardinalities were also evolved, which also accurately discriminated pneumonia. Conclusion: Variable sets derived using a genetic algorithm for neural networks accurately discriminated pneumonia from other respiratory conditions, and did so with greater accuracy than variables derived using stepwise neural networks or logistic regression in some cases.
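The variable-selection loop described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the fitness formula, population size, mutation rate, or network architecture, so the linear cardinality penalty, all default parameter values, and the helper names (`roc_area`, `fitness`, `crossover`, `mutate`, `evolve`, `make_toy_scorer`) are hypothetical. In place of a trained neural network, a toy scorer sums the selected columns of synthetic data and reports the ROC area of that sum.

```python
import random

def roc_area(scores, labels):
    """ROC area as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fitness(chromosome, accuracy_fn, penalty=0.01):
    """Assumed fitness: accuracy minus a linear penalty per selected input.
    A small positive floor keeps selection weights valid."""
    if not any(chromosome):
        return 1e-9
    return max(accuracy_fn(chromosome) - penalty * sum(chromosome), 1e-9)

def crossover(a, b, rng):
    """Single-point crossover recombination of two binary chromosomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(chromosome, rate, rng):
    """Flip each binary gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def evolve(n_vars, accuracy_fn, pop_size=20, generations=20, rate=0.05, seed=0):
    """Evolve binary masks over the input variables; return (best mask, fitness)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(generations):
        fits = [fitness(c, accuracy_fn) for c in pop]
        for c, f in zip(pop, fits):
            if f > best_fit:
                best, best_fit = c, f
        children = []
        for _ in range(pop_size):
            # Probabilistic (fitness-proportional) selection of two parents.
            a, b = rng.choices(pop, weights=fits, k=2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = children
    return best, best_fit

def make_toy_scorer(n_vars=8, n_informative=3, n_rows=60, seed=1):
    """Hypothetical stand-in for training a network on the selected inputs."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_rows)]  # guarantees both classes are present
    rows = [[(y if j < n_informative else 0) + rng.gauss(0, 1)
             for j in range(n_vars)] for y in labels]

    def accuracy_fn(chromosome):
        scores = [sum(x for x, g in zip(row, chromosome) if g) for row in rows]
        return roc_area(scores, labels)

    return accuracy_fn

if __name__ == "__main__":
    mask, fit = evolve(8, make_toy_scorer())
    print("selected variables:", [i for i, g in enumerate(mask) if g], "fitness:", fit)
```

In a setting closer to the study, `accuracy_fn` would train a network on the variables switched on by the chromosome and return either its ROC area or a score derived from average cross entropy, the two selection criteria named in the Results.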

Original language: English (US)
Pages (from-to): 89-97
Number of pages: 9
Journal: Methods of Information in Medicine
Volume: 44
Issue number: 1
State: Published - Apr 18 2005

Fingerprint

Pneumonia
Entropy
Logistic Models
Genetic Recombination
Chromosomes
Mutation
Genes

Keywords

  • Artificial neural networks
  • Genetic algorithms
  • Pneumonia

ASJC Scopus subject areas

  • Health Informatics
  • Advanced and Specialized Nursing
  • Health Information Management

Cite this

Selection of predictor variables for pneumonia using neural networks and genetic algorithms. / Heckerling, Paul S.; Gerber, B. S.; Tape, Thomas Gerald; Wigton, Robert Swift.

In: Methods of Information in Medicine, Vol. 44, No. 1, 18.04.2005, p. 89-97.

@article{eb3dfb32b95047778df584a2ce792c2b,
title = "Selection of predictor variables for pneumonia using neural networks and genetic algorithms",
abstract = "Background: Artificial neural networks (ANN) can be used to select sets of predictor variables that incorporate nonlinear interactions between variables. We used a genetic algorithm, with selection based on maximizing network accuracy and minimizing network input-layer cardinality, to evolve parsimonious sets of variables for predicting community-acquired pneumonia among patients with respiratory complaints. Methods: ANN were trained on data from 1044 patients in a training cohort, and were applied to 116 patients in a testing cohort. Chromosomes with binary genes representing input-layer variables were operated on by crossover recombination, mutation, and probabilistic selection based on a fitness function incorporating both network accuracy and input-layer cardinality. Results: The genetic algorithm evolved best 10-variable sets that discriminated pneumonia in the training cohort (ROC areas, 0.838 for selection based on average cross entropy (ENT); 0.954 for selection based on ROC area (ROC)), and in the testing cohort (ROC areas, 0.847 for ENT selection; 0.963 for ROC selection), with no significant differences between cohorts. Best variable sets based on the genetic algorithm using ROC selection discriminated pneumonia more accurately than variable sets based on stepwise neural networks (ROC areas, 0.954 versus 0.879, p = 0.030), or stepwise logistic regression (ROC areas, 0.954 versus 0.830, p = 0.000). Variable sets of lower cardinalities were also evolved, which also accurately discriminated pneumonia. Conclusion: Variable sets derived using a genetic algorithm for neural networks accurately discriminated pneumonia from other respiratory conditions, and did so with greater accuracy than variables derived using stepwise neural networks or logistic regression in some cases.",
keywords = "Artificial neural networks, Genetic algorithms, Pneumonia",
author = "Heckerling, {Paul S.} and Gerber, {B. S.} and Tape, {Thomas Gerald} and Wigton, {Robert Swift}",
year = "2005",
month = "4",
day = "18",
language = "English (US)",
volume = "44",
pages = "89--97",
journal = "Methods of Information in Medicine",
issn = "0026-1270",
publisher = "Schattauer GmbH",
number = "1",

}

TY - JOUR

T1 - Selection of predictor variables for pneumonia using neural networks and genetic algorithms

AU - Heckerling, Paul S.

AU - Gerber, B. S.

AU - Tape, Thomas Gerald

AU - Wigton, Robert Swift

PY - 2005/4/18

Y1 - 2005/4/18

AB - Background: Artificial neural networks (ANN) can be used to select sets of predictor variables that incorporate nonlinear interactions between variables. We used a genetic algorithm, with selection based on maximizing network accuracy and minimizing network input-layer cardinality, to evolve parsimonious sets of variables for predicting community-acquired pneumonia among patients with respiratory complaints. Methods: ANN were trained on data from 1044 patients in a training cohort, and were applied to 116 patients in a testing cohort. Chromosomes with binary genes representing input-layer variables were operated on by crossover recombination, mutation, and probabilistic selection based on a fitness function incorporating both network accuracy and input-layer cardinality. Results: The genetic algorithm evolved best 10-variable sets that discriminated pneumonia in the training cohort (ROC areas, 0.838 for selection based on average cross entropy (ENT); 0.954 for selection based on ROC area (ROC)), and in the testing cohort (ROC areas, 0.847 for ENT selection; 0.963 for ROC selection), with no significant differences between cohorts. Best variable sets based on the genetic algorithm using ROC selection discriminated pneumonia more accurately than variable sets based on stepwise neural networks (ROC areas, 0.954 versus 0.879, p = 0.030), or stepwise logistic regression (ROC areas, 0.954 versus 0.830, p = 0.000). Variable sets of lower cardinalities were also evolved, which also accurately discriminated pneumonia. Conclusion: Variable sets derived using a genetic algorithm for neural networks accurately discriminated pneumonia from other respiratory conditions, and did so with greater accuracy than variables derived using stepwise neural networks or logistic regression in some cases.

KW - Artificial neural networks

KW - Genetic algorithms

KW - Pneumonia

UR - http://www.scopus.com/inward/record.url?scp=16244378865&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=16244378865&partnerID=8YFLogxK

M3 - Article

VL - 44

SP - 89

EP - 97

JO - Methods of Information in Medicine

JF - Methods of Information in Medicine

SN - 0026-1270

IS - 1

ER -