Entering the black box of neural networks

A descriptive study of clinical variables predicting community-acquired pneumonia

Paul S. Heckerling, B. S. Gerber, Thomas Gerald Tape, Robert Swift Wigton

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Objectives: Artificial neural networks have proved to be accurate predictive instruments in several medical domains, but have been criticized for failing to specify the information upon which their predictions are based. We used methods of relevance analysis and sensitivity analysis to determine the most important predictor variables for a validated neural network for community-acquired pneumonia. Methods: We studied a feed-forward, back-propagation neural network trained to predict pneumonia among patients presenting to an emergency department with fever or respiratory complaints. We used the methods of full retraining, weight elimination, constant substitution, linear substitution, and data permutation to identify a consensus set of important demographic, symptom, sign, and comorbidity predictors that influenced network output for pneumonia. We compared predictors identified by these methods to those identified by a weight propagation analysis based on the matrices of the network, and by logistic regression. Results: Predictors identified by these methods were clinically plausible, and were concordant with those identified by weight analysis, and by logistic regression using the same data. The methods were highly correlated in network error, and led to variable sets with errors below bootstrap 95% confidence intervals for networks with similar numbers of inputs. Scores for variable relevance tended to be higher with methods that precluded network retraining (weight elimination) or that permuted variable values (data permutation), compared with methods that permitted retraining (full retraining) or that approximated its effects (constant and linear substitution). Conclusion: Methods of relevance analysis and sensitivity analysis are useful for identifying important predictor variables used by artificial neural networks.
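As a rough illustration of two of the methods named in the abstract, constant substitution and data permutation, the sketch below scores each input variable by how much the model's error rises when that variable is neutralized. It is not the authors' implementation: the single logistic unit standing in for the trained network, the synthetic data, the choice of the column mean as the substituted constant, mean squared error as the error measure, and all function names are assumptions made for this example.

```python
import math
import random


def predict(row, weights, bias):
    """Hypothetical stand-in for a trained network: one logistic unit."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def network_error(rows, labels, weights, bias):
    """Mean squared error of the model's outputs against the labels."""
    squared = [(predict(r, weights, bias) - y) ** 2 for r, y in zip(rows, labels)]
    return sum(squared) / len(squared)


def replace_column(rows, col, values):
    """Return copies of the rows with column `col` set to the given values."""
    return [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, values)]


def constant_substitution(rows, labels, weights, bias, col):
    """Relevance = error increase when `col` is fixed at its mean (assumed constant)."""
    mean = sum(r[col] for r in rows) / len(rows)
    altered = replace_column(rows, col, [mean] * len(rows))
    baseline = network_error(rows, labels, weights, bias)
    return network_error(altered, labels, weights, bias) - baseline


def data_permutation(rows, labels, weights, bias, col, rng, repeats=20):
    """Relevance = mean error increase when `col` is shuffled across cases."""
    baseline = network_error(rows, labels, weights, bias)
    total = 0.0
    for _ in range(repeats):
        values = [r[col] for r in rows]
        rng.shuffle(values)
        altered = replace_column(rows, col, values)
        total += network_error(altered, labels, weights, bias) - baseline
    return total / repeats


def demo(seed=0):
    """Score three synthetic variables; the third has zero weight by construction."""
    rng = random.Random(seed)
    weights, bias = [3.0, 1.0, 0.0], 0.0
    rows = [[rng.uniform(-1.0, 1.0) for _ in weights] for _ in range(200)]
    # Synthetic labels derived from the stand-in model itself (not clinical data).
    labels = [1 if predict(r, weights, bias) > 0.5 else 0 for r in rows]
    return {
        col: (
            constant_substitution(rows, labels, weights, bias, col),
            data_permutation(rows, labels, weights, bias, col, rng),
        )
        for col in range(len(weights))
    }


if __name__ == "__main__":
    for col, (const_score, perm_score) in demo().items():
        print(f"variable {col}: constant={const_score:.4f} permutation={perm_score:.4f}")
```

Neither method retrains the model, so both are cheap to run. The sketch makes no attempt to reproduce the study's finding about which methods yield higher relevance scores. The zero-weight variable is included only as a sanity check that an input the model ignores receives a score of zero.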

Original language: English (US)
Pages (from-to): 287-296
Number of pages: 10
Journal: Methods of Information in Medicine
Volume: 42
Issue number: 3
State: Published - Aug 1 2003

Fingerprint

Pneumonia
Weights and Measures
Logistic Models
Clinical Studies
Signs and Symptoms
Hospital Emergency Service
Comorbidity
Fever
Demography
Confidence Intervals

Keywords

  • Analysis
  • Diagnosis computer-assisted
  • Neural networks (computer)
  • Pneumonia
  • Sensitivity

ASJC Scopus subject areas

  • Health Informatics
  • Advanced and Specialized Nursing
  • Health Information Management

Cite this

Entering the black box of neural networks: A descriptive study of clinical variables predicting community-acquired pneumonia. / Heckerling, Paul S.; Gerber, B. S.; Tape, Thomas Gerald; Wigton, Robert Swift.

In: Methods of Information in Medicine, Vol. 42, No. 3, 01.08.2003, p. 287-296.


@article{486a033996ef45cdb24105c0dee5cb1f,
title = "Entering the black box of neural networks: A descriptive study of clinical variables predicting community-acquired pneumonia",
abstract = "Objectives: Artificial neural networks have proved to be accurate predictive instruments in several medical domains, but have been criticized for failing to specify the information upon which their predictions are based. We used methods of relevance analysis and sensitivity analysis to determine the most important predictor variables for a validated neural network for community-acquired pneumonia. Methods: We studied a feed-forward, back-propagation neural network trained to predict pneumonia among patients presenting to an emergency department with fever or respiratory complaints. We used the methods of full retraining, weight elimination, constant substitution, linear substitution, and data permutation to identify a consensus set of important demographic, symptom, sign, and comorbidity predictors that influenced network output for pneumonia. We compared predictors identified by these methods to those identified by a weight propagation analysis based on the matrices of the network, and by logistic regression. Results: Predictors identified by these methods were clinically plausible, and were concordant with those identified by weight analysis, and by logistic regression using the same data. The methods were highly correlated in network error, and led to variable sets with errors below bootstrap 95{\%} confidence intervals for networks with similar numbers of inputs. Scores for variable relevance tended to be higher with methods that precluded network retraining (weight elimination) or that permuted variable values (data permutation), compared with methods that permitted retraining (full retraining) or that approximated its effects (constant and linear substitution). Conclusion: Methods of relevance analysis and sensitivity analysis are useful for identifying important predictor variables used by artificial neural networks.",
keywords = "Analysis, Diagnosis computer-assisted, Neural networks (computer), Pneumonia, Sensitivity",
author = "Heckerling, {Paul S.} and Gerber, {B. S.} and Tape, {Thomas Gerald} and Wigton, {Robert Swift}",
year = "2003",
month = "8",
day = "1",
language = "English (US)",
volume = "42",
pages = "287--296",
journal = "Methods of Information in Medicine",
issn = "0026-1270",
publisher = "Schattauer GmbH",
number = "3",

}

TY - JOUR

T1 - Entering the black box of neural networks

T2 - A descriptive study of clinical variables predicting community-acquired pneumonia

AU - Heckerling, Paul S.

AU - Gerber, B. S.

AU - Tape, Thomas Gerald

AU - Wigton, Robert Swift

PY - 2003/8/1

Y1 - 2003/8/1

N2 - Objectives: Artificial neural networks have proved to be accurate predictive instruments in several medical domains, but have been criticized for failing to specify the information upon which their predictions are based. We used methods of relevance analysis and sensitivity analysis to determine the most important predictor variables for a validated neural network for community-acquired pneumonia. Methods: We studied a feed-forward, back-propagation neural network trained to predict pneumonia among patients presenting to an emergency department with fever or respiratory complaints. We used the methods of full retraining, weight elimination, constant substitution, linear substitution, and data permutation to identify a consensus set of important demographic, symptom, sign, and comorbidity predictors that influenced network output for pneumonia. We compared predictors identified by these methods to those identified by a weight propagation analysis based on the matrices of the network, and by logistic regression. Results: Predictors identified by these methods were clinically plausible, and were concordant with those identified by weight analysis, and by logistic regression using the same data. The methods were highly correlated in network error, and led to variable sets with errors below bootstrap 95% confidence intervals for networks with similar numbers of inputs. Scores for variable relevance tended to be higher with methods that precluded network retraining (weight elimination) or that permuted variable values (data permutation), compared with methods that permitted retraining (full retraining) or that approximated its effects (constant and linear substitution). Conclusion: Methods of relevance analysis and sensitivity analysis are useful for identifying important predictor variables used by artificial neural networks.

AB - Objectives: Artificial neural networks have proved to be accurate predictive instruments in several medical domains, but have been criticized for failing to specify the information upon which their predictions are based. We used methods of relevance analysis and sensitivity analysis to determine the most important predictor variables for a validated neural network for community-acquired pneumonia. Methods: We studied a feed-forward, back-propagation neural network trained to predict pneumonia among patients presenting to an emergency department with fever or respiratory complaints. We used the methods of full retraining, weight elimination, constant substitution, linear substitution, and data permutation to identify a consensus set of important demographic, symptom, sign, and comorbidity predictors that influenced network output for pneumonia. We compared predictors identified by these methods to those identified by a weight propagation analysis based on the matrices of the network, and by logistic regression. Results: Predictors identified by these methods were clinically plausible, and were concordant with those identified by weight analysis, and by logistic regression using the same data. The methods were highly correlated in network error, and led to variable sets with errors below bootstrap 95% confidence intervals for networks with similar numbers of inputs. Scores for variable relevance tended to be higher with methods that precluded network retraining (weight elimination) or that permuted variable values (data permutation), compared with methods that permitted retraining (full retraining) or that approximated its effects (constant and linear substitution). Conclusion: Methods of relevance analysis and sensitivity analysis are useful for identifying important predictor variables used by artificial neural networks.

KW - Analysis

KW - Diagnosis computer-assisted

KW - Neural networks (computer)

KW - Pneumonia

KW - Sensitivity

UR - http://www.scopus.com/inward/record.url?scp=0037699357&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037699357&partnerID=8YFLogxK

M3 - Article

VL - 42

SP - 287

EP - 296

JO - Methods of Information in Medicine

JF - Methods of Information in Medicine

SN - 0026-1270

IS - 3

ER -