Investigation of the support vector machine algorithm to predict lung radiation-induced pneumonitis

Shifeng Chen, Sumin Zhou, Fang Fang Yin, Lawrence B. Marks, Shiva K. Das

Research output: Contribution to journal › Article

49 Citations (Scopus)

Abstract

The purpose of this study is to build and test a support vector machine (SVM) model to predict the occurrence of lung radiation-induced Grade 2+ pneumonitis. SVM is a sophisticated statistical technique capable of separating the two categories of patients (with/without pneumonitis) using a boundary defined by a complex hypersurface. Despite the complexity, the SVM boundary is only minimally influenced by outliers that are difficult to separate. By contrast, the simple hyperplane boundary computed by the more commonly used and related linear discriminant analysis method is heavily influenced by outliers. Two SVM models were built using data from 219 patients with lung cancer treated using radiotherapy (34 diagnosed with pneumonitis). One model (SVMall) selected input features from all dose and non-dose factors. For comparison, the other model (SVMdose) selected input features only from lung dose-volume factors. Model predictive ability was evaluated using ten-fold cross-validation and receiver operating characteristic (ROC) analysis. For the model SVMall, the area under the cross-validated ROC curve was 0.76 (sensitivity/specificity = 74%/75%). Compared to the corresponding SVMdose area of 0.71 (sensitivity/specificity = 68%/68%), the predictive ability of SVMall was improved, indicating that non-dose features are important contributors to separating patients with and without pneumonitis. Among the input features selected by model SVMall, the two with the highest importance for predicting lung pneumonitis were: (a) generalized equivalent uniform doses close to the mean lung dose, and (b) chemotherapy prior to radiotherapy. The model SVMall is publicly available via internet access.
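As an illustration of two quantities the abstract names, the sketch below computes a generalized equivalent uniform dose (gEUD) and the ROC area with sensitivity and specificity at a threshold. It is not the authors' implementation: it assumes the commonly used definition gEUD = (sum_i v_i * D_i^a)^(1/a), where v_i is the fractional volume receiving dose D_i, so that a = 1 reduces to the mean dose. The function names, the three-bin histogram, and the six patient scores are hypothetical. The SVM itself, its feature selection, and the ten-fold cross-validation are not reproduced here.

```python
from typing import Sequence, Tuple


def geud(doses: Sequence[float], volumes: Sequence[float], a: float) -> float:
    """Generalized equivalent uniform dose of a differential dose-volume histogram.

    doses   -- dose of each histogram bin (Gy)
    volumes -- volume in each bin (any unit; normalized to fractions here)
    a       -- nonzero volume-effect exponent; a = 1 gives the mean dose
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    total = sum(volumes)
    return sum(v / total * d ** a for d, v in zip(doses, volumes)) ** (1.0 / a)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Fraction of positives at or above, and negatives below, the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


# Hypothetical three-bin lung histogram: doses in Gy, equal bin volumes.
doses = [5.0, 20.0, 35.0]
volumes = [1.0, 1.0, 1.0]
mean_dose = geud(doses, volumes, a=1.0)

# Hypothetical model outputs for six patients (1 = pneumonitis).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
```

With a slightly above or below 1, `geud` stays close to the mean dose while weighting high-dose or low-dose bins somewhat more, which is one way to read "generalized equivalent uniform doses close to the mean lung dose".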

Original language: English (US)
Pages (from-to): 3808-3814
Number of pages: 7
Journal: Medical Physics
Volume: 34
Issue number: 10
DOI: 10.1118/1.2776669
State: Published - Jan 1 2007

Fingerprint

Radiation Pneumonitis
Pneumonia
Lung
ROC Curve
Radiotherapy
Discriminant Analysis
Internet
Lung Neoplasms
Support Vector Machine
Radiation
Drug Therapy

Keywords

  • Modeling
  • Prediction
  • Radiation pneumonitis
  • Support vector machines

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

Cite this

Investigation of the support vector machine algorithm to predict lung radiation-induced pneumonitis. / Chen, Shifeng; Zhou, Sumin; Yin, Fang Fang; Marks, Lawrence B.; Das, Shiva K.

In: Medical physics, Vol. 34, No. 10, 01.01.2007, p. 3808-3814.

@article{427a55cb05fd42df9487d5289a226415,
title = "Investigation of the support vector machine algorithm to predict lung radiation-induced pneumonitis",
abstract = "The purpose of this study is to build and test a support vector machine (SVM) model to predict the occurrence of lung radiation-induced Grade 2+ pneumonitis. SVM is a sophisticated statistical technique capable of separating the two categories of patients (with/without pneumonitis) using a boundary defined by a complex hypersurface. Despite the complexity, the SVM boundary is only minimally influenced by outliers that are difficult to separate. By contrast, the simple hyperplane boundary computed by the more commonly used and related linear discriminant analysis method is heavily influenced by outliers. Two SVM models were built using data from 219 patients with lung cancer treated using radiotherapy (34 diagnosed with pneumonitis). One model (SVMall) selected input features from all dose and non-dose factors. For comparison, the other model (SVMdose) selected input features only from lung dose-volume factors. Model predictive ability was evaluated using ten-fold cross-validation and receiver operating characteristic (ROC) analysis. For the model SVMall, the area under the cross-validated ROC curve was 0.76 (sensitivity/specificity = 74{\%}/75{\%}). Compared to the corresponding SVMdose area of 0.71 (sensitivity/specificity = 68{\%}/68{\%}), the predictive ability of SVMall was improved, indicating that non-dose features are important contributors to separating patients with and without pneumonitis. Among the input features selected by model SVMall, the two with the highest importance for predicting lung pneumonitis were: (a) generalized equivalent uniform doses close to the mean lung dose, and (b) chemotherapy prior to radiotherapy. The model SVMall is publicly available via internet access.",
keywords = "Modeling, Prediction, Radiation pneumonitis, Support vector machines",
author = "Shifeng Chen and Sumin Zhou and Yin, {Fang Fang} and Marks, {Lawrence B.} and Das, {Shiva K.}",
year = "2007",
month = "1",
day = "1",
doi = "10.1118/1.2776669",
language = "English (US)",
volume = "34",
pages = "3808--3814",
journal = "Medical Physics",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "10",

}

TY - JOUR

T1 - Investigation of the support vector machine algorithm to predict lung radiation-induced pneumonitis

AU - Chen, Shifeng

AU - Zhou, Sumin

AU - Yin, Fang Fang

AU - Marks, Lawrence B.

AU - Das, Shiva K.

PY - 2007/1/1

Y1 - 2007/1/1

N2 - The purpose of this study is to build and test a support vector machine (SVM) model to predict the occurrence of lung radiation-induced Grade 2+ pneumonitis. SVM is a sophisticated statistical technique capable of separating the two categories of patients (with/without pneumonitis) using a boundary defined by a complex hypersurface. Despite the complexity, the SVM boundary is only minimally influenced by outliers that are difficult to separate. By contrast, the simple hyperplane boundary computed by the more commonly used and related linear discriminant analysis method is heavily influenced by outliers. Two SVM models were built using data from 219 patients with lung cancer treated using radiotherapy (34 diagnosed with pneumonitis). One model (SVMall) selected input features from all dose and non-dose factors. For comparison, the other model (SVMdose) selected input features only from lung dose-volume factors. Model predictive ability was evaluated using ten-fold cross-validation and receiver operating characteristic (ROC) analysis. For the model SVMall, the area under the cross-validated ROC curve was 0.76 (sensitivity/specificity = 74%/75%). Compared to the corresponding SVMdose area of 0.71 (sensitivity/specificity = 68%/68%), the predictive ability of SVMall was improved, indicating that non-dose features are important contributors to separating patients with and without pneumonitis. Among the input features selected by model SVMall, the two with the highest importance for predicting lung pneumonitis were: (a) generalized equivalent uniform doses close to the mean lung dose, and (b) chemotherapy prior to radiotherapy. The model SVMall is publicly available via internet access.

KW - Modeling

KW - Prediction

KW - Radiation pneumonitis

KW - Support vector machines

UR - http://www.scopus.com/inward/record.url?scp=34748905317&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34748905317&partnerID=8YFLogxK

U2 - 10.1118/1.2776669

DO - 10.1118/1.2776669

M3 - Article

C2 - 17985626

AN - SCOPUS:34748905317

VL - 34

SP - 3808

EP - 3814

JO - Medical Physics

JF - Medical Physics

SN - 0094-2405

IS - 10

ER -