A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes

Olivier Gayou, Shiva K. Das, Su Min Zhou, Lawrence B. Marks, David S. Parda, Moyed Miften

Research output: Contribution to journal (Article)

15 Citations (Scopus)

Abstract

A given outcome of radiotherapy treatment can be modeled by analyzing its correlation with a combination of dosimetric, physiological, biological, and clinical factors, through a logistic regression fit of a large patient population. The quality of the fit is measured by the combination of the predictive power of this particular set of factors and the statistical significance of the individual factors in the model. We developed a genetic algorithm (GA), in which a small sample of all the possible combinations of variables is fitted to the patient data. New models are derived from the best models, through crossover and mutation operations, and are in turn fitted. The process is repeated until the sample converges to the combination of factors that best predicts the outcome. The GA was tested on a data set that investigated the incidence of lung injury in NSCLC patients treated with 3DCRT. The GA identified a model with two variables as the best predictor of radiation pneumonitis: the V30 (p=0.048) and the ongoing use of tobacco at the time of referral (p=0.074). This two-variable model was confirmed as the best model by analyzing all possible combinations of factors. In conclusion, genetic algorithms provide a reliable and fast way to select significant factors in logistic regression analysis of large clinical studies.
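
The loop described in the abstract (fit a small sample of variable subsets, keep the best, derive new subsets by crossover and mutation, repeat) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the fitness score (negative AIC), the single-point crossover, the population size, the mutation rate, and the function names `fit_logistic`, `fitness`, and `ga_select` are all hypothetical choices. The abstract says only that fit quality combines predictive power with the significance of individual factors.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Newton-Raphson logistic fit; returns (coefficients, log-likelihood)."""
    Xb = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ beta, -30, 30)))
        w = p * (1.0 - p)
        # Tiny ridge term is an assumption, added only for numerical stability.
        hessian = Xb.T @ (Xb * w[:, None]) + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hessian, Xb.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ beta, -30, 30)))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return beta, loglik

def fitness(mask, X, y):
    """Negative AIC of the model using the variables flagged in mask.

    AIC is an assumed stand-in for the paper's fit-quality measure.
    """
    if not mask.any():
        return -np.inf  # an empty model is never selected
    _, loglik = fit_logistic(X[:, mask], y)
    n_params = mask.sum() + 1  # selected variables plus intercept
    return -(2 * n_params - 2 * loglik)

def ga_select(X, y, pop_size=20, n_gen=30, p_mut=0.1, seed=0):
    """Evolve boolean variable masks; returns the best mask found."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    pop = rng.random((pop_size, n_vars)) < 0.5  # random initial sample of models
    for _ in range(n_gen):
        scores = np.array([fitness(m, X, y) for m in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]  # keep the best half unchanged
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), 2, replace=False)]
            cut = rng.integers(1, n_vars)  # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_vars) < p_mut  # flip bits with prob. p_mut
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m, X, y) for m in pop])
    return pop[np.argmax(scores)]
```

Calling `ga_select(X, y)` on a matrix `X` of candidate factors (one column per factor, at least two columns) and a 0/1 outcome vector `y` returns a boolean mask over the columns. Because the best half of each generation is carried over unchanged, the best score in the population never decreases from one generation to the next.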

Original language: English (US)
Pages (from-to): 5426-5433
Number of pages: 8
Journal: Medical Physics
Volume: 35
Issue number: 12
DOI: 10.1118/1.3005974
State: Published - 2008

Fingerprint

Radiotherapy
Logistic Models
Regression Analysis
Radiation Pneumonitis
Biological Factors
Tobacco Use
Lung Injury
Referral and Consultation
Mutation
Incidence
Population

Keywords

  • Genetic algorithm
  • Logistic regression
  • Radiobiological modeling
  • Variable selection

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

Cite this

A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes. / Gayou, Olivier; Das, Shiva K.; Zhou, Su Min; Marks, Lawrence B.; Parda, David S.; Miften, Moyed.

In: Medical physics, Vol. 35, No. 12, 2008, p. 5426-5433.

@article{05bc475636df4cb09c8201a655733a48,
title = "A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes",
keywords = "Genetic algorithm, Logistic regression, Radiobiological modeling, Variable selection",
author = "Olivier Gayou and Das, {Shiva K.} and Zhou, {Su Min} and Marks, {Lawrence B.} and Parda, {David S.} and Moyed Miften",
year = "2008",
doi = "10.1118/1.3005974",
language = "English (US)",
volume = "35",
pages = "5426--5433",
journal = "Medical Physics",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "12",

}

TY - JOUR

T1 - A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes

AU - Gayou, Olivier

AU - Das, Shiva K.

AU - Zhou, Su Min

AU - Marks, Lawrence B.

AU - Parda, David S.

AU - Miften, Moyed

PY - 2008

Y1 - 2008

KW - Genetic algorithm

KW - Logistic regression

KW - Radiobiological modeling

KW - Variable selection

UR - http://www.scopus.com/inward/record.url?scp=56749095591&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56749095591&partnerID=8YFLogxK

U2 - 10.1118/1.3005974

DO - 10.1118/1.3005974

M3 - Article

C2 - 19175102

AN - SCOPUS:56749095591

VL - 35

SP - 5426

EP - 5433

JO - Medical Physics

JF - Medical Physics

SN - 0094-2405

IS - 12

ER -