Cluster analysis in the COPDGene study identifies subtypes of smokers with distinct patterns of airway disease and emphysema

Peter J. Castaldi, Jennifer Dy, James Ross, Yale Chang, George R. Washko, Douglas Curran-Everett, Andre Williams, David A. Lynch, Barry J. Make, James D. Crapo, Russ P. Bowler, Elizabeth A. Regan, John E. Hokanson, Greg L. Kinney, Meilan K. Han, Xavier Soler, Joseph W. Ramsdell, R. Graham Barr, Marilyn Foreman, Edwin Van Beek, Richard Casaburi, Gerald J. Criner, Sharon M. Lutz, Steven I. Rennard, Stephanie Santorico, Frank C. Sciurba, Dawn L. Demeo, Craig P. Hersh, Edwin K. Silverman, Michael H. Cho

Research output: Contribution to journal (Article)

77 Citations (Scopus)

Abstract

Background: There is notable heterogeneity in the clinical presentation of patients with COPD. To characterise this heterogeneity, we sought to identify subgroups of smokers by applying cluster analysis to data from the COPDGene study.

Methods: We applied a clustering method, k-means, to data from 10 192 smokers in the COPDGene study. After splitting the sample into a training and validation set, we evaluated three sets of input features across a range of k (user-specified number of clusters). Stable solutions were tested for association with four COPD-related measures and five genetic variants previously associated with COPD at genome-wide significance. The results were confirmed in the validation set.

Findings: We identified four clusters that can be characterised as (1) relatively resistant smokers (ie, no/mild obstruction and minimal emphysema despite heavy smoking), (2) mild upper zone emphysema-predominant, (3) airway disease-predominant and (4) severe emphysema. All clusters are strongly associated with COPD-related clinical characteristics, including exacerbations and dyspnoea (p<0.001). We found strong genetic associations between the mild upper zone emphysema group and rs1980057 near HHIP, and between the severe emphysema group and rs8034191 in the chromosome 15q region (p<0.001). All significant associations were replicated at p<0.05 in the validation sample (12/12 associations with clinical measures and 2/2 genetic associations).

Interpretation: Cluster analysis identifies four subgroups of smokers that show robust associations with clinical characteristics of COPD and known COPD-associated genetic variants.
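The workflow outlined in the Methods (split the sample, fit k-means on the training set across a range of k, then check the solution on the validation set) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not the authors' implementation: the data are synthetic two-dimensional points rather than COPDGene features, the function names (kmeans, inertia, run) are hypothetical, and the within-cluster sum of squares is used as a simple fit measure in place of whatever stability criterion the study applied.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; initial centroids are k randomly sampled points."""
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        new_centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def inertia(points, centroids):
    """Within-cluster sum of squares of `points` under the given centroids."""
    return sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)


def make_data(rng, n_per_group=100):
    """Synthetic stand-in data: four well-separated Gaussian groups (assumption)."""
    centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centers
        for _ in range(n_per_group)
    ]


def run(seed=0, k_values=(2, 3, 4, 5, 6)):
    """Fit on the training half for each k; score on both halves."""
    rng = random.Random(seed)
    data = make_data(rng)
    rng.shuffle(data)
    half = len(data) // 2
    train, valid = data[:half], data[half:]
    results = {}
    for k in k_values:
        # Keep the best of several random restarts, judged on the training set only.
        best = min(
            (kmeans(train, k, rng) for _ in range(10)),
            key=lambda c: inertia(train, c),
        )
        results[k] = (inertia(train, best), inertia(valid, best))
    return results


if __name__ == "__main__":
    for k, (train_ss, valid_ss) in run().items():
        print(f"k={k}: training SS={train_ss:.1f}, validation SS={valid_ss:.1f}")
```

Validation points are assigned to the centroids learned on the training half, so a solution that fits only the training data would show a much larger validation sum of squares than training sum of squares at the same k.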

Original language: English (US)
Pages (from-to): 415-422
Number of pages: 8
Journal: Thorax
Volume: 69
Issue number: 5
DOI: https://doi.org/10.1136/thoraxjnl-2013-203601
State: Published - May 2014

Fingerprint

Emphysema
Chronic Obstructive Pulmonary Disease
Cluster Analysis
Dyspnea
Chromosomes
Smoking
Genome

ASJC Scopus subject areas

  • Pulmonary and Respiratory Medicine

Cite this

Castaldi, P. J., Dy, J., Ross, J., Chang, Y., Washko, G. R., Curran-Everett, D., ... Cho, M. H. (2014). Cluster analysis in the COPDGene study identifies subtypes of smokers with distinct patterns of airway disease and emphysema. Thorax, 69(5), 415-422. https://doi.org/10.1136/thoraxjnl-2013-203601

@article{60e9d00d37f84f249a82fcfca98d099c,
title = "Cluster analysis in the COPDGene study identifies subtypes of smokers with distinct patterns of airway disease and emphysema",
author = "Castaldi, {Peter J.} and Jennifer Dy and James Ross and Yale Chang and Washko, {George R.} and Douglas Curran-Everett and Andre Williams and Lynch, {David A.} and Make, {Barry J.} and Crapo, {James D.} and Bowler, {Russ P.} and Regan, {Elizabeth A.} and Hokanson, {John E.} and Kinney, {Greg L.} and Han, {Meilan K.} and Xavier Soler and Ramsdell, {Joseph W.} and Barr, {R. Graham} and Marilyn Foreman and {Van Beek}, Edwin and Richard Casaburi and Criner, {Gerald J.} and Lutz, {Sharon M.} and Rennard, {Steven I.} and Stephanie Santorico and Sciurba, {Frank C.} and Demeo, {Dawn L.} and Hersh, {Craig P.} and Silverman, {Edwin K.} and Cho, {Michael H.}",
year = "2014",
month = "5",
doi = "10.1136/thoraxjnl-2013-203601",
language = "English (US)",
volume = "69",
pages = "415--422",
journal = "Thorax",
issn = "0040-6376",
publisher = "BMJ Publishing Group",
number = "5",

}

TY - JOUR

T1 - Cluster analysis in the COPDGene study identifies subtypes of smokers with distinct patterns of airway disease and emphysema

AU - Castaldi, Peter J.

AU - Dy, Jennifer

AU - Ross, James

AU - Chang, Yale

AU - Washko, George R.

AU - Curran-Everett, Douglas

AU - Williams, Andre

AU - Lynch, David A.

AU - Make, Barry J.

AU - Crapo, James D.

AU - Bowler, Russ P.

AU - Regan, Elizabeth A.

AU - Hokanson, John E.

AU - Kinney, Greg L.

AU - Han, Meilan K.

AU - Soler, Xavier

AU - Ramsdell, Joseph W.

AU - Barr, R. Graham

AU - Foreman, Marilyn

AU - Van Beek, Edwin

AU - Casaburi, Richard

AU - Criner, Gerald J.

AU - Lutz, Sharon M.

AU - Rennard, Steven I.

AU - Santorico, Stephanie

AU - Sciurba, Frank C.

AU - Demeo, Dawn L.

AU - Hersh, Craig P.

AU - Silverman, Edwin K.

AU - Cho, Michael H.

PY - 2014/5

Y1 - 2014/5

UR - http://www.scopus.com/inward/record.url?scp=84898902182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84898902182&partnerID=8YFLogxK

U2 - 10.1136/thoraxjnl-2013-203601

DO - 10.1136/thoraxjnl-2013-203601

M3 - Article

C2 - 24563194

AN - SCOPUS:84898902182

VL - 69

SP - 415

EP - 422

JO - Thorax

JF - Thorax

SN - 0040-6376

IS - 5

ER -