Analysis of metabolomic PCA data using tree diagrams

Mark T. Werth, Steven Halouska, Matthew D. Shortridge, Bo Zhang, Robert Powers

Research output: Contribution to journal › Article

35 Citations (Scopus)

Abstract

Large amounts of data from high-throughput metabolomic experiments are commonly visualized using a principal component analysis (PCA) two-dimensional scores plot. The question of the similarity or difference between multiple metabolic states then becomes a question of the degree of overlap between their respective data point clusters in principal component (PC) scores space. A qualitative visual inspection of the clustering pattern in PCA scores plots is a common protocol. This article describes the application of tree diagrams and bootstrapping techniques for an improved quantitative analysis of metabolic PCA data clustering. Our PCAtoTree program creates a distance matrix with 100 bootstrap steps that describes the separation of all clusters in a metabolic data set. Using accepted phylogenetic software, the distance matrix resulting from the various metabolic states is organized into a phylogenetic-like tree format, where bootstrap values ≥50 indicate a statistically relevant branch separation. PCAtoTree analysis of two previously published data sets demonstrates the improved resolution of metabolic state differences using tree diagrams. In addition, for metabolomic studies of large numbers of different metabolic states, the tree format provides a better description of similarities and differences between each metabolic state. The approach is also tolerant of sample size variations between different metabolic states.
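The bootstrapped distance matrix described in the abstract can be illustrated with a minimal sketch. This is not the PCAtoTree source code. The abstract does not state the distance metric or the resampling scheme, so the sketch assumes Euclidean distances between cluster centroids in PC scores space and resampling with replacement inside each cluster at its original size. The function name `bootstrap_distance_matrices`, its parameters, and the synthetic scores are hypothetical. Tree building and bootstrap support values would come from separate phylogenetic software and are not shown.

```python
import math
import random
from collections import defaultdict


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def bootstrap_distance_matrices(scores, labels, n_boot=100, seed=0):
    """Return (names, matrices): one centroid distance matrix per bootstrap step.

    Assumption: Euclidean centroid distance; the abstract does not specify it.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for point, label in zip(scores, labels):
        groups[label].append(tuple(point))
    names = sorted(groups)
    matrices = []
    for _ in range(n_boot):
        # Resample each cluster with replacement, keeping its own sample size,
        # so clusters of different sizes are handled the same way.
        cents = [
            centroid(rng.choices(groups[name], k=len(groups[name])))
            for name in names
        ]
        matrices.append([[math.dist(a, b) for b in cents] for a in cents])
    return names, matrices


# Synthetic two-dimensional scores for three hypothetical metabolic states.
demo_rng = random.Random(1)
centers = {"A": (0.0, 0.0), "B": (5.0, 0.0), "C": (0.5, 0.2)}
sizes = {"A": 8, "B": 10, "C": 6}
scores, labels = [], []
for name, (cx, cy) in centers.items():
    for _ in range(sizes[name]):
        scores.append((demo_rng.gauss(cx, 0.5), demo_rng.gauss(cy, 0.5)))
        labels.append(name)

names, matrices = bootstrap_distance_matrices(scores, labels)
mean_ab = sum(m[0][1] for m in matrices) / len(matrices)
mean_ac = sum(m[0][2] for m in matrices) / len(matrices)
print(names, round(mean_ab, 2), round(mean_ac, 2))
```

In this synthetic example, states A and C are placed close together and B far away, so the averaged A-B distance should come out larger than the averaged A-C distance. Each of the 100 matrices would be one input replicate for a tree-building program.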

Original language: English (US)
Pages (from-to): 58-63
Number of pages: 6
Journal: Analytical Biochemistry
Volume: 399
Issue number: 1
DOI: https://doi.org/10.1016/j.ab.2009.12.022
State: Published - Apr 1 2010

Fingerprint

Metabolomics
Principal Component Analysis
Cluster Analysis
Sample Size
Software
Inspection
Throughput
Chemical analysis
Experiments
Datasets

Keywords

  • Bootstrap analysis
  • Metabolomics
  • NMR
  • Principal component analysis
  • Tree diagrams

ASJC Scopus subject areas

  • Biophysics
  • Biochemistry
  • Molecular Biology
  • Cell Biology

Cite this

Analysis of metabolomic PCA data using tree diagrams. / Werth, Mark T.; Halouska, Steven; Shortridge, Matthew D.; Zhang, Bo; Powers, Robert.

In: Analytical Biochemistry, Vol. 399, No. 1, 01.04.2010, p. 58-63.

Werth, MT, Halouska, S, Shortridge, MD, Zhang, B & Powers, R 2010, 'Analysis of metabolomic PCA data using tree diagrams', Analytical Biochemistry, vol. 399, no. 1, pp. 58-63. https://doi.org/10.1016/j.ab.2009.12.022
Werth, Mark T. ; Halouska, Steven ; Shortridge, Matthew D. ; Zhang, Bo ; Powers, Robert. / Analysis of metabolomic PCA data using tree diagrams. In: Analytical Biochemistry. 2010 ; Vol. 399, No. 1. pp. 58-63.
@article{f46ec3e3b0fa4cc8a5e12ad0a54b480d,
title = "Analysis of metabolomic PCA data using tree diagrams",
abstract = "Large amounts of data from high-throughput metabolomic experiments are commonly visualized using a principal component analysis (PCA) two-dimensional scores plot. The question of the similarity or difference between multiple metabolic states then becomes a question of the degree of overlap between their respective data point clusters in principal component (PC) scores space. A qualitative visual inspection of the clustering pattern in PCA scores plots is a common protocol. This article describes the application of tree diagrams and bootstrapping techniques for an improved quantitative analysis of metabolic PCA data clustering. Our PCAtoTree program creates a distance matrix with 100 bootstrap steps that describes the separation of all clusters in a metabolic data set. Using accepted phylogenetic software, the distance matrix resulting from the various metabolic states is organized into a phylogenetic-like tree format, where bootstrap values ≥50 indicate a statistically relevant branch separation. PCAtoTree analysis of two previously published data sets demonstrates the improved resolution of metabolic state differences using tree diagrams. In addition, for metabolomic studies of large numbers of different metabolic states, the tree format provides a better description of similarities and differences between each metabolic state. The approach is also tolerant of sample size variations between different metabolic states.",
keywords = "Bootstrap analysis, Metabolomics, NMR, Principal component analysis, Tree diagrams",
author = "Werth, {Mark T.} and Steven Halouska and Shortridge, {Matthew D.} and Bo Zhang and Robert Powers",
year = "2010",
month = "4",
day = "1",
doi = "10.1016/j.ab.2009.12.022",
language = "English (US)",
volume = "399",
pages = "58--63",
journal = "Analytical Biochemistry",
issn = "0003-2697",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Analysis of metabolomic PCA data using tree diagrams

AU - Werth, Mark T.

AU - Halouska, Steven

AU - Shortridge, Matthew D.

AU - Zhang, Bo

AU - Powers, Robert

PY - 2010/4/1

Y1 - 2010/4/1

N2 - Large amounts of data from high-throughput metabolomic experiments are commonly visualized using a principal component analysis (PCA) two-dimensional scores plot. The question of the similarity or difference between multiple metabolic states then becomes a question of the degree of overlap between their respective data point clusters in principal component (PC) scores space. A qualitative visual inspection of the clustering pattern in PCA scores plots is a common protocol. This article describes the application of tree diagrams and bootstrapping techniques for an improved quantitative analysis of metabolic PCA data clustering. Our PCAtoTree program creates a distance matrix with 100 bootstrap steps that describes the separation of all clusters in a metabolic data set. Using accepted phylogenetic software, the distance matrix resulting from the various metabolic states is organized into a phylogenetic-like tree format, where bootstrap values ≥50 indicate a statistically relevant branch separation. PCAtoTree analysis of two previously published data sets demonstrates the improved resolution of metabolic state differences using tree diagrams. In addition, for metabolomic studies of large numbers of different metabolic states, the tree format provides a better description of similarities and differences between each metabolic state. The approach is also tolerant of sample size variations between different metabolic states.

KW - Bootstrap analysis

KW - Metabolomics

KW - NMR

KW - Principal component analysis

KW - Tree diagrams

UR - http://www.scopus.com/inward/record.url?scp=77649180718&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77649180718&partnerID=8YFLogxK

U2 - 10.1016/j.ab.2009.12.022

DO - 10.1016/j.ab.2009.12.022

M3 - Article

VL - 399

SP - 58

EP - 63

JO - Analytical Biochemistry

JF - Analytical Biochemistry

SN - 0003-2697

IS - 1

ER -