Negative impact of noise on the principal component analysis of NMR data

Steven Halouska, Robert Powers

Research output: Contribution to journal › Article

54 Citations (Scopus)

Abstract

Principal component analysis (PCA) is routinely applied to the study of NMR-based metabolomic data. PCA is used to simplify the examination of complex metabolite mixtures obtained from biological samples that may be composed of hundreds or thousands of chemical components. PCA is primarily used to identify relative changes in the concentration of metabolites, revealing trends or characteristics within the NMR data that permit discrimination between samples that differ in their source or treatment. A common concern with PCA of NMR data is the potential overemphasis of small changes in high-concentration metabolites, which would overshadow significant and large changes in low-concentration components and may lead to a skewed or irrelevant clustering of the NMR data. We have identified an additional concern: very small and random fluctuations within the noise of the NMR spectrum can also result in large and irrelevant variations in the PCA clustering. This problem is alleviated by simply excluding the noise region from the PCA through a judicious choice of a threshold above the spectral noise.
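
The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic spectra, the peak positions, the choice of bins 40-49 as a signal-free reference region, and the cutoff of five noise standard deviations are all assumptions made for illustration. PCA itself is not run here. The sketch only measures what share of the total variance (the quantity PCA decomposes) comes from noise-only bins, before and after bins that never rise above the cutoff are set to zero.

```python
import random
import statistics

N_BINS = 50
PEAKS = {10: 100.0, 30: 40.0}  # hypothetical bin index -> peak intensity
NOISE_ONLY = [j for j in range(N_BINS) if j not in PEAKS]


def make_spectrum(rng, noise_sd=1.0):
    """Build one synthetic binned spectrum: fixed peaks plus Gaussian noise."""
    return [PEAKS.get(j, 0.0) + rng.gauss(0.0, noise_sd) for j in range(N_BINS)]


def estimate_noise_sd(spectra, reference_bins):
    """Estimate the noise level from bins assumed to contain no signal."""
    values = [row[j] for row in spectra for j in reference_bins]
    return statistics.pstdev(values)


def zero_noise_bins(spectra, cutoff):
    """Zero every bin that never exceeds the cutoff in any spectrum."""
    keep = [any(v > cutoff for v in column) for column in zip(*spectra)]
    return [[v if keep[j] else 0.0 for j, v in enumerate(row)] for row in spectra]


def noise_variance_share(spectra):
    """Fraction of the total per-bin variance that sits in noise-only bins."""
    variances = [statistics.pvariance(column) for column in zip(*spectra)]
    return sum(variances[j] for j in NOISE_ONLY) / sum(variances)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
spectra = [make_spectrum(rng) for _ in range(6)]  # six replicate samples

sigma = estimate_noise_sd(spectra, range(40, 50))  # assumed signal-free region
filtered = zero_noise_bins(spectra, cutoff=5.0 * sigma)  # assumed 5-sigma cutoff

share_before = noise_variance_share(spectra)
share_after = noise_variance_share(filtered)
print(f"noise share of variance before thresholding: {share_before:.2f}")
print(f"noise share of variance after thresholding:  {share_after:.2f}")
```

Because the six synthetic spectra are replicates of the same sample, nearly all of their variance before thresholding comes from the many noise-only bins, which is the situation in which PCA would scatter samples that ought to cluster together. Zeroing a bin in every spectrum at once, rather than spectrum by spectrum, is a design choice of this sketch that keeps the filtered bins from contributing any variance at all.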

Original language: English (US)
Pages (from-to): 88-95
Number of pages: 8
Journal: Journal of Magnetic Resonance
Volume: 178
Issue number: 1
DOI: 10.1016/j.jmr.2005.08.016
State: Published - Jan 1 2006

Fingerprint

Principal component analysis
Noise
Nuclear magnetic resonance
Metabolites
Cluster Analysis
Metabolomics
white noise
Complex Mixtures
discrimination
low concentrations
examination
trends
thresholds

Keywords

  • Impact of noise
  • Metabolomics
  • NMR
  • Principal component analysis

ASJC Scopus subject areas

  • Biophysics
  • Biochemistry
  • Nuclear and High Energy Physics
  • Condensed Matter Physics

Cite this

Negative impact of noise on the principal component analysis of NMR data. / Halouska, Steven; Powers, Robert.

In: Journal of Magnetic Resonance, Vol. 178, No. 1, 01.01.2006, p. 88-95.

@article{2a0331e58029485ea3a30a22d51fb660,
title = "Negative impact of noise on the principal component analysis of NMR data",
abstract = "Principal component analysis (PCA) is routinely applied to the study of NMR-based metabolomic data. PCA is used to simplify the examination of complex metabolite mixtures obtained from biological samples that may be composed of hundreds or thousands of chemical components. PCA is primarily used to identify relative changes in the concentration of metabolites, revealing trends or characteristics within the NMR data that permit discrimination between samples that differ in their source or treatment. A common concern with PCA of NMR data is the potential overemphasis of small changes in high-concentration metabolites, which would overshadow significant and large changes in low-concentration components and may lead to a skewed or irrelevant clustering of the NMR data. We have identified an additional concern: very small and random fluctuations within the noise of the NMR spectrum can also result in large and irrelevant variations in the PCA clustering. This problem is alleviated by simply excluding the noise region from the PCA through a judicious choice of a threshold above the spectral noise.",
keywords = "Impact of noise, Metabolomics, NMR, Principal component analysis",
author = "Steven Halouska and Robert Powers",
year = "2006",
month = "1",
day = "1",
doi = "10.1016/j.jmr.2005.08.016",
language = "English (US)",
volume = "178",
pages = "88--95",
journal = "Journal of Magnetic Resonance",
issn = "1090-7807",
publisher = "Academic Press Inc.",
number = "1",
}

TY - JOUR

T1 - Negative impact of noise on the principal component analysis of NMR data

AU - Halouska, Steven

AU - Powers, Robert

PY - 2006/1/1

Y1 - 2006/1/1

AB - Principal component analysis (PCA) is routinely applied to the study of NMR-based metabolomic data. PCA is used to simplify the examination of complex metabolite mixtures obtained from biological samples that may be composed of hundreds or thousands of chemical components. PCA is primarily used to identify relative changes in the concentration of metabolites, revealing trends or characteristics within the NMR data that permit discrimination between samples that differ in their source or treatment. A common concern with PCA of NMR data is the potential overemphasis of small changes in high-concentration metabolites, which would overshadow significant and large changes in low-concentration components and may lead to a skewed or irrelevant clustering of the NMR data. We have identified an additional concern: very small and random fluctuations within the noise of the NMR spectrum can also result in large and irrelevant variations in the PCA clustering. This problem is alleviated by simply excluding the noise region from the PCA through a judicious choice of a threshold above the spectral noise.

KW - Impact of noise

KW - Metabolomics

KW - NMR

KW - Principal component analysis

UR - http://www.scopus.com/inward/record.url?scp=28844433017&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=28844433017&partnerID=8YFLogxK

U2 - 10.1016/j.jmr.2005.08.016

DO - 10.1016/j.jmr.2005.08.016

M3 - Article

C2 - 16198132

AN - SCOPUS:28844433017

VL - 178

SP - 88

EP - 95

JO - Journal of Magnetic Resonance

JF - Journal of Magnetic Resonance

SN - 1090-7807

IS - 1

ER -