Genome-wide discriminatory information patterns of cytosine DNA methylation

Robersy Sanchez, Sally Mackenzie

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes, IR, and (2) the uncertainty of not observing a SNP, LCR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment.

Original language: English (US)
Article number: 938
Journal: International Journal of Molecular Sciences
Volume: 17
Issue number: 6
DOIs: 10.3390/ijms17060938
State: Published - Jun 1 2016

Fingerprint

methylation
genome
Cytosine
DNA Methylation
deoxyribonucleic acid
Genes
Genome
Ecotype
Arabidopsis
machine learning
Epigenomics
Learning systems
Methylation
Chemical modification
Thermodynamics
Population
Uncertainty
Machinery
Single Nucleotide Polymorphism
machinery

Keywords

  • Epigenetics
  • Epigenomics
  • Information thermodynamics
  • Linear discriminant analysis
  • Machine learning

ASJC Scopus subject areas

  • Catalysis
  • Molecular Biology
  • Spectroscopy
  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Organic Chemistry
  • Inorganic Chemistry

Cite this

Genome-wide discriminatory information patterns of cytosine DNA methylation. / Sanchez, Robersy; Mackenzie, Sally.

In: International journal of molecular sciences, Vol. 17, No. 6, 938, 01.06.2016.

@article{9f90f85744044647b00a95588941f710,
title = "Genome-wide discriminatory information patterns of cytosine DNA methylation",
abstract = "Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes, IR, and (2) the uncertainty of not observing a SNP, LCR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment.",
keywords = "Epigenetics, Epigenomics, Information thermodynamics, Linear discriminant analysis, Machine learning",
author = "Robersy Sanchez and Sally Mackenzie",
year = "2016",
month = "6",
day = "1",
doi = "10.3390/ijms17060938",
language = "English (US)",
volume = "17",
journal = "International Journal of Molecular Sciences",
issn = "1661-6596",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "6",
pages = "938",
}

TY - JOUR
T1 - Genome-wide discriminatory information patterns of cytosine DNA methylation
AU - Sanchez, Robersy
AU - Mackenzie, Sally
PY - 2016/6/1
Y1 - 2016/6/1
AB - Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes, IR, and (2) the uncertainty of not observing a SNP, LCR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment.
KW - Epigenetics
KW - Epigenomics
KW - Information thermodynamics
KW - Linear discriminant analysis
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=84975110717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84975110717&partnerID=8YFLogxK
U2 - 10.3390/ijms17060938
DO - 10.3390/ijms17060938
M3 - Article
C2 - 27322251
AN - SCOPUS:84975110717
VL - 17
JO - International Journal of Molecular Sciences
JF - International Journal of Molecular Sciences
SN - 1661-6596
IS - 6
M1 - 938
ER -