An overview of the computational analyses and discovery of transcription factor binding sites.

Istvan Ladunga

Research output: Contribution to journal, Review article

9 Citations (Scopus)

Abstract

Here we provide a pragmatic, high-level overview of the computational approaches and tools for the discovery of transcription factor binding sites. Unraveling transcription regulatory networks and their malfunctions such as cancer became feasible due to recent stellar progress in experimental techniques and computational analyses. While predictions of isolated sites still pose notorious challenges, cis-regulatory modules (clusters) of binding sites can now be identified with high accuracy. Further support comes from conserved DNA segments, co-regulation, transposable elements, nucleosomes, and three-dimensional chromosomal structures. We introduce computational tools for the analysis and interpretation of chromatin immunoprecipitation, next-generation sequencing, SELEX, and protein-binding microarray results. Because immunoprecipitation produces overly large DNA segments and well over half of the sequencing reads constitute background noise, methods are presented for background correction, sequence read mapping, peak calling, false discovery rate estimation, and co-localization analyses. To discover short binding site motifs from extensive immunoprecipitation segments, we recommend algorithms and software based on expectation maximization and Gibbs sampling. Data integration using several databases further improves performance. Binding sites can be visualized in genomic and chromatin context using genome browsers. Binding site information, integrated with co-expression in large compendia of gene expression experiments, allows us to reveal complex transcriptional regulatory networks.
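The expectation-maximization approach to motif discovery mentioned in the abstract can be illustrated with a minimal sketch. It is not the software recommended in the review. It assumes exactly one binding site per sequence, a uniform background, and a user-supplied seed k-mer; the names em_motif, consensus, and SEQS, the seed weights, and the toy sequences are all hypothetical choices made for illustration.

```python
import math

BASES = "ACGT"

# Toy "immunoprecipitated" segments, each with one planted copy of TGACTCA
# (hypothetical data, not taken from the review).
SEQS = [
    "ACGTTGACTCAGGCTA",
    "GGCATTGACTCATCCG",
    "TGACTCAAACCGGTAC",
    "CCGTAGCATGACTCAG",
    "ATTGACTCACGGATCG",
]


def em_motif(seqs, width, seed, iters=20, pseudo=0.1):
    """Refine a position weight matrix by EM, assuming one site per sequence."""
    if len(seed) != width:
        raise ValueError("seed length must equal motif width")
    # Initial matrix: the seed base gets 0.7, every other base 0.1 (an assumption).
    pwm = [{b: (0.7 if b == c else 0.1) for b in BASES} for c in seed]
    background = 0.25  # uniform background frequency (an assumption)
    for _ in range(iters):
        # Start each column from pseudocounts so no probability becomes zero.
        counts = [{b: pseudo for b in BASES} for _ in range(width)]
        for s in seqs:
            # E-step: likelihood ratio of every window under motif vs. background.
            weights = [
                math.prod(pwm[j][s[i + j]] / background for j in range(width))
                for i in range(len(s) - width + 1)
            ]
            total = sum(weights)
            # Accumulate expected base counts, weighted by window posterior.
            for i, w in enumerate(weights):
                for j in range(width):
                    counts[j][s[i + j]] += w / total
        # M-step: renormalize expected counts into column probabilities.
        pwm = [{b: col[b] / sum(col.values()) for b in BASES} for col in counts]
    return pwm


def consensus(pwm):
    """Most probable base in each column of the matrix."""
    return "".join(max(col, key=col.get) for col in pwm)


if __name__ == "__main__":
    # Seed differs from the planted motif at one position.
    print(consensus(em_motif(SEQS, 7, "TGACTAA")))
```

In this sketch the seed only needs to be close to the true motif; the E-step concentrates posterior weight on the best-matching window in each sequence, and the M-step re-estimates the matrix from those windows. Gibbs sampling follows the same alternation but samples one window per sequence instead of averaging over all of them.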

Original language: English (US)
Pages (from-to): 1-22
Number of pages: 22
Journal: Methods in molecular biology (Clifton, N.J.)
Volume: 674
State: Published - 2010

Fingerprint

Transcription Factors
Binding Sites
Immunoprecipitation
Protein Array Analysis
DNA Transposable Elements
Nucleosomes
Gene Regulatory Networks
Chromatin Immunoprecipitation
DNA
Protein Binding
Chromatin
Noise
Software
Genome
Databases
Gene Expression
Neoplasms

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics

Cite this

An overview of the computational analyses and discovery of transcription factor binding sites. / Ladunga, Istvan.

In: Methods in molecular biology (Clifton, N.J.), Vol. 674, 2010, p. 1-22.


@article{64a91cb32e6f40edb2f6d6dc43ced577,
title = "An overview of the computational analyses and discovery of transcription factor binding sites.",
abstract = "Here we provide a pragmatic, high-level overview of the computational approaches and tools for the discovery of transcription factor binding sites. Unraveling transcription regulatory networks and their malfunctions such as cancer became feasible due to recent stellar progress in experimental techniques and computational analyses. While predictions of isolated sites still pose notorious challenges, cis-regulatory modules (clusters) of binding sites can now be identified with high accuracy. Further support comes from conserved DNA segments, co-regulation, transposable elements, nucleosomes, and three-dimensional chromosomal structures. We introduce computational tools for the analysis and interpretation of chromatin immunoprecipitation, next-generation sequencing, SELEX, and protein-binding microarray results. Because immunoprecipitation produces overly large DNA segments and well over half of the sequencing reads constitute background noise, methods are presented for background correction, sequence read mapping, peak calling, false discovery rate estimation, and co-localization analyses. To discover short binding site motifs from extensive immunoprecipitation segments, we recommend algorithms and software based on expectation maximization and Gibbs sampling. Data integration using several databases further improves performance. Binding sites can be visualized in genomic and chromatin context using genome browsers. Binding site information, integrated with co-expression in large compendia of gene expression experiments, allows us to reveal complex transcriptional regulatory networks.",
author = "Istvan Ladunga",
year = "2010",
language = "English (US)",
volume = "674",
pages = "1--22",
journal = "Methods in molecular biology (Clifton, N.J.)",
issn = "1064-3745",
publisher = "Humana Press",

}

TY - JOUR

T1 - An overview of the computational analyses and discovery of transcription factor binding sites.

AU - Ladunga, Istvan

PY - 2010

Y1 - 2010

N2 - Here we provide a pragmatic, high-level overview of the computational approaches and tools for the discovery of transcription factor binding sites. Unraveling transcription regulatory networks and their malfunctions such as cancer became feasible due to recent stellar progress in experimental techniques and computational analyses. While predictions of isolated sites still pose notorious challenges, cis-regulatory modules (clusters) of binding sites can now be identified with high accuracy. Further support comes from conserved DNA segments, co-regulation, transposable elements, nucleosomes, and three-dimensional chromosomal structures. We introduce computational tools for the analysis and interpretation of chromatin immunoprecipitation, next-generation sequencing, SELEX, and protein-binding microarray results. Because immunoprecipitation produces overly large DNA segments and well over half of the sequencing reads constitute background noise, methods are presented for background correction, sequence read mapping, peak calling, false discovery rate estimation, and co-localization analyses. To discover short binding site motifs from extensive immunoprecipitation segments, we recommend algorithms and software based on expectation maximization and Gibbs sampling. Data integration using several databases further improves performance. Binding sites can be visualized in genomic and chromatin context using genome browsers. Binding site information, integrated with co-expression in large compendia of gene expression experiments, allows us to reveal complex transcriptional regulatory networks.


UR - http://www.scopus.com/inward/record.url?scp=79952110277&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952110277&partnerID=8YFLogxK

M3 - Review article

C2 - 20827582

AN - SCOPUS:79952110277

VL - 674

SP - 1

EP - 22

JO - Methods in molecular biology (Clifton, N.J.)

JF - Methods in molecular biology (Clifton, N.J.)

SN - 1064-3745

ER -