Next-generation transcriptome assembly and analysis: Impact of ploidy

Adam Voshall, Etsuko Moriyama

Research output: Contribution to journal › Article

Abstract

Whole genome duplications (WGD) occur widely in plants, but the effects of these events impact all branches of life. WGD events have major evolutionary impacts, often leading to major structural changes within the chromosomes and massive changes in gene expression that facilitate rapid speciation and gene diversification. Even for species that currently have diploid genomes, the impact of ancestral duplication events is still present in the genomes, especially in the context of highly similar gene families that are retained from WGD. However, the impact of ploidy on various bioinformatics workflows has not been well studied. In this review, we give an overview of the biological significance of polyploidy in different organisms. We describe the impact of having polyploid transcriptomes on bioinformatics analyses, especially focusing on transcriptome assembly and transcript quantification. We discuss the benefits of using simulated benchmarking data when examining the performance of various methods. We also present an example strategy to generate simulated allopolyploid transcriptomes and RNAseq datasets and show how these benchmark datasets can be used to assess the performance of transcript assembly and quantification methods. Our benchmarking study shows that all transcriptome assembly methods are affected by having polyploid genomes. Quantification accuracy is also impacted by polyploidy, depending on the method. These simulated datasets can be adapted for testing other analyses, such as read mapping, variant calling, and differential expression, under biologically realistic conditions.

Original language: English (US)
Journal: Methods
DOI: 10.1016/j.ymeth.2019.06.001
State: Published - Jan 1 2019

Fingerprint

Ploidies
Gene Expression Profiling
Polyploidy
Genes
Genome
Transcriptome
Benchmarking
Computational Biology
Bioinformatics
Workflow
Diploidy
Chromosomes
Gene expression
Datasets
Testing

Keywords

  • Polyploidy
  • RNAseq
  • Simulation
  • Transcript quantification
  • Transcriptome assembly
  • Whole genome duplication

ASJC Scopus subject areas

  • Molecular Biology
  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

Next-generation transcriptome assembly and analysis: Impact of ploidy. / Voshall, Adam; Moriyama, Etsuko.

In: Methods, 01.01.2019.

@article{bb6bb40949dc48a791759c1a2fd1d0a6,
title = "Next-generation transcriptome assembly and analysis: Impact of ploidy",
abstract = "Whole genome duplications (WGD) occur widely in plants, but the effects of these events impact all branches of life. WGD events have major evolutionary impacts, often leading to major structural changes within the chromosomes and massive changes in gene expression that facilitate rapid speciation and gene diversification. Even for species that currently have diploid genomes, the impact of ancestral duplication events is still present in the genomes, especially in the context of highly similar gene families that are retained from WGD. However, the impact of ploidy on various bioinformatics workflows has not been well studied. In this review, we give an overview of the biological significance of polyploidy in different organisms. We describe the impact of having polyploid transcriptomes on bioinformatics analyses, especially focusing on transcriptome assembly and transcript quantification. We discuss the benefits of using simulated benchmarking data when examining the performance of various methods. We also present an example strategy to generate simulated allopolyploid transcriptomes and RNAseq datasets and show how these benchmark datasets can be used to assess the performance of transcript assembly and quantification methods. Our benchmarking study shows that all transcriptome assembly methods are affected by having polyploid genomes. Quantification accuracy is also impacted by polyploidy, depending on the method. These simulated datasets can be adapted for testing other analyses, such as read mapping, variant calling, and differential expression, under biologically realistic conditions.",
keywords = "Polyploidy, RNAseq, Simulation, Transcript quantification, Transcriptome assembly, Whole genome duplication",
author = "Adam Voshall and Etsuko Moriyama",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.ymeth.2019.06.001",
language = "English (US)",
journal = "Methods",
issn = "1046-2023",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Next-generation transcriptome assembly and analysis

T2 - Impact of ploidy

AU - Voshall, Adam

AU - Moriyama, Etsuko

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Whole genome duplications (WGD) occur widely in plants, but the effects of these events impact all branches of life. WGD events have major evolutionary impacts, often leading to major structural changes within the chromosomes and massive changes in gene expression that facilitate rapid speciation and gene diversification. Even for species that currently have diploid genomes, the impact of ancestral duplication events is still present in the genomes, especially in the context of highly similar gene families that are retained from WGD. However, the impact of ploidy on various bioinformatics workflows has not been well studied. In this review, we give an overview of the biological significance of polyploidy in different organisms. We describe the impact of having polyploid transcriptomes on bioinformatics analyses, especially focusing on transcriptome assembly and transcript quantification. We discuss the benefits of using simulated benchmarking data when examining the performance of various methods. We also present an example strategy to generate simulated allopolyploid transcriptomes and RNAseq datasets and show how these benchmark datasets can be used to assess the performance of transcript assembly and quantification methods. Our benchmarking study shows that all transcriptome assembly methods are affected by having polyploid genomes. Quantification accuracy is also impacted by polyploidy, depending on the method. These simulated datasets can be adapted for testing other analyses, such as read mapping, variant calling, and differential expression, under biologically realistic conditions.

KW - Polyploidy

KW - RNAseq

KW - Simulation

KW - Transcript quantification

KW - Transcriptome assembly

KW - Whole genome duplication

UR - http://www.scopus.com/inward/record.url?scp=85067944391&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067944391&partnerID=8YFLogxK

U2 - 10.1016/j.ymeth.2019.06.001

DO - 10.1016/j.ymeth.2019.06.001

M3 - Article

JO - Methods

JF - Methods

SN - 1046-2023

ER -