Comparing and optimizing transcriptome assembly pipeline for diploid wheat

Natasha Pavlovikj, Kevin Begcy, Sairam Behera, Malachy Campbell, Harkamal Walia, Jitender S. Deogun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Gene expression and transcriptome analysis are currently among the main research focuses of a great number of scientists. However, assembling raw sequence data into a draft transcriptome of an organism is a complex multi-stage process, usually composed of preprocessing, assembly, and postprocessing. Each of these stages includes multiple steps, such as data cleaning, contaminant removal, error correction, and assembly validation. Implementing all these steps requires extensive knowledge of different algorithms and of various bioinformatics tools and software. In this paper, we generate multiple transcriptome assembly pipelines by using different tools and approaches at each stage. Analyzing these pipelines, we observe that using the error correction method with Velvet Oases and merging the individual k-mer assemblies with the highest N50 produce the most stable base for further biological analysis of the transcriptome.
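The selection criterion named in the abstract, highest N50, can be illustrated with a minimal sketch. It assumes the standard definition of N50 (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length). The function names `n50` and `rank_by_n50` and the contig lengths are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50 of a list of contig lengths (standard definition)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def rank_by_n50(assemblies: Dict[int, List[int]]) -> List[int]:
    """Order k-mer sizes from highest to lowest N50 of their assembly."""
    return sorted(assemblies, key=lambda k: n50(assemblies[k]), reverse=True)


# Hypothetical contig lengths for three k-mer sizes (illustration only).
example_assemblies = {
    21: [100, 200, 300],
    25: [150, 400, 50],
    31: [500, 20, 30],
}
ranking = rank_by_n50(example_assemblies)
```

In this sketch, `ranking` lists the k-mer sizes whose assemblies would be considered first for merging. N50 alone says nothing about correctness, so it would normally be read alongside the assembly validation step the abstract mentions.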

Original language: English (US)
Title of host publication: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 603-604
Number of pages: 2
ISBN (Electronic): 9781450328944
DOI: https://doi.org/10.1145/2649387.2662450
State: Published - Sep 20 2014
Event: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States
Duration: Sep 20 2014 to Sep 23 2014

Fingerprint

Gene Expression Profiling
Diploidy
Transcriptome
Triticum
Pipelines
Error correction
Computational Biology
Software
Gene Expression
Bioinformatics
Merging
Research
Cleaning
Impurities

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Software
  • Biomedical Engineering

Cite this

Pavlovikj, N., Begcy, K., Behera, S., Campbell, M., Walia, H., & Deogun, J. S. (2014). Comparing and optimizing transcriptome assembly pipeline for diploid wheat. In ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 603-604). (ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics). Association for Computing Machinery, Inc. https://doi.org/10.1145/2649387.2662450

@inproceedings{f2e1fbacd30841b2b35cc1a9cf6eb0fb,
title = "Comparing and optimizing transcriptome assembly pipeline for diploid wheat",
abstract = "Gene expression and transcriptome analysis are currently one of the main focuses of research for a great number of scientists. However, the assembly of raw sequence data to obtain a draft transcriptome of an organism is a complex multi-stage process usually composed of preprocessing, assembling, and postprocessing. Each of these stages includes multiple steps such as data cleaning, contaminant removal, error correction and assembly validation. In order to implement all these steps, a great knowledge of different algorithms, various bioinformatics tools and software is required. In this paper, we generate multiple transcriptome assembly pipelines by using different tools and approaches in the process. Analyzing these pipelines, we can observe that using the error correction method with Velvet Oases and merging the individual k-mer assemblies with highest N50 produce the most stable base for further transcriptome biological analysis.",
author = "Natasha Pavlovikj and Kevin Begcy and Sairam Behera and Malachy Campbell and Harkamal Walia and Deogun, {Jitender S.}",
year = "2014",
month = "9",
day = "20",
doi = "10.1145/2649387.2662450",
language = "English (US)",
series = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",
publisher = "Association for Computing Machinery, Inc",
pages = "603--604",
booktitle = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",

}

TY - GEN

T1 - Comparing and optimizing transcriptome assembly pipeline for diploid wheat

AU - Pavlovikj, Natasha

AU - Begcy, Kevin

AU - Behera, Sairam

AU - Campbell, Malachy

AU - Walia, Harkamal

AU - Deogun, Jitender S.

PY - 2014/9/20

Y1 - 2014/9/20

N2 - Gene expression and transcriptome analysis are currently one of the main focuses of research for a great number of scientists. However, the assembly of raw sequence data to obtain a draft transcriptome of an organism is a complex multi-stage process usually composed of preprocessing, assembling, and postprocessing. Each of these stages includes multiple steps such as data cleaning, contaminant removal, error correction and assembly validation. In order to implement all these steps, a great knowledge of different algorithms, various bioinformatics tools and software is required. In this paper, we generate multiple transcriptome assembly pipelines by using different tools and approaches in the process. Analyzing these pipelines, we can observe that using the error correction method with Velvet Oases and merging the individual k-mer assemblies with highest N50 produce the most stable base for further transcriptome biological analysis.

UR - http://www.scopus.com/inward/record.url?scp=84920733670&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84920733670&partnerID=8YFLogxK

U2 - 10.1145/2649387.2662450

DO - 10.1145/2649387.2662450

M3 - Conference contribution

AN - SCOPUS:84920733670

T3 - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

SP - 603

EP - 604

BT - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

PB - Association for Computing Machinery, Inc

ER -