ChimeRScope: A novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data

You Li, Tayla B. Heavican, Neetha N. Vellichirammal, Javeed Iqbal, Chittibabu Guda

Research output: Contribution to journal (Article)

11 Citations (Scopus)

Abstract

RNA-Seq technology has revolutionized transcriptome characterization, not only by accurately quantifying gene expression but also by identifying novel transcripts such as chimeric fusion transcripts. These 'fusion' or 'chimeric' transcripts have improved the diagnosis and prognosis of several tumors and have led to the development of novel therapeutic regimens. Fusion transcript detection is currently accomplished by several software packages that rely primarily on sequence alignment algorithms. Aligning sequencing reads from fusion transcript loci in cancer genomes can be highly challenging because genomic alterations induce incorrect mapping, which limits the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope, that accurately predicts fusion transcripts based on the gene fingerprint (k-mer) profiles of RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets, followed by experimental validation, demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of read length and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as standalone software at https://github.com/ChimeRScope/ChimeRScope/wiki or via the Galaxy web interface at https://galaxy.unmc.edu/.
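The abstract gives only a one-line description of the approach, so the following is a minimal sketch of the general idea of gene-specific k-mer "fingerprints", not the ChimeRScope algorithm itself. It assumes a fingerprint is a k-mer found in exactly one gene, assigns each read to the gene with the most fingerprint hits, and counts read pairs whose mates land on different genes. All function names, the toy sequences, and the value of `k` are hypothetical, and mate orientation (reverse complements) is ignored for brevity.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_fingerprints(genes, k):
    """Map each k-mer that occurs in exactly one gene to that gene."""
    owners = defaultdict(set)
    for name, seq in genes.items():
        for km in kmers(seq, k):
            owners[km].add(name)
    # Keep only k-mers owned by a single gene (assumed fingerprint definition).
    return {km: next(iter(g)) for km, g in owners.items() if len(g) == 1}


def assign_read(read, fingerprints, k):
    """Return the gene with the most fingerprint hits in read, or None."""
    hits = Counter(fingerprints[km] for km in kmers(read, k) if km in fingerprints)
    return hits.most_common(1)[0][0] if hits else None


def discordant_pairs(pairs, fingerprints, k):
    """Count read pairs whose two mates are assigned to different genes."""
    counts = Counter()
    for r1, r2 in pairs:
        g1 = assign_read(r1, fingerprints, k)
        g2 = assign_read(r2, fingerprints, k)
        if g1 and g2 and g1 != g2:
            counts[tuple(sorted((g1, g2)))] += 1
    return counts


if __name__ == "__main__":
    # Toy reference "transcripts" and read pairs, invented for illustration.
    genes = {"GENE_A": "ACGTACGTTAGC", "GENE_B": "TTGGCCAATTGG"}
    fingerprints = build_fingerprints(genes, k=4)
    pairs = [("ACGTACGT", "GCCAATTG"),  # mates hit different genes
             ("ACGTACGT", "CGTTAGC")]   # both mates hit GENE_A
    print(discordant_pairs(pairs, fingerprints, k=4))
```

A real tool would also need to handle sequencing errors, paralogous genes, and statistical filtering of candidate pairs; none of that is attempted here.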

Original language: English (US)
Article number: e120
Journal: Nucleic Acids Research
Volume: 45
Issue number: 13
DOI: 10.1093/nar/gkx315
State: Published - Jul 1 2017

ASJC Scopus subject areas

  • Genetics

Cite this

ChimeRScope: A novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data. / Li, You; Heavican, Tayla B.; Vellichirammal, Neetha N.; Iqbal, Javeed; Guda, Chittibabu.

In: Nucleic acids research, Vol. 45, No. 13, e120, 01.07.2017.

@article{ea6580914cf5442c9dbf8e4db460f053,
title = "{ChimeRScope}: A novel alignment-free algorithm for fusion transcript prediction using paired-end {RNA-Seq} data",
abstract = "RNA-Seq technology has revolutionized transcriptome characterization, not only by accurately quantifying gene expression but also by identifying novel transcripts such as chimeric fusion transcripts. These 'fusion' or 'chimeric' transcripts have improved the diagnosis and prognosis of several tumors and have led to the development of novel therapeutic regimens. Fusion transcript detection is currently accomplished by several software packages that rely primarily on sequence alignment algorithms. Aligning sequencing reads from fusion transcript loci in cancer genomes can be highly challenging because genomic alterations induce incorrect mapping, which limits the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope, that accurately predicts fusion transcripts based on the gene fingerprint (k-mer) profiles of RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets, followed by experimental validation, demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of read length and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as standalone software at https://github.com/ChimeRScope/ChimeRScope/wiki or via the Galaxy web interface at https://galaxy.unmc.edu/.",
author = "Li, You and Heavican, {Tayla B.} and Vellichirammal, {Neetha N.} and Iqbal, Javeed and Guda, Chittibabu",
year = "2017",
month = "7",
day = "1",
doi = "10.1093/nar/gkx315",
language = "English (US)",
volume = "45",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "13",
pages = "e120",
}

TY - JOUR

T1 - ChimeRScope: A novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data

AU - Li, You

AU - Heavican, Tayla B.

AU - Vellichirammal, Neetha N.

AU - Iqbal, Javeed

AU - Guda, Chittibabu

PY - 2017/7/1

Y1 - 2017/7/1

AB - RNA-Seq technology has revolutionized transcriptome characterization, not only by accurately quantifying gene expression but also by identifying novel transcripts such as chimeric fusion transcripts. These 'fusion' or 'chimeric' transcripts have improved the diagnosis and prognosis of several tumors and have led to the development of novel therapeutic regimens. Fusion transcript detection is currently accomplished by several software packages that rely primarily on sequence alignment algorithms. Aligning sequencing reads from fusion transcript loci in cancer genomes can be highly challenging because genomic alterations induce incorrect mapping, which limits the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope, that accurately predicts fusion transcripts based on the gene fingerprint (k-mer) profiles of RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets, followed by experimental validation, demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of read length and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as standalone software at https://github.com/ChimeRScope/ChimeRScope/wiki or via the Galaxy web interface at https://galaxy.unmc.edu/.

UR - http://www.scopus.com/inward/record.url?scp=85026439390&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85026439390&partnerID=8YFLogxK

U2 - 10.1093/nar/gkx315

DO - 10.1093/nar/gkx315

M3 - Article

C2 - 28472320

AN - SCOPUS:85026439390

VL - 45

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 13

M1 - e120

ER -