Computational prediction of novel non-coding RNAs in Arabidopsis thaliana

Dandan Song, Yang Yang, Bin Yu, Binglian Zheng, Zhidong Deng, Bao Liang Lu, Xuemei Chen, Tao Jiang

Research output: Contribution to journal (Article)

33 Citations (Scopus)

Abstract

Background: Non-coding RNA (ncRNA) genes do not encode proteins but produce functional RNA molecules that play crucial roles in many key biological processes. Recent genome-wide transcriptional profiling studies using tiling arrays in organisms such as human and Arabidopsis have revealed a large number of transcripts, many of which have little or no protein-coding capacity. This unexpected finding suggests that the currently known repertoire of ncRNAs may represent only a small fraction of the ncRNAs in these organisms. Efficient and effective prediction of ncRNAs has therefore become an important task in bioinformatics in recent years. Among the available computational methods, the comparative genomic approach seems to be the most powerful for detecting ncRNAs. The recent completion of the sequencing of several major plant genomes has made this approach possible for plants.

Results: We have developed a pipeline to predict novel ncRNAs in the Arabidopsis (Arabidopsis thaliana) genome. It starts by comparing the expressed intergenic regions of Arabidopsis, as provided in two whole-genome high-density oligo-probe arrays from the literature, with the intergenic nucleotide sequences of all completely sequenced plant genomes, including rice (Oryza sativa), poplar (Populus trichocarpa), grape (Vitis vinifera), and papaya (Carica papaya). Using multiple sequence alignment, a popular ncRNA prediction program (RNAz), wet-bench experimental validation, protein-coding potential analysis, and stringent screening against various ncRNA databases, the pipeline yielded 16 families of novel ncRNAs (21 ncRNAs in total).

Conclusion: In this paper, we undertake a genome-wide search for novel ncRNAs in the Arabidopsis genome using a comparative genomics approach. The identified novel ncRNAs are evolutionarily conserved between Arabidopsis and other recently sequenced plants, and may perform interesting novel biological functions.
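The filtering stages named in the Results can be pictured as a chain of predicates applied to candidate regions. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the `Candidate` fields, the thresholds (0.5 for the structural score, 100 codons for the ORF limit), the stage order, and the forward-strand ORF scan used as a stand-in for protein-coding potential analysis are all hypothetical. No RNAz call is made; its score is represented by a precomputed number, and the wet-bench validation step has no computational counterpart here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """An expressed intergenic region; every field is illustrative."""
    name: str
    sequence: str           # DNA sequence of the region
    n_species_aligned: int  # other plant genomes with an aligned homolog
    rna_class_prob: float   # hypothetical structural-RNA score in [0, 1]
    in_known_db: bool       # True if it matches a known ncRNA database entry


def longest_orf_codons(seq: str) -> int:
    """Longest ATG-to-stop ORF (in codons, stop excluded), forward strand only."""
    stops = {"TAA", "TAG", "TGA"}
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in stops:
                best = max(best, (i - start) // 3)
                start = None
    return best


Filter = Callable[[Candidate], bool]

# Assumed stage order and thresholds; the abstract does not specify them.
FILTERS: List[Filter] = [
    lambda c: c.n_species_aligned >= 1,              # conserved in another plant
    lambda c: c.rna_class_prob >= 0.5,               # predicted structural RNA
    lambda c: longest_orf_codons(c.sequence) < 100,  # low coding potential
    lambda c: not c.in_known_db,                     # novel, not already annotated
]


def run_pipeline(cands: Iterable[Candidate], filters: List[Filter]) -> List[Candidate]:
    """Apply each filter in turn, keeping only candidates that pass all of them."""
    kept = list(cands)
    for keep in filters:
        kept = [c for c in kept if keep(c)]
    return kept


short_seq = "GGCTAGCTAGGATCC"                # contains no ATG, so no ORF
long_orf = "ATG" + "GCT" * 120 + "TAA"       # 121-codon ORF
demo = [
    Candidate("cand_a", short_seq, 2, 0.9, False),
    Candidate("cand_b", short_seq, 0, 0.9, False),  # not conserved
    Candidate("cand_c", short_seq, 2, 0.2, False),  # low structural score
    Candidate("cand_d", long_orf, 3, 0.9, False),   # likely protein-coding
    Candidate("cand_e", short_seq, 2, 0.9, True),   # already in a database
]
novel = run_pipeline(demo, FILTERS)
print([c.name for c in novel])
```

Keeping each stage as an independent predicate makes it straightforward to reorder stages or swap in a real scoring tool without changing the driver function.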

Original language: English (US)
Article number: S36
Journal: BMC Bioinformatics
Volume: 10
Issue number: SUPPL. 1
DOI: https://doi.org/10.1186/1471-2105-10-S1-S36
State: Published - Jan 30 2009

Fingerprint

Arabidopsis Thaliana
Untranslated RNA
RNA
Arabidopsis
Genome
Genes
Prediction
Carica
Plant Genome
Intergenic DNA
Vitis
Comparative Genomics
Protein
Proteins
Populus
Biological Phenomena
Rice (Oryza sativa)
Pipelines
Sequence Alignment
Nucleic Acid Databases

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Song, D., Yang, Y., Yu, B., Zheng, B., Deng, Z., Lu, B. L., ... Jiang, T. (2009). Computational prediction of novel non-coding RNAs in Arabidopsis thaliana. BMC bioinformatics, 10(SUPPL. 1), [S36]. https://doi.org/10.1186/1471-2105-10-S1-S36

Computational prediction of novel non-coding RNAs in Arabidopsis thaliana. / Song, Dandan; Yang, Yang; Yu, Bin; Zheng, Binglian; Deng, Zhidong; Lu, Bao Liang; Chen, Xuemei; Jiang, Tao.

In: BMC bioinformatics, Vol. 10, No. SUPPL. 1, S36, 30.01.2009.

Song, D, Yang, Y, Yu, B, Zheng, B, Deng, Z, Lu, BL, Chen, X & Jiang, T 2009, 'Computational prediction of novel non-coding RNAs in Arabidopsis thaliana', BMC bioinformatics, vol. 10, no. SUPPL. 1, S36. https://doi.org/10.1186/1471-2105-10-S1-S36
Song, Dandan ; Yang, Yang ; Yu, Bin ; Zheng, Binglian ; Deng, Zhidong ; Lu, Bao Liang ; Chen, Xuemei ; Jiang, Tao. / Computational prediction of novel non-coding RNAs in Arabidopsis thaliana. In: BMC bioinformatics. 2009 ; Vol. 10, No. SUPPL. 1.
@article{9a30ae697d674494b15e9fdf43beeebd,
title = "Computational prediction of novel non-coding RNAs in Arabidopsis thaliana",
author = "Dandan Song and Yang Yang and Bin Yu and Binglian Zheng and Zhidong Deng and Lu, {Bao Liang} and Xuemei Chen and Tao Jiang",
year = "2009",
month = "1",
day = "30",
doi = "10.1186/1471-2105-10-S1-S36",
language = "English (US)",
volume = "10",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL. 1",

}

TY - JOUR
T1 - Computational prediction of novel non-coding RNAs in Arabidopsis thaliana
AU - Song, Dandan
AU - Yang, Yang
AU - Yu, Bin
AU - Zheng, Binglian
AU - Deng, Zhidong
AU - Lu, Bao Liang
AU - Chen, Xuemei
AU - Jiang, Tao
PY - 2009/1/30
Y1 - 2009/1/30
UR - http://www.scopus.com/inward/record.url?scp=60949084179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=60949084179&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-10-S1-S36
DO - 10.1186/1471-2105-10-S1-S36
M3 - Article
VL - 10
JO - BMC Bioinformatics
JF - BMC Bioinformatics
SN - 1471-2105
IS - SUPPL. 1
M1 - S36
ER -