Characterization of soybean genomic features by analysis of its expressed sequence tags

Ai Guo Tian, Jun Wang, Peng Cui, Yu Jun Han, Hao Xu, Li Juan Cong, Xian Gang Huang, Xiao Ling Wang, Yong Zhi Jiao, Bang Jun Wang, Yong Jun Wang, Jin Song Zhang, Shou Yi Chen

Research output: Contribution to journal › Article

70 Citations (Scopus)

Abstract

We analyzed 314,254 soybean expressed sequence tags (ESTs), including 29,540 from our laboratory and 284,714 from GenBank. These ESTs were assembled into 56,147 unigenes. About 76.92% of the unigenes were homologous to genes from Arabidopsis thaliana (Arabidopsis). The putative products of these unigenes were annotated according to their homology with the categorized proteins of Arabidopsis. Genes corresponding to cell growth and/or maintenance, enzymes and cell communication belonged to the slow-evolving class, whereas genes related to transcription regulation, cell, binding and death appeared to be fast-evolving. Soybean unigenes with no match to genes within the Arabidopsis genome were identified as soybean-specific genes. These genes were mainly involved in nodule development and the synthesis of seed storage proteins. We also identified 61 genes regulated by salicylic acid, 1,322 transcription factor genes and 326 disease resistance-like genes among the soybean unigenes. Simple sequence repeat (SSR) analysis showed that the soybean genome was more complex than the Arabidopsis and the Medicago truncatula genomes. GC content in soybean unigene sequences was similar to that in Arabidopsis and M. truncatula. Furthermore, the combined analysis of the EST database and the BAC-contig sequences indicated that the total gene number in the soybean genome is about 63,501.
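The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the constant and function names are hypothetical, and the GC-content helper assumes a plain fraction of G and C among unambiguous bases, which may differ from the definition used in the paper.

```python
# Figures quoted in the abstract (names are hypothetical, chosen for this sketch).
LAB_ESTS = 29_540            # ESTs from the authors' laboratory
GENBANK_ESTS = 284_714       # ESTs taken from GenBank
UNIGENES = 56_147            # unigenes after assembly
HOMOLOGOUS_FRACTION = 0.7692 # share of unigenes homologous to Arabidopsis genes


def total_ests() -> int:
    """Sum of the two EST sources; should match the 314,254 in the abstract."""
    return LAB_ESTS + GENBANK_ESTS


def homologous_unigenes() -> int:
    """Approximate number of unigenes implied by the 76.92% figure."""
    return round(UNIGENES * HOMOLOGOUS_FRACTION)


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored. This is one common
    convention, not necessarily the one used in the paper.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


if __name__ == "__main__":
    print(total_ests())
    print(homologous_unigenes())
    print(gc_content("ATGCGC"))
```

The two EST sources sum to the stated total, and 76.92% of 56,147 corresponds to roughly 43,188 unigenes with an Arabidopsis homolog.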

Original language: English (US)
Pages (from-to): 903-913
Number of pages: 11
Journal: Theoretical and Applied Genetics
Volume: 108
Issue number: 5
DOI: https://doi.org/10.1007/s00122-003-1499-2
State: Published - Mar 1 2004

Fingerprint

Expressed Sequence Tags
Soybeans
unigenes
genomics
Arabidopsis
Genes
Medicago truncatula
Genome
Seed Storage Proteins
Arabidopsis Proteins
Disease Resistance
Salicylic Acid
Nucleic Acid Databases
Base Composition

ASJC Scopus subject areas

  • Biotechnology
  • Agronomy and Crop Science
  • Genetics

Cite this

Tian, A. G., Wang, J., Cui, P., Han, Y. J., Xu, H., Cong, L. J., ... Chen, S. Y. (2004). Characterization of soybean genomic features by analysis of its expressed sequence tags. Theoretical and Applied Genetics, 108(5), 903-913. https://doi.org/10.1007/s00122-003-1499-2

@article{92de6a8977084f32a8717534e97633ef,
title = "Characterization of soybean genomic features by analysis of its expressed sequence tags",
author = "Tian, {Ai Guo} and Jun Wang and Peng Cui and Han, {Yu Jun} and Hao Xu and Cong, {Li Juan} and Huang, {Xian Gang} and Wang, {Xiao Ling} and Jiao, {Yong Zhi} and Wang, {Bang Jun} and Wang, {Yong Jun} and Zhang, {Jin Song} and Chen, {Shou Yi}",
year = "2004",
month = "3",
day = "1",
doi = "10.1007/s00122-003-1499-2",
language = "English (US)",
volume = "108",
pages = "903--913",
journal = "Theoretical and Applied Genetics",
issn = "0040-5752",
publisher = "Springer Verlag",
number = "5",

}

TY - JOUR

T1 - Characterization of soybean genomic features by analysis of its expressed sequence tags

AU - Tian, Ai Guo

AU - Wang, Jun

AU - Cui, Peng

AU - Han, Yu Jun

AU - Xu, Hao

AU - Cong, Li Juan

AU - Huang, Xian Gang

AU - Wang, Xiao Ling

AU - Jiao, Yong Zhi

AU - Wang, Bang Jun

AU - Wang, Yong Jun

AU - Zhang, Jin Song

AU - Chen, Shou Yi

PY - 2004/3/1

Y1 - 2004/3/1

UR - http://www.scopus.com/inward/record.url?scp=1842685867&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1842685867&partnerID=8YFLogxK

U2 - 10.1007/s00122-003-1499-2

DO - 10.1007/s00122-003-1499-2

M3 - Article

C2 - 14624337

AN - SCOPUS:1842685867

VL - 108

SP - 903

EP - 913

JO - Theoretical and Applied Genetics

JF - Theoretical and Applied Genetics

SN - 0040-5752

IS - 5

ER -