Protein family classification with discriminant function analysis

Etsuko Moriyama, Junhyong Kim

Research output: Chapter in Book/Report/Conference proceeding › Chapter

11 Citations (Scopus)

Abstract

Rapid progress in multiple genome projects continues to feed databases worldwide with a large volume of sequence data. In this "post-genomic" era, more efficient and reliable sequence annotation, especially functional annotation of protein sequences, is crucial. Although experimental confirmation is ultimately required, computational annotation of protein sequences is routinely performed and is incorporated into major protein databases (e.g., SWISS-PROT: http://www.expasy.org/sprot/, PIR-PSD: http://pir.georgetown.edu/pirwww/search/textpsd.shtml). Because of the rapidly growing number of new sequences, an increasing number of database entries contain only computational annotations. In this paper, we first discuss the disadvantages commonly found in existing protein classification methods. Next, we introduce a set of new methods that can classify protein families sharing very weak similarity. Finally, we describe an algorithm that combines the strengths of various protein classification methods to obtain optimal power for protein classification.

Original language: English (US)
Title of host publication: Genome Exploitation
Subtitle of host publication: Data Mining the Genome
Publisher: Springer US
Pages: 121-132
Number of pages: 12
ISBN (Print): 038724123X, 9780387241234
DOI: https://doi.org/10.1007/0-387-24187-6_9
State: Published - Dec 1 2005

Fingerprint

Proteins
Genes

ASJC Scopus subject areas

  • Materials Science(all)
  • Chemistry(all)

Cite this

Moriyama, E., & Kim, J. (2005). Protein family classification with discriminant function analysis. In Genome Exploitation: Data Mining the Genome (pp. 121-132). Springer US. https://doi.org/10.1007/0-387-24187-6_9
