A graph-theoretic modeling on GO space for biological interpretation of gene clusters

Sung Geun Lee, Jung Uk Hur, Yang Seok Kim

Research output: Contribution to journal › Article

66 Citations (Scopus)

Abstract

Motivation: With the advent of DNA microarray technologies, the parallel quantification of genome-wide transcriptions has been a great opportunity to systematically understand the complicated biological phenomena. Amidst the enthusiastic investigations into the intricate gene expression data, clustering methods have been the useful tools to uncover the meaningful patterns hidden in those data. The mathematical techniques, however, entirely based on the numerical expression data, do not show biologically relevant information on the clustering results.

Results: We present a novel methodology for biological interpretation of gene clusters. Our graph theoretic algorithm extracts common biological attributes of the genes within a cluster or a group of interest through the modified structure of gene ontology (GO) called GO tree. After genes are annotated with GO terms, the hierarchical nature of GO terms is used to find the representative biological meanings of the gene clusters. In addition, the biological significance of gene clusters can be assessed quantitatively by defining a distance function on the GO tree. Our approach has a complementary meaning to many statistical clustering techniques; we can see clustering problems from a different viewpoint by use of biological ontology. We applied this algorithm to the well-known data set and successfully obtained the biological features of the gene clusters with the quantitative biological assessment of clustering quality through GO Biological Process.
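The idea of reading a cluster's meaning off the GO hierarchy can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the GO tree is built or how the distance function is defined, so the toy tree, the term names, the function names, and the edge-count distance below are all assumptions made for illustration. The sketch treats the hierarchy as a tree in which each term has a single parent, takes the deepest common ancestor of a cluster's terms as its representative term, and measures distance between two terms as the number of edges on the path through their common ancestor.

```python
from functools import reduce

# Hypothetical toy GO tree (child -> parent). Real GO terms and the paper's
# modified "GO tree" are not reproduced here.
PARENT = {
    "biological_process": None,
    "metabolism": "biological_process",
    "cell_cycle": "biological_process",
    "dna_metabolism": "metabolism",
    "dna_replication": "dna_metabolism",
    "dna_repair": "dna_metabolism",
    "mitosis": "cell_cycle",
}


def path_to_root(term):
    """Return the list of terms from `term` up to the root, inclusive."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def common_ancestor(a, b):
    """Return the deepest term that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for term in path_to_root(b):  # walks upward, so the first hit is the deepest
        if term in ancestors_a:
            return term
    raise ValueError("terms do not share a root")


def representative_term(terms):
    """Deepest term covering every annotation in a cluster (assumed definition)."""
    return reduce(common_ancestor, terms)


def tree_distance(a, b):
    """Edge count between a and b via their common ancestor (assumed metric)."""
    lca = common_ancestor(a, b)
    return path_to_root(a).index(lca) + path_to_root(b).index(lca)


cluster = ["dna_replication", "dna_repair"]
print(representative_term(cluster))
print(tree_distance("dna_replication", "mitosis"))
```

Under these assumptions, a cluster whose terms sit close together in the tree gets a specific representative term and small pairwise distances, while a cluster mixing unrelated terms collapses to a term near the root.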

Original language: English (US)
Pages (from-to): 381-388
Number of pages: 8
Journal: Bioinformatics
Volume: 20
Issue number: 3
DOIs: 10.1093/bioinformatics/btg420
State: Published - Feb 12 2004

Fingerprint

Gene Ontology
Multigene Family
Ontology
Genes
Cluster Analysis
Gene
Graph in graph theory
Modeling
Clustering
Biological Phenomena
Biological Ontologies
Public Opinion
DNA Microarray
Data Clustering
Term
Distance Function
Gene Expression Data
Oligonucleotide Array Sequence Analysis
Clustering Methods
Quantification

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

A graph-theoretic modeling on GO space for biological interpretation of gene clusters. / Lee, Sung Geun; Hur, Jung Uk; Kim, Yang Seok.

In: Bioinformatics, Vol. 20, No. 3, 12.02.2004, p. 381-388.

@article{3827c4d940c044ee8332e2474e968219,
title = "A graph-theoretic modeling on GO space for biological interpretation of gene clusters",
author = "Lee, {Sung Geun} and Hur, {Jung Uk} and Kim, {Yang Seok}",
year = "2004",
month = "2",
day = "12",
doi = "10.1093/bioinformatics/btg420",
language = "English (US)",
volume = "20",
pages = "381--388",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - A graph-theoretic modeling on GO space for biological interpretation of gene clusters

AU - Lee, Sung Geun

AU - Hur, Jung Uk

AU - Kim, Yang Seok

PY - 2004/2/12

Y1 - 2004/2/12

UR - http://www.scopus.com/inward/record.url?scp=1342330530&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1342330530&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btg420

DO - 10.1093/bioinformatics/btg420

M3 - Article

C2 - 14960465

AN - SCOPUS:1342330530

VL - 20

SP - 381

EP - 388

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 3

ER -