MISAE: A new approach for regulatory motif extraction

Zhaohui Sun, Jingyi Yang, Jitender S. Deogun

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

7 Citations (Scopus)

Abstract

Recognizing the regulatory motifs of co-regulated genes is essential for understanding regulatory mechanisms. However, automatically extracting regulatory motifs from a data set of the upstream non-coding DNA sequences of a family of co-regulated genes is difficult because the motifs are often subtle and inexact. The problem is further complicated by corruption of the data sets. This paper proposes a new approach called Mismatch-allowed Probabilistic Suffix Tree Motif Extraction (MISAE). It combines the mismatch-allowed probabilistic suffix tree, a probabilistic model, with local prediction to extract regulatory motifs. The approach is tested on 15 co-regulated gene families and compares favorably with other state-of-the-art approaches. Moreover, MISAE performs well on "corrupted" data sets: it can extract the motif from a "corrupted" data set in which fewer than one fourth of the sequences contain the real motif.
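
The abstract does not give the details of the mismatch-allowed probabilistic suffix tree, so the following is only a minimal sketch of the underlying idea of mismatch-tolerant motif search, not the MISAE algorithm itself. It assumes a fixed motif length, substitution-only mismatches, and a brute-force scan over all k-mers; the function names (`hamming`, `support`, `best_motif`) and the toy sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def support(kmer, sequences, max_mismatches):
    """Count the sequences that contain kmer with at most max_mismatches substitutions."""
    k = len(kmer)
    return sum(
        any(
            hamming(kmer, seq[i:i + k]) <= max_mismatches
            for i in range(len(seq) - k + 1)
        )
        for seq in sequences
    )


def best_motif(sequences, k, max_mismatches):
    """Return (kmer, support) for the first k-mer, in alphabetical order, with maximal support."""
    best, best_support = None, -1
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        s = support(kmer, sequences, max_mismatches)
        if s > best_support:
            best, best_support = kmer, s
    return best, best_support


# Toy data (made up for illustration): three sequences carry a variant of
# "ACGT"; the fourth is a "corrupted" sequence without the motif.
sequences = ["TTACGTTT", "GGACTTGG", "CCAGGTCC", "TTTTTTTT"]
print(support("ACGT", sequences, 0))  # exact matches only
print(support("ACGT", sequences, 1))  # allowing one substitution
print(best_motif(sequences, 4, 1))
```

Allowing one mismatch lets the scan credit the inexact occurrences that an exact-match count would miss, which is the kind of tolerance the abstract attributes to the mismatch-allowed model. The exhaustive enumeration here grows as 4^k and is only practical for short motifs.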

Original language: English (US)
Title of host publication: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Publisher: IEEE Computer Society
Pages: 173-181
Number of pages: 9
ISBN (Print): 0769521940, 9780769521947
State: Published - Jan 1 2004
Event: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004 - Stanford, CA, United States
Duration: Aug 16 2004 to Aug 19 2004

Publication series

Name: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004

Conference

Conference: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Country: United States
City: Stanford, CA
Period: 8/16/04 to 8/19/04

Fingerprint

Genes
DNA sequences
Statistical Models

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Sun, Z., Yang, J., & Deogun, J. S. (2004). MISAE: A new approach for regulatory motif extraction. In Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004 (pp. 173-181). (Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004). IEEE Computer Society.

MISAE: A new approach for regulatory motif extraction. / Sun, Zhaohui; Yang, Jingyi; Deogun, Jitender S.

Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004. IEEE Computer Society, 2004. p. 173-181 (Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004).

Sun, Z, Yang, J & Deogun, JS 2004, MISAE: A new approach for regulatory motif extraction. in Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004. Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004, IEEE Computer Society, pp. 173-181, Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004, Stanford, CA, United States, 8/16/04.
Sun Z, Yang J, Deogun JS. MISAE: A new approach for regulatory motif extraction. In Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004. IEEE Computer Society. 2004. p. 173-181. (Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004).
Sun, Zhaohui; Yang, Jingyi; Deogun, Jitender S. / MISAE: A new approach for regulatory motif extraction. Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004. IEEE Computer Society, 2004. pp. 173-181 (Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004).
@inproceedings{7e2d2c644889460ca8bca75d28bb6a72,
title = "MISAE: A new approach for regulatory motif extraction",
abstract = "The recognition of regulatory motifs of co-regulated genes is essential for understanding the regulatory mechanisms. However, the automatic extraction of regulatory motifs from a given data set of the upstream non-coding DNA sequences of a family of co-regulated genes is difficult because regulatory motifs are often subtle and inexact. This problem is further complicated by the corruption of the data sets. In this paper, a new approach called Mismatch-allowed Probabilistic Suffix Tree Motif Extraction (MISAE) is proposed. It combines the mismatch-allowed probabilistic suffix tree that is a probabilistic model and local prediction for the extraction of regulatory motifs. The proposed approach is tested on 15 co-regulated gene families and compares favorably with other state-of-the-art approaches. Moreover, MISAE performs well on {"}corrupted{"} data sets. It is able to extract the motif from a {"}corrupted{"} data set with less than one fourth of the sequences containing the real motif.",
author = "Zhaohui Sun and Jingyi Yang and Deogun, {Jitender S.}",
year = "2004",
month = "1",
day = "1",
language = "English (US)",
isbn = "0769521940",
series = "Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004",
publisher = "IEEE Computer Society",
pages = "173--181",
booktitle = "Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004",
}

TY - GEN

T1 - MISAE

T2 - A new approach for regulatory motif extraction

AU - Sun, Zhaohui

AU - Yang, Jingyi

AU - Deogun, Jitender S.

PY - 2004/1/1

Y1 - 2004/1/1

N2 - The recognition of regulatory motifs of co-regulated genes is essential for understanding the regulatory mechanisms. However, the automatic extraction of regulatory motifs from a given data set of the upstream non-coding DNA sequences of a family of co-regulated genes is difficult because regulatory motifs are often subtle and inexact. This problem is further complicated by the corruption of the data sets. In this paper, a new approach called Mismatch-allowed Probabilistic Suffix Tree Motif Extraction (MISAE) is proposed. It combines the mismatch-allowed probabilistic suffix tree that is a probabilistic model and local prediction for the extraction of regulatory motifs. The proposed approach is tested on 15 co-regulated gene families and compares favorably with other state-of-the-art approaches. Moreover, MISAE performs well on "corrupted" data sets. It is able to extract the motif from a "corrupted" data set with less than one fourth of the sequences containing the real motif.

UR - http://www.scopus.com/inward/record.url?scp=14044261278&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=14044261278&partnerID=8YFLogxK

M3 - Conference contribution

C2 - 16448011

AN - SCOPUS:14044261278

SN - 0769521940

SN - 9780769521947

T3 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004

SP - 173

EP - 181

BT - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004

PB - IEEE Computer Society

ER -