Literature mining and ontology based analysis of host-Brucella gene-gene interaction network

Ilknur Karadeniz, Junguk Hur, Yongqun He, Arzucan Özgür

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Brucella is an intracellular bacterium that causes chronic brucellosis in humans and various mammals. Identifying host-Brucella interactions is crucial for understanding host immunity against Brucella infection and Brucella pathogenesis against host immune responses. Most of the information about the inter-species interactions between host and Brucella genes is available only in the text of scientific publications. Many text-mining systems for extracting gene and protein interactions have been proposed. However, only a few of them have been designed with the peculiarities of host-pathogen interactions in mind. In this paper, we used a text-mining approach to extract host-Brucella gene-gene interactions from the abstracts of articles in PubMed. The gene-gene interactions here represent the interactions between genes and/or gene products (e.g., proteins). The SciMiner tool, originally designed for detecting mammalian gene/protein names in text, was extended to identify host and Brucella gene/protein names in the abstracts. Next, sentence-level and abstract-level co-occurrence-based approaches, as well as sentence-level machine-learning-based methods, originally designed for extracting intra-species gene interactions, were used to extract the interactions among the identified host and Brucella genes. The extracted interactions were manually evaluated. A total of 46 host-Brucella gene interactions were identified and represented as an interaction network. Twenty-four of these interactions were identified from sentence-level processing. Twenty-two additional interactions were identified when abstract-level processing was performed. The Interaction Network Ontology (INO) was used to represent the identified interaction types in a hierarchical ontology structure. Ontological modeling of specific gene-gene interactions demonstrates that host-pathogen gene-gene interactions occur under experimental conditions that can be ontologically represented. Our results show that the introduced literature mining and ontology-based modeling approaches are effective in retrieving and analyzing host-pathogen gene-gene interaction networks.
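The difference between sentence-level and abstract-level co-occurrence described in the abstract can be illustrated with a minimal Python sketch. This is not the SciMiner pipeline or the authors' implementation: the gene lexicons (HOST1, HOST2, bru1, bru2), the naive sentence splitter, and the function names are all hypothetical placeholders chosen for illustration, and real gene name recognition is considerably more involved.

```python
import re
from itertools import product

# Hypothetical toy lexicons (placeholders, not real gene names).
HOST_GENES = {"HOST1", "HOST2"}
PATHOGEN_GENES = {"bru1", "bru2"}


def find_genes(text, lexicon):
    """Return the lexicon entries that occur as whole tokens in text."""
    tokens = set(re.findall(r"[A-Za-z0-9]+", text))
    return {gene for gene in lexicon if gene in tokens}


def cooccurring_pairs(text):
    """Return every (host, pathogen) pair mentioned together in text."""
    hosts = find_genes(text, HOST_GENES)
    pathogens = find_genes(text, PATHOGEN_GENES)
    return set(product(hosts, pathogens))


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def extract(abstract):
    """Return (sentence-level pairs, additional abstract-level pairs)."""
    sentence_pairs = set()
    for sentence in split_sentences(abstract):
        sentence_pairs |= cooccurring_pairs(sentence)
    # Abstract-level pairs co-occur anywhere in the abstract; keep only
    # those not already found within a single sentence.
    additional = cooccurring_pairs(abstract) - sentence_pairs
    return sentence_pairs, additional
```

In this sketch, abstract-level processing can only add pairs to those found at sentence level, which mirrors the way the abstract reports the two counts. Such candidate pairs would still need filtering or manual evaluation, since co-occurrence alone does not establish an interaction.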

Original language: English (US)
Article number: 01386
Journal: Frontiers in Microbiology
Volume: 6
Issue number: DEC
DOI: 10.3389/fmicb.2015.01386
State: Published - Jan 1 2015
Externally published: Yes

Fingerprint

Brucella
Gene Regulatory Networks
Genes
Host-Pathogen Interactions
Data Mining
Names
Proteins
Brucellosis
PubMed
Publications

Keywords

  • Brucella
  • Host and pathogen gene name recognition
  • Host-pathogen interaction extraction
  • Interaction Network Ontology (INO)
  • SciMiner
  • Support vector machines (SVM)
  • Text mining

ASJC Scopus subject areas

  • Microbiology
  • Microbiology (medical)

Cite this

Literature mining and ontology based analysis of host-Brucella gene-gene interaction network. / Karadeniz, Ilknur; Hur, Junguk; He, Yongqun; Özgür, Arzucan.

In: Frontiers in Microbiology, Vol. 6, No. DEC, 01386, 01.01.2015.

@article{c765961cf06b4ce58a5965a51ed9f18e,
title = "Literature mining and ontology based analysis of host-Brucella gene-gene interaction network",
keywords = "Brucella, Host and pathogen gene name recognition, Host-pathogen interaction extraction, Interaction Network Ontology (INO), SciMiner, Support vector machines (SVM), Text mining",
author = "Ilknur Karadeniz and Junguk Hur and Yongqun He and Arzucan {\"O}zg{\"u}r",
year = "2015",
month = "1",
day = "1",
doi = "10.3389/fmicb.2015.01386",
language = "English (US)",
volume = "6",
journal = "Frontiers in Microbiology",
issn = "1664-302X",
publisher = "Frontiers Media S. A.",
number = "DEC",

}

TY - JOUR

T1 - Literature mining and ontology based analysis of host-Brucella gene-gene interaction network

AU - Karadeniz, Ilknur

AU - Hur, Junguk

AU - He, Yongqun

AU - Özgür, Arzucan

PY - 2015/1/1

Y1 - 2015/1/1

KW - Brucella

KW - Host and pathogen gene name recognition

KW - Host-pathogen interaction extraction

KW - Interaction Network Ontology (INO)

KW - SciMiner

KW - Support vector machines (SVM)

KW - Text mining

UR - http://www.scopus.com/inward/record.url?scp=84953863601&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84953863601&partnerID=8YFLogxK

U2 - 10.3389/fmicb.2015.01386

DO - 10.3389/fmicb.2015.01386

M3 - Article

VL - 6

JO - Frontiers in Microbiology

JF - Frontiers in Microbiology

SN - 1664-302X

IS - DEC

M1 - 01386

ER -