On identifying and analyzing significant nodes in protein-protein interaction networks

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Network theory has been used to model biological data as well as social networks, transportation logistics, business transcripts, and many other types of data sets. Identifying the important features of these networks is becoming increasingly significant across many applications as the need for big data analysis techniques grows. When analyzing a network of protein-protein interactions (PPIs), identifying nodes of significant importance can direct the user toward biologically relevant network features. In this work, we propose that a node of structural importance in a network model can correspond to a biologically vital or significant property. This relationship between topological and biological importance can be seen in and between structurally defined nodes, such as hub nodes and driver nodes, both within a network and within clusters. This work proposes data mining approaches for identifying and examining relationships between hub and driver nodes within human, yeast, rat, and mouse PPI networks. Relationships with other types of significant nodes, with direct neighbors, and with the rest of the network were analyzed to determine whether the model can be characterized biologically by its structural makeup. We performed numerous data-driven tests on structure, looking for properties that were potentially significant at the network level and then comparing those properties to biological significance. Our results showed that identifying and cross-referencing different types of topologically significant nodes can highlight properties such as transcription factor enrichment, lethality, clustering, and Gene Ontology (GO) enrichment. Mining the biological networks, we discovered a key relationship between network properties and how sparse or dense a network is, a property we described as 'sparseness'. Overall, structurally important nodes were found to have significant biological relevance.
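As a rough illustration of the structural quantities the abstract refers to, the sketch below ranks hub candidates by degree and computes edge density for a small undirected graph. This is only a minimal sketch under stated assumptions, not the authors' method: the abstract does not say how hubs are thresholded or how 'sparseness' is measured, so the top-fraction degree cutoff and the standard density formula 2m / (n(n - 1)) are stand-ins, and the protein names and function names are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def density(adj):
    """Edge density of a simple undirected graph: 2m / (n * (n - 1))."""
    n = len(adj)
    if n < 2:
        return 0.0
    # Each edge is counted once from each endpoint, so halve the degree sum.
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def hub_nodes(adj, top_fraction=0.1):
    """Return the highest-degree nodes (assumed cutoff: top fraction by degree)."""
    # Sort by descending degree, breaking ties by node name for determinism.
    ranked = sorted(adj, key=lambda node: (-len(adj[node]), node))
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[:k]


# Hypothetical toy interaction list; labels A-F are placeholders, not real proteins.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C"), ("E", "F")]
toy_adj = build_adjacency(toy_edges)

print("hub candidates:", hub_nodes(toy_adj, top_fraction=0.2))
print("density:", density(toy_adj))
```

A low density value indicates a sparse network. Driver nodes are not covered here because their identification depends on a definition the abstract does not give.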

Original language: English (US)
Title of host publication: Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013
Publisher: IEEE Computer Society
Pages: 343-348
Number of pages: 6
DOI: https://doi.org/10.1109/ICDMW.2013.126
State: Published - 2013
Event: 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013 - Dallas, TX
Duration: Dec 7 2013 to Dec 10 2013

Other

Other: 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013
City: Dallas, TX
Period: 12/7/13 to 12/10/13

Fingerprint

Proteins
Transcription factors
Circuit theory
Yeast
Data mining
Ontology
Logistics
Rats
Genes
Industry
Big data

Keywords

  • Clustering
  • Driver nodes
  • Graph theory
  • Hub nodes
  • Network enrichment
  • Protein-protein interaction networks

ASJC Scopus subject areas

  • Software

Cite this

Khazanchi, R., Cooper, K. M., Thapa, I., & Ali, H. H. (2013). On identifying and analyzing significant nodes in protein-protein interaction networks. In Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013 (pp. 343-348). [6753940] IEEE Computer Society. https://doi.org/10.1109/ICDMW.2013.126

@inproceedings{63e4eda0e69b476186be19352f024a8b,
title = "On identifying and analyzing significant nodes in protein-protein interaction networks",
keywords = "Clustering, Driver nodes, Graph theory, Hub nodes, Network enrichment, Protein-protein interaction networks",
author = "Rohan Khazanchi and Cooper, {Kathryn M} and Ishwor Thapa and Ali, {Hesham H}",
year = "2013",
doi = "10.1109/ICDMW.2013.126",
language = "English (US)",
pages = "343--348",
booktitle = "Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - On identifying and analyzing significant nodes in protein-protein interaction networks

AU - Khazanchi, Rohan

AU - Cooper, Kathryn M

AU - Thapa, Ishwor

AU - Ali, Hesham H

PY - 2013

Y1 - 2013

KW - Clustering

KW - Driver nodes

KW - Graph theory

KW - Hub nodes

KW - Network enrichment

KW - Protein-protein interaction networks

UR - http://www.scopus.com/inward/record.url?scp=84898032643&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84898032643&partnerID=8YFLogxK

U2 - 10.1109/ICDMW.2013.126

DO - 10.1109/ICDMW.2013.126

M3 - Conference contribution

AN - SCOPUS:84898032643

SP - 343

EP - 348

BT - Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013

PB - IEEE Computer Society

ER -