Primary structure similarity analysis of proteins sequences by a new graphical representation

S. C. Xu, Z. Li, S. P. Zhang, J. L. Hu

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

A new graphical description of the primary structure of protein sequences is introduced. First, a three-dimensional space discrete point set of a protein sequence is created based on the three main physicochemical properties of the amino acids. Secondly, a continuous cubic B-spline curve interpolating the amino acid points is constructed to represent the shape of the protein sequence. Then the geometric properties (curvature and torsion) of the continuous curve are extracted for the purpose of analyzing the similarity between protein sequences. Finally, an improved Canberra distance comparison is introduced for the similarity analysis of protein sequences with different lengths. Experimental results show that our method is effective for the similarity comparison of protein sequences.
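The geometric step of the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the three physicochemical properties, the interpolation scheme, or the "improved" Canberra distance, so the sketch treats four consecutive 3D points as control points of one uniform cubic B-spline segment (an approximating spline, not an interpolating one) and uses the standard Canberra distance for equal-length vectors. All function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from the derivatives r', r'', r''' at one point.

    kappa = |r' x r''| / |r'|^3
    tau   = (r' x r'') . r''' / |r' x r''|^2   (reported as 0 where kappa = 0)
    """
    speed = math.sqrt(dot(d1, d1))
    if speed == 0.0:
        raise ValueError("curve is not regular here (r' = 0)")
    c = cross(d1, d2)
    c2 = dot(c, c)
    kappa = math.sqrt(c2) / speed ** 3
    tau = dot(c, d3) / c2 if c2 > 0.0 else 0.0
    return kappa, tau

def bspline_derivatives(p0, p1, p2, p3, u):
    """r', r'', r''' of one uniform cubic B-spline segment at u in [0, 1].

    Assumption: p0..p3 are used as control points, so the curve
    approximates the points rather than passing through them.
    """
    pts = (p0, p1, p2, p3)
    # Derivatives of the four uniform cubic B-spline basis functions.
    w1 = (-(1 - u) ** 2 / 2, (3 * u * u - 4 * u) / 2,
          (-3 * u * u + 2 * u + 1) / 2, u * u / 2)
    w2 = (1 - u, 3 * u - 2, -3 * u + 1, u)
    w3 = (-1.0, 3.0, -3.0, 1.0)

    def combine(w):
        return tuple(sum(wi * p[k] for wi, p in zip(w, pts)) for k in range(3))

    return combine(w1), combine(w2), combine(w3)

def canberra(x, y):
    """Standard Canberra distance; terms with a zero denominator are skipped."""
    if len(x) != len(y):
        raise ValueError("standard Canberra distance needs equal lengths")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0.0:
            total += abs(a - b) / denom
    return total
```

A curvature (or torsion) profile for a sequence would then be obtained by sliding the four-point window along the point set and sampling each segment; two profiles of equal length can be compared with `canberra`. How the paper handles sequences of different lengths is not stated in the abstract and is left open here.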

Original language: English (US)
Pages (from-to): 791-803
Number of pages: 13
Journal: SAR and QSAR in Environmental Research
Volume: 25
Issue number: 10
DOI: 10.1080/1062936X.2014.955055
State: Published - Oct 2014

Fingerprint

Protein Sequence Analysis
Proteins
Amino Acids
Amino Acid Sequence
Splines
Torsional stress

Keywords

  • cubic B-spline curve
  • curvature
  • protein sequence
  • shape analysis
  • torsion

ASJC Scopus subject areas

  • Bioengineering
  • Molecular Medicine
  • Drug Discovery

Cite this

Primary structure similarity analysis of proteins sequences by a new graphical representation. / Xu, S. C.; Li, Z.; Zhang, S. P.; Hu, J. L. In: SAR and QSAR in Environmental Research, Vol. 25, No. 10, 10.2014, p. 791-803.

Xu, S. C.; Li, Z.; Zhang, S. P.; Hu, J. L. / Primary structure similarity analysis of proteins sequences by a new graphical representation. In: SAR and QSAR in Environmental Research. 2014; Vol. 25, No. 10. pp. 791-803.

@article{e58539e2148d49d495eb9ff057263773,
title = "Primary structure similarity analysis of proteins sequences by a new graphical representation",
abstract = "A new graphical description of the primary structure of protein sequences is introduced. First, a three-dimensional space discrete point set of a protein sequence is created based on the three main physicochemical properties of the amino acids. Secondly, a continuous cubic B-spline curve interpolating the amino acid points is constructed to represent the shape of the protein sequence. Then the geometric properties (curvature and torsion) of the continuous curve are extracted for the purpose of analyzing the similarity between protein sequences. Finally, an improved Canberra distance comparison is introduced for the similarity analysis of protein sequences with different lengths. Experimental results show that our method is effective for the similarity comparison of protein sequences.",
keywords = "cubic B-spline curve, curvature, protein sequence, shape analysis, torsion",
author = "Xu, {S. C.} and Li, Z. and Zhang, {S. P.} and Hu, {J. L.}",
year = "2014",
month = "10",
doi = "10.1080/1062936X.2014.955055",
language = "English (US)",
volume = "25",
pages = "791--803",
journal = "SAR and QSAR in Environmental Research",
issn = "1062-936X",
publisher = "Taylor and Francis Ltd.",
number = "10",
}

TY  - JOUR
T1  - Primary structure similarity analysis of proteins sequences by a new graphical representation
AU  - Xu, S. C.
AU  - Li, Z.
AU  - Zhang, S. P.
AU  - Hu, J. L.
PY  - 2014/10
Y1  - 2014/10
AB  - A new graphical description of the primary structure of protein sequences is introduced. First, a three-dimensional space discrete point set of a protein sequence is created based on the three main physicochemical properties of the amino acids. Secondly, a continuous cubic B-spline curve interpolating the amino acid points is constructed to represent the shape of the protein sequence. Then the geometric properties (curvature and torsion) of the continuous curve are extracted for the purpose of analyzing the similarity between protein sequences. Finally, an improved Canberra distance comparison is introduced for the similarity analysis of protein sequences with different lengths. Experimental results show that our method is effective for the similarity comparison of protein sequences.
KW  - cubic B-spline curve
KW  - curvature
KW  - protein sequence
KW  - shape analysis
KW  - torsion
UR  - http://www.scopus.com/inward/record.url?scp=84907689504&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84907689504&partnerID=8YFLogxK
U2  - 10.1080/1062936X.2014.955055
DO  - 10.1080/1062936X.2014.955055
M3  - Article
C2  - 25242152
AN  - SCOPUS:84907689504
VL  - 25
SP  - 791
EP  - 803
JO  - SAR and QSAR in Environmental Research
JF  - SAR and QSAR in Environmental Research
SN  - 1062-936X
IS  - 10
ER  -