Abstract
Phosphorylation is important to study but challenging for both wet-bench experiments and computational approaches, and accurate non-kinase-specific prediction tools are highly desirable for whole-genome annotation across a wide variety of species. Here, we describe a phosphorylation site prediction webserver, PhosphoSVM, that employs a support vector machine (SVM) to combine protein secondary structure information with seven other one-dimensional structural properties: Shannon entropy, relative entropy, predicted protein disorder information, predicted solvent accessible area, amino acid overlapping properties, averaged cumulative hydrophobicity, and subsequence k-nearest neighbor profiles. In tenfold cross-validation on animal data, the method achieved AUC values of 0.8405, 0.8183, and 0.7383 for serine (S), threonine (T), and tyrosine (Y) phosphorylation sites, respectively. The model trained on the animal phosphorylation sites was also applied to a plant phosphorylation site dataset as an independent test, where the AUC values were 0.7761, 0.6652, and 0.5958 for S, T, and Y sites, respectively. The algorithm with the optimally trained model was implemented as a webserver. The webserver, trained model, and all datasets used in the current study are available at http://sysbio.unl.edu/PhosphoSVM.
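Two of the listed features, Shannon entropy and relative entropy, are standard per-position conservation measures. The abstract does not say how PhosphoSVM computes them, so the following is only a minimal sketch under stated assumptions: each position is represented by one column of a multiple sequence alignment, gaps are ignored, and the background distribution is uniform over the 20 standard amino acids. The names `column_frequencies`, `shannon_entropy`, `relative_entropy`, and `BACKGROUND` are hypothetical and do not come from the PhosphoSVM code.

```python
import math
from collections import Counter

# Assumption: uniform background over the 20 standard amino acids.
# A real pipeline would more likely use observed database frequencies.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def column_frequencies(column):
    """Relative frequency of each standard residue in one alignment column.

    Gaps and non-standard symbols are ignored.
    """
    counts = Counter(c for c in column if c in BACKGROUND)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no standard residues")
    return {aa: n / total for aa, n in counts.items()}


def shannon_entropy(column):
    """Shannon entropy in bits: H = -sum_a p(a) * log2 p(a)."""
    freqs = column_frequencies(column)
    return -sum(p * math.log2(p) for p in freqs.values())


def relative_entropy(column, background=BACKGROUND):
    """Relative entropy (Kullback-Leibler divergence) in bits:
    D = sum_a p(a) * log2(p(a) / q(a)), with q the background distribution.
    """
    freqs = column_frequencies(column)
    return sum(p * math.log2(p / background[aa]) for aa, p in freqs.items())


if __name__ == "__main__":
    # Hypothetical columns: one fully conserved, one split between S and T.
    for col in ("SSSSSSSS", "SSSSTTTT", "SSST--TA"):
        print(col, shannon_entropy(col), relative_entropy(col))
```

A fully conserved column has a Shannon entropy of zero and the largest relative entropy against the uniform background, while a more variable column moves both values in the opposite direction. In a predictor of this kind, such values would typically be computed for each position in a window around the candidate S, T, or Y residue and concatenated with the other features before being passed to the SVM.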
Original language | English (US) |
---|---|
Title of host publication | Methods in Molecular Biology |
Publisher | Humana Press Inc. |
Pages | 265-274 |
Number of pages | 10 |
DOIs | https://doi.org/10.1007/978-1-4939-6406-2_18 |
State | Published - Jan 1 2017 |
Publication series
Name | Methods in Molecular Biology |
---|---|
Volume | 1484 |
ISSN (Print) | 1064-3745 |
Keywords
- Non-kinase-specific tool
- Phosphorylation site prediction
- Support vector machine
ASJC Scopus subject areas
- Molecular Biology
- Genetics
Cite this
Prediction of protein phosphorylation sites by integrating secondary structure information and other one-dimensional structural properties. / Dou, Yongchao; Yao, Bo; Zhang, Chi.
Methods in Molecular Biology. Humana Press Inc., 2017. p. 265-274 (Methods in Molecular Biology; Vol. 1484).
Research output: Chapter in Book/Report/Conference proceeding › Chapter
TY - CHAP
T1 - Prediction of protein phosphorylation sites by integrating secondary structure information and other one-dimensional structural properties
AU - Dou, Yongchao
AU - Yao, Bo
AU - Zhang, Chi
PY - 2017/1/1
Y1 - 2017/1/1
KW - Non-kinase-specific tool
KW - Phosphorylation site prediction
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=84994291874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994291874&partnerID=8YFLogxK
U2 - 10.1007/978-1-4939-6406-2_18
DO - 10.1007/978-1-4939-6406-2_18
M3 - Chapter
C2 - 27787832
AN - SCOPUS:84994291874
T3 - Methods in Molecular Biology
SP - 265
EP - 274
BT - Methods in Molecular Biology
PB - Humana Press Inc.
ER -