Predicting Protein Subcellular Localization by Pseudo Amino Acid Composition with a Segment-Weighted and Features-Combined Approach
Authors: Wang, Wei; Geng, XingBo; Dou, Yongchao; Liu, Taigang; Zheng, Xiaoqi
Source: Protein and Peptide Letters, Volume 18, Number 5, May 2011, pp. 480-487(8)
Publisher: Bentham Science Publishers
Abstract: Information on protein subcellular location plays an important role in molecular cell biology. Predicting the subcellular location of proteins helps in understanding their functions and interactions. In this paper, a different mode of pseudo amino acid composition is proposed to represent protein samples for predicting their subcellular localization, via the following procedure: based on the optimal splice site of each protein sequence, the sequence is divided into a sorting signal part and a mature protein part, and sequence features are extracted from each part separately. The combined features are then fed into an SVM classifier to perform the prediction. In a jackknife test on a benchmark dataset in which none of the included proteins has more than 90% pairwise sequence identity to any other, the overall accuracies achieved by the method are 94.5% and 90.3% for prokaryotic and eukaryotic proteins, respectively. The results indicate that the prediction quality of the method is quite satisfactory. It is anticipated that the method may serve as an alternative to existing prediction methods.
Keywords: GalNAc-transferase; Intel Xeon; Jackknife test; MCC; SVM classifier; fluorescence microscopy; mature protein; optimal splice site; prokaryotic and eukaryotic proteins; pseudo amino acid composition; segment-weighted; sorting signal; stereochemical; subcellular localization; transmembrane proteins
Document Type: Research Article
Publication date: 2011-05-01
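The segment-wise feature extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: plain amino acid composition is used here as a simpler stand-in for pseudo amino acid composition, and the names `composition`, `segment_features`, `split_pos`, and `signal_weight` are hypothetical. The split position is taken as given; how the optimal splice site is determined is not stated in the abstract and is not modelled here.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the 20-dimensional amino acid frequency vector of a segment."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # Empty segment (or no standard residues): return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def segment_features(sequence, split_pos, signal_weight=1.0):
    """Split a sequence at split_pos and concatenate per-segment compositions.

    split_pos and signal_weight are assumptions made for illustration only;
    the abstract does not specify how the segments are weighted.
    """
    if not 0 <= split_pos <= len(sequence):
        raise ValueError("split_pos must lie within the sequence")
    signal, mature = sequence[:split_pos], sequence[split_pos:]
    weighted_signal = [signal_weight * x for x in composition(signal)]
    return weighted_signal + composition(mature)


# Toy sequence, not taken from the benchmark dataset.
features = segment_features("MKKLLAAAGGSTVWY", split_pos=5, signal_weight=2.0)
print(len(features))  # 40: 20 values per segment
```

In a setup like the one the abstract describes, a combined vector of this kind would be the input to an SVM classifier, evaluated by leaving out one protein at a time (the jackknife test).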
Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design, etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merits rapid publication. Reports of crystallisation and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they offer new insight into the principles of protein/peptide structure and function.