Predicting Protein Fold Types by the General Form of Chou's Pseudo Amino Acid Composition: Approached from Optimal Feature Extractions

Identification of protein fold types has typically been based on the 27-class fold dataset provided by Ding & Dubchak in 2001. With the avalanche of protein sequences, however, fold data are also expanding, so improving the existing dataset and extending it to more fold types is an inevitable trend. In this paper, we construct a multi-class protein fold dataset that contains 3,457 protein chains with sequence identity below 35%, classified into 76 fold types. It is 4 times larger than Ding & Dubchak's dataset. Furthermore, we propose a novel support vector machine approach based on optimal features. The method combines motif frequency, low-frequency power spectral density, amino acid composition, predicted secondary structure, and auto-correlation function values into a feature parameter set, then filters these features with the maximum relevance and minimum redundancy criterion to obtain a 95-dimensional optimal feature subset. Using an ensemble classification strategy, with the 95-dimensional optimal features as input parameters of the support vector machine, we identify the 76-class protein folds with an overall accuracy of 44.92% in an independent test. In addition, the method was applied to an upgraded 27-class protein fold dataset, where the overall accuracy reaches 66.56%. Finally, we tested our method on Ding & Dubchak's 27-class fold dataset and obtained better identification results than most previously reported results.
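
The feature-filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how relevance and redundancy are measured, so absolute Pearson correlation is used here as a simple stand-in (mutual information is the more common choice), and the function names `amino_acid_composition`, `pearson`, and `mrmr_select` are hypothetical. The sketch computes one of the listed feature groups, amino acid composition, and then greedily picks features that correlate with the class label while correlating little with features already chosen.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the 20 residue frequencies of a protein sequence."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(columns, labels, k):
    """Greedy max-relevance min-redundancy selection.

    columns: list of feature columns (one list of values per feature).
    labels:  numeric class labels, one per sample.
    Returns the indices of up to k selected features, in selection order.
    Assumption: |Pearson correlation| stands in for the relevance and
    redundancy measures, which the abstract does not specify.
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    # Start with the single most relevant feature.
    selected = [max(range(len(columns)), key=lambda j: relevance[j])]
    while len(selected) < min(k, len(columns)):
        best, best_score = None, -math.inf
        for j in range(len(columns)):
            if j in selected:
                continue
            # Mean correlation with the features chosen so far.
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy data: feature 1 is a rescaled copy of feature 0 (fully redundant),
    # feature 2 is less relevant but carries different information.
    labels = [0, 0, 1, 1]
    columns = [
        [0.5, 1.5, 1.5, 2.5],
        [1.0, 3.0, 3.0, 5.0],
        [2.0, 0.0, 3.0, 1.0],
    ]
    print(mrmr_select(columns, labels, k=2))
```

In a full pipeline the selected columns would be passed to a support vector machine. The ensemble strategy and the other feature groups (motif frequency, power spectral density, predicted secondary structure, auto-correlation) are outside the scope of this sketch.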

Keywords: Criterion of maximum relevance minimum redundancy; Ding & Dubchak's dataset; Mad Cow disease; Parkinson's disease; bioinformatics; low-frequency power spectral density; motif frequency; optimal feature; protein fold; support vector machine

Document Type: Research Article

Publication date: 2012-02-01

More about this publication?
  • Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design, etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merits rapid publication. Reports of crystallisation and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they offer new insight into the principles of protein/peptide structure and function.