Improving the Classification of Nuclear Receptors with Feature Selection

$63.10 plus tax (Refund Policy)

Buy Article:

Abstract:

Nuclear receptors are involved in multiple cellular signaling pathways that affect and regulate processes. Because of their physiology and pathophysiology significance, classification of nuclear receptors is essential for the proper understanding of their functions. Bhasin and Raghava have shown that the subfamilies of nuclear receptors are closely correlated with their amino acid composition and dipeptide composition [29]. They characterized each protein by a 400 dimensional feature vector. However, using high dimensional feature vectors for characterization of protein sequences will increase the computational cost as well as the risk of overfitting. Therefore, using only those features that are most relevant to the present task might improve the prediction system, and might also provide us with some biologically useful knowledge. In this paper a feature selection approach was proposed to identify relevant features and a prediction engine of support vector machines was developed to estimate the prediction accuracy of classification using the selected features. A reduced subset containing 30 features was accepted to characterize the protein sequences in view of its good discriminative power towards the classes, in which 18 are of amino acid composition and 12 are of dipeptide composition. This reduced feature subset resulted in an overall accuracy of 98.9% in a 5-fold cross-validation test, higher than 88.7% of amino acid composition based method and almost as high as 99.3% of dipeptide composition based method. Moreover, an overall accuracy of 93.7% was reached when it was evaluated on a blind data set of 63 nuclear receptors. On the other hand, an overall accuracy of 96.1% and 95.2% based on the reduced 12 dipeptide compositions was observed simultaneously in the 5-fold cross-validation test and the blind data set test, respectively. These results demonstrate the effectiveness of the present method.





Keywords: Nuclear receptor; bioinformatics; feature selection; protein function; support vector machine

Document Type: Research Article

DOI: http://dx.doi.org/10.2174/092986609788681733

Publication date: July 1, 2009

More about this publication?
  • Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merit rapid publication. Reports of crystallisation, and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they provide new insight into the principles of protein/peptide structure and function.
Related content

Tools

Favourites

Share Content

Access Key

Free Content
Free content
New Content
New content
Open Access Content
Open access content
Subscribed Content
Subscribed content
Free Trial Content
Free trial content
Cookie Policy
X
Cookie Policy
ingentaconnect website makes use of cookies so as to keep track of data that you have filled in. I am Happy with this Find out more