Protein complexes are involved in most, if not all, essential biological processes in a living cell. Many attempts have been made to identify protein complexes using computational methods, most of which exploit protein-protein interaction networks to search for densely interacting groups of proteins as candidate protein complexes. Beyond identifying protein complexes, knowing their biological functions may help reveal their molecular mechanisms and their roles in related biological processes. It is therefore also desirable to predict the functions of protein complexes computationally. However, no literature was found that addresses this problem. This paper attempts to address it using yeast as the model organism, for which a total of 50 protein complexes were collected whose functions have been validated experimentally. Each complex was encoded as a numeric vector based on its graph-based and functional properties. Feature selection techniques, including Minimum Redundancy Maximum Relevance and Incremental Feature Selection, were adopted to extract core features for the prediction. Three prediction methods, Nearest Neighbor Algorithm, Bayesian network, and Sequential Minimal Optimization, were used in this study and evaluated by jackknife cross-validation. As a result, 22 core features coupled with the Nearest Neighbor Algorithm achieved the highest accuracy. These core features are regarded as the most important features for determining the biological functions of protein complexes. Nineteen of the 22 core features came from functional properties, indicating that the functions of the individual protein components probably constrain the overall function of the protein complex.
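The evaluation loop described in the abstract can be illustrated with a minimal sketch: given features already ranked (for example by a Minimum Redundancy Maximum Relevance step, which is not implemented here), Incremental Feature Selection scores the top-1, top-2, ... feature subsets with a nearest neighbor classifier under jackknife (leave-one-out) cross-validation and keeps the subset with the highest accuracy. The function names, the Euclidean distance, the toy data, and the ranking below are all assumptions made for illustration; they are not taken from the paper, whose distance measure and feature encoding may differ.

```python
import math


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training vector closest to x.

    Euclidean distance is an assumption of this sketch.
    """
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]


def jackknife_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN using only the given feature indices."""
    reduced = [[row[j] for j in features] for row in X]
    correct = 0
    for i in range(len(reduced)):
        # Hold out sample i and train on all remaining samples.
        train_X = reduced[:i] + reduced[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(train_X, train_y, reduced[i]) == y[i]:
            correct += 1
    return correct / len(reduced)


def incremental_feature_selection(X, y, ranked):
    """Score the top-k ranked features for k = 1..len(ranked).

    Returns (best_k, accuracies); ties are resolved in favor of the smaller k.
    """
    accuracies = [
        jackknife_accuracy(X, y, ranked[:k]) for k in range(1, len(ranked) + 1)
    ]
    best_k = max(range(len(accuracies)), key=lambda k: accuracies[k]) + 1
    return best_k, accuracies


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0], [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y = ["A", "A", "A", "B", "B", "B"]
ranked = [0, 1]  # hypothetical ranking, e.g. as an mRMR step might produce

best_k, accuracies = incremental_feature_selection(X, y, ranked)
print(best_k, accuracies)
```

On this toy data the sketch keeps only the first ranked feature, because adding the noisy second feature lowers the jackknife accuracy. In the study itself the same kind of search would run over the full ranked feature list for the 50 complexes.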
incremental feature selection;
minimum redundancy maximum relevance;
nearest neighbor algorithm;
prediction of functions of protein complex;
Document Type: Research Article
November 1, 2013
Current Bioinformatics aims to publish all the latest and outstanding developments in bioinformatics. Each issue contains a series of timely, in-depth reviews written by leaders in the field, covering a wide range of the integration of biology with computer and information science.
The journal focuses on reviews on advances in computational molecular/structural biology, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications on key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc.
Current Bioinformatics is an essential journal for all academic and industrial researchers who want expert knowledge on all major advances in bioinformatics.