3dswap-pred: Prediction of 3D Domain Swapping from Protein Sequence Using Random Forest Approach
Abstract: 3D domain swapping is a protein structural phenomenon that mediates the formation of higher-order oligomers in a variety of proteins with different structural and functional properties. It is associated with biological functions ranging from oligomerization to pathological conformational diseases. At present, 3D domain swapping is recognised only after structure determination, when the protein is observed in the swapped conformation in the oligomeric state. This limits the study of this important structural phenomenon on a large scale from the growing body of sequence data. A new machine learning approach, 3dswap-pred, has been developed to predict 3D domain swapping from sequence data alone using the Random Forest approach. 3dswap-pred is implemented using a positive sequence dataset derived from literature-based structural curation of 297 structures. A negative sequence dataset is obtained from 462 SCOP domains using a new sequence data mining approach and a set of 126 sequence-derived features. Statistical validation using an independent dataset of 68 positive sequences and 313 negative sequences showed that 3dswap-pred achieved an accuracy of 63.8%. A web server has also been implemented using the 3dswap-pred Random Forest model. The server is available at: http://caps.ncbs.res.in/3dswap-pred
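As an illustration of what a sequence-derived feature can look like, the sketch below computes the amino acid composition of a protein sequence as a 20-element vector. This is an assumption chosen for illustration only: the abstract does not list the 126 features used by 3dswap-pred, and the function name `amino_acid_composition` is hypothetical rather than part of the published tool.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in `sequence`.

    Hypothetical helper for illustration; not the feature set
    actually used by 3dswap-pred.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X are ignored.
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Example: one feature vector per sequence (arbitrary example sequence).
features = amino_acid_composition("MKTAYIAKQR")
```

Vectors of this kind, one per sequence with a positive or negative label, are the sort of input a Random Forest classifier (for example, `RandomForestClassifier` in scikit-learn) could be trained on.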
Keywords: 3D domain swapping; AAINDEX; CD-HIT; SCOP domains; DIAL; GPCR; NMR; PSIPRED; Random Forest approach; hinge region; machine learning; prediction algorithm; protein oligomer; random forest; swapped region
Document Type: Research Article
Publication date: October 1, 2011
- Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merit rapid publication. Reports of crystallisation, and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they provide new insight into the principles of protein/peptide structure and function.