Evaluating Long-Term Relationship of Protein Sequence by Use of D-Interval Conditional Probability and Its Impact on Protein Structural Class Prediction
Abstract: To narrow the large and growing gap between proteins of known sequence and proteins of known structure, it is important to study protein structural class prediction (PSCP), which is both a foundation for and a useful tool in protein structure analysis. In this paper, a d-interval conditional probability index was proposed to reflect the long-term correlation between amino acids. Based on this index, the impact of the long-term relationship between residues on PSCP was analyzed. Two new information-theory-based algorithms were proposed and combined with the long-term information between residues to predict protein structural class (PSC). The dataset 5714 was used for testing because of its low sequence similarity and high reliability. The results showed that, with the same algorithms, the new index gave 3-6% higher accuracy than the traditional index, and that the new algorithms improved PSCP accuracy by 4-10%. The presented index, the algorithms, and the analysis of the long-term relationship of residues in PSCP can be widely applied to other sequence-based protein structure analyses.
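The abstract does not give the formula for the d-interval conditional probability index. As a rough illustration only, the sketch below assumes one plausible reading: the probability of seeing residue b exactly d positions after residue a, estimated from pair counts in a single sequence. The function name, the estimator, and the toy sequence are assumptions made here and are not taken from the paper.

```python
from collections import Counter


def d_interval_conditional_probability(sequence, d):
    """Estimate P(residue b at position i+d | residue a at position i).

    Assumed definition (not from the paper): count every pair of residues
    separated by exactly d positions, then divide each pair count by the
    number of times its first residue opens such a pair.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")

    pair_counts = Counter()   # (a, b) -> number of times b follows a at distance d
    first_counts = Counter()  # a -> number of pairs that start with a

    for i in range(len(sequence) - d):
        a, b = sequence[i], sequence[i + d]
        pair_counts[(a, b)] += 1
        first_counts[a] += 1

    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


# Toy sequence, chosen only to keep the counts easy to check by hand.
probs = d_interval_conditional_probability("ACACAGAC", d=2)
```

With d = 1 this reduces to ordinary adjacent-residue conditional probabilities; larger values of d are what would carry the longer-range information the abstract refers to.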
Document Type: Research Article
Publication date: 2009-10-01
- Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, and drug design. Manuscripts must have a significant element of novelty, timeliness and urgency that merits rapid publication. Reports of crystallisation and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they offer new insight into the principles of protein/peptide structure and function.