Predicting Membrane Protein Types by the LLDA Algorithm
Abstract: Membrane proteins are generally classified into the following eight types: (1) type I transmembrane, (2) type II, (3) type III, (4) type IV, (5) multipass transmembrane, (6) lipid-chain-anchored membrane, (7) GPI-anchored membrane, and (8) peripheral membrane (K.C. Chou and H.B. Shen: BBRC, 2007, 360: 339-345). Knowing the type of an uncharacterized membrane protein often provides useful clues to its biological function and to how it interacts with other molecules in a biological system. With the explosion of protein sequences generated in the Post-Genomic Age, an automated method for identifying membrane protein types is urgently needed. The PsePSSM (Pseudo Position-Specific Score Matrix) descriptor was recently proposed by Chou and Shen (Biochem. Biophys. Res. Comm. 2007, 360, 339-345) to represent a protein sample. Its advantage is that it combines evolutionary information with sequence-correlated information. However, incorporating all these effects into a single descriptor may cause the "high dimension disaster". To overcome this problem, Chou and Shen adopted a fusion approach. Here, a completely different approach, the so-called LLDA (Local Linear Discriminant Analysis), is introduced to extract the key features from the high-dimensional PsePSSM space. The dimension-reduced descriptor vector thus obtained is a compact representation of the original high-dimensional vector. Our jackknife and independent dataset test results indicate that the LLDA approach is very promising for complicated problems in biological systems, such as predicting the membrane protein type.
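To illustrate why the PsePSSM vector grows large enough to need dimension reduction, the following is a minimal sketch of how such a descriptor is commonly assembled: per-column means of a PSSM, followed by mean squared differences between rows separated by a given lag. The function name `pse_pssm`, the parameter `max_lag`, and the assumption that the input matrix has already been normalized are illustrative choices, not details taken from the abstract; the LLDA step itself is not shown.

```python
def pse_pssm(pssm, max_lag):
    """Build a PsePSSM-style feature vector (illustrative sketch).

    pssm    : list of L rows, one per residue; each row holds one score per
              amino-acid column (20 for a real PSSM). Assumed to be already
              normalized; the normalization step is not reproduced here.
    max_lag : largest sequence-order lag to include (hypothetical name).

    Returns a flat list of length n_cols + n_cols * max_lag.
    """
    length = len(pssm)
    if length == 0:
        raise ValueError("pssm must contain at least one row")
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(pssm[0])

    # Part 1: average score of each column over the whole sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    # Part 2: for each lag, the mean squared difference between scores of
    # residues that are `lag` positions apart, computed per column.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Toy input with 4 residues and only 2 columns, to keep the numbers readable.
toy = [[0.0, 1.0], [1.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
toy_features = pse_pssm(toy, max_lag=2)
```

With 20 columns the vector has 20 + 20 × `max_lag` components, so each additional lag adds 20 dimensions; this growth is what a reduction step such as LLDA is meant to counter.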
Document Type: Research Article
Publication date: September 1, 2008
Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design, etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merit rapid publication. Reports of crystallisation and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they provide new insight into the principles of protein/peptide structure and function.