Pseudo Amino Acid Composition and its Applications in Bioinformatics, Proteomics and System Biology
With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop automated methods for efficiently identifying various attributes of uncharacterized proteins. This is one of the most important tasks facing bioinformatics today, and the information thus obtained will have an important impact on the development of proteomics and system biology. A key to realizing this is finding an effective model to represent a protein sample. The most straightforward model is the entire amino acid sequence; however, the sequential model fails when the query protein has no significant homology to proteins of known characteristics. Thus, various non-sequential, or discrete, models were proposed. The simplest discrete model is the amino acid (AA) composition. When it is used to represent a protein, however, all sequence-order information is lost. To cope with this dilemma, the concept of pseudo amino acid (PseAA) composition was introduced. Its essence is to keep using a discrete model to represent a protein without completely losing its sequence-order information. Therefore, in a broad sense, the PseAA composition of a protein is a set of discrete numbers that is derived from its amino acid sequence, differs from the classical AA composition, and is able to harbour some sequence-order or pattern information. Ever since the first PseAA composition was formulated to predict protein subcellular localization and membrane protein types, it has stimulated many different modes of PseAA composition for studying various kinds of problems in proteins and protein-related systems. In this review, we give a brief and systematic introduction to the various modes of PseAA composition and their applications. The challenges of finding the optimal PseAA composition are also briefly discussed.
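The contrast between the AA composition and a PseAA composition can be illustrated with a small Python sketch. It is only a simplified illustration, not the formulation used in the review: it builds the 20 AA frequencies and appends a few sequence-order correlation terms computed from a single physicochemical property. The choice of the Kyte-Doolittle hydropathy scale, the function names, and the default values of the number of correlation tiers `lam` and the weight `w` are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. Using this single scale is an
# assumption of this sketch; other property sets can be substituted.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aa_composition(seq):
    """Classical AA composition: 20 normalized occurrence frequencies."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def standardize(scale):
    """Rescale property values to zero mean and unit standard deviation."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {a: (v - mu) / sd for a, v in scale.items()}


def pse_aa_composition(seq, lam=3, w=0.05, scale=KD):
    """Return a (20 + lam)-dimensional PseAA-style vector (hypothetical helper).

    The first 20 components reflect the AA composition; the last lam
    components reflect sequence order through correlations between
    residues that are k positions apart, for k = 1..lam.
    Expects only the 20 standard one-letter residue codes.
    """
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")
    h = standardize(scale)
    freqs = aa_composition(seq)
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference over all residue pairs k apart.
        diffs = [(h[a] - h[b]) ** 2 for a, b in zip(seq, seq[k:])]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


# Two sequences with identical AA composition but different residue order.
a, b = "AAAAKKKK", "AKAKAKAK"
print(aa_composition(a) == aa_composition(b))          # same composition
print(pse_aa_composition(a, lam=2) == pse_aa_composition(b, lam=2))
```

The two example sequences are indistinguishable under the AA composition, whereas the appended correlation terms separate them, which is the sense in which a discrete model can retain some sequence-order information.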
Document Type: Research Article
Publication date: December 1, 2009
- Current Proteomics: Research in the emerging field of proteomics is growing at an extremely rapid rate. The principal aim of Current Proteomics is to publish well-timed review articles in this fast-expanding area on topics relevant and significant to the development of proteomics. Current Proteomics is an essential journal for everyone involved in proteomics and related fields in both academia and industry.