Using Affinity Propagation Combined Post-Processing to Cluster Protein Sequences

Protein databases are growing rapidly, so clustering protein sequences based on sequence information alone is becoming increasingly important. In this paper, we analyze the limitations of the affinity propagation (AP) algorithm when clustering a randomly generated dataset. We then propose a post-processing method to improve the AP algorithm. The method uses the median of the input similarities as the shared preference value, and then applies a post-processing phase that combines merging and reassignment strategies to the results of the AP algorithm. We tested our method extensively and compared its performance with five other methods on several datasets drawn from the COG (Clusters of Orthologous Groups of proteins) database, SCOP, and the G-protein family. The number of clusters obtained for a given set of proteins approximates the correct number of clusters in that set. Moreover, in our experiments, the quality of the clusters as quantified by F-measure was better than that of the other methods (on average, 9% better than BlastClust, 33% better than TribeMCL, 34% better than CLUSS, 59% better than spectral clustering, and 41% better than AP).
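As an illustration of the first step described above, the following is a minimal sketch of standard affinity propagation with the median of the input similarities used as the shared preference. It is not the authors' implementation, and it does not include their merging and reassignment post-processing, whose details the abstract does not give. The function names, the damping factor, the iteration count, and the toy similarity (negative squared distance between numbers, standing in for a sequence similarity score) are all assumptions made for the example.

```python
from statistics import median


def median_preference(S):
    """Median of the off-diagonal input similarities (the shared preference)."""
    n = len(S)
    return median(S[i][k] for i in range(n) for k in range(n) if i != k)


def affinity_propagation(S, damping=0.5, iterations=200):
    """Plain affinity propagation on a square similarity matrix S (n >= 2).

    Returns one label per point: the index of the exemplar it is assigned to,
    or -1 for every point if no exemplar emerged.
    """
    n = len(S)
    pref = median_preference(S)
    # Work on a copy of S with the shared preference on the diagonal.
    S = [[pref if i == k else S[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new

        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [-1] * n
    labels = []
    for i in range(n):
        if i in exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda k: S[i][k]))
    return labels


# Toy data (an assumption for illustration): two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
S = [[-(a - b) ** 2 for b in points] for a in points]
labels = affinity_propagation(S)
print(median_preference(S), labels)
```

For protein sequences, the matrix `S` would instead hold pairwise sequence similarity scores. A post-processing phase such as the one the abstract describes would then operate on `labels`, merging clusters and reassigning sequences.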

Keywords: COG; Clustering; F-measure; affinity propagation; post-processing; protein sequence

Document Type: Research Article

Publication date: 2010-06-01

More about this publication?
  • Protein & Peptide Letters publishes short papers in all important aspects of protein and peptide research, including structural studies, recombinant expression, function, synthesis, enzymology, immunology, molecular modeling, drug design etc. Manuscripts must have a significant element of novelty, timeliness and urgency that merit rapid publication. Reports of crystallisation, and preliminary structure determinations of biologically important proteins are acceptable. Purely theoretical papers are also acceptable provided they provide new insight into the principles of protein/peptide structure and function.