Performance Comparison of Clustering Algorithms on Scientific Publications

The rapid growth in the number of scientific papers in digital form has made document management more complex. Developing effective and efficient methods to sort and organize these documents is therefore crucial. Clustering, a data mining technique widely applied in various fields, may be used to address the issue. This paper presents a performance comparison of partitioning-based clustering algorithms, namely random clustering, k-means, x-means, and k-medoids, in an unsupervised classification of scientific publications based on topic similarity. RapidMiner is used to preprocess and analyze the data. The purity value and processing time of each algorithm are then investigated. The results show that k-means achieves the best purity value, although its run time is not the fastest. Random clustering, by contrast, offers the fastest processing time at the cost of the lowest purity value. None of the observed algorithms achieves both the best purity and the best processing time. This may be due to the complex set of parameters that affect the clustering results, including the type of data, the selected algorithm, the distance measure, and the preprocessing methods.
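
The two quantities compared in the abstract, purity and processing time, can be illustrated with a minimal sketch. It is not the authors' RapidMiner workflow. It assumes the standard definition of purity (the fraction of documents that belong to the majority class of their assigned cluster) and uses a seeded random assignment as a stand-in for the random clustering baseline. The function names, the toy topic labels, and the cluster count are hypothetical and chosen only for illustration.

```python
import random
import time
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of documents that fall in the majority class of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("purity is undefined for an empty collection")
    # Group the true labels by assigned cluster.
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster][label] += 1
    # For each cluster, count the documents in its most common class.
    majority_total = sum(max(counts.values()) for counts in members.values())
    return majority_total / len(cluster_ids)


def random_clustering(n_documents, n_clusters, seed=0):
    """Assign each document to a cluster uniformly at random (baseline only)."""
    rng = random.Random(seed)
    return [rng.randrange(n_clusters) for _ in range(n_documents)]


# Hypothetical toy data: six papers with made-up topic labels.
topics = ["power", "power", "control", "control", "control", "power"]

# A hand-written assignment, standing in for the output of any clustering algorithm.
clusters = [0, 0, 0, 1, 1, 1]
print("purity of example assignment:", purity(clusters, topics))

# Time the random baseline and score it with the same measure.
start = time.perf_counter()
baseline = random_clustering(len(topics), n_clusters=2)
elapsed = time.perf_counter() - start
print("purity of random baseline:", purity(baseline, topics))
print("random baseline run time (s):", elapsed)
```

Purity ranges from just above 0 to 1, and it tends to rise as the number of clusters grows, so comparisons of this kind are most meaningful when every algorithm is run with the same cluster count.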

Keywords: Clustering; RapidMiner; Text Mining

Document Type: Research Article

Affiliations: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia

Publication date: April 1, 2017

More about this publication?
  • ADVANCED SCIENCE LETTERS is an international peer-reviewed journal with very wide-ranging coverage. It consolidates research activities in all areas of (1) Physical Sciences, (2) Biological Sciences, (3) Mathematical Sciences, (4) Engineering, (5) Computer and Information Sciences, and (6) Geosciences, and publishes original short communications, full research papers, and timely brief (mini) reviews, with authors' photos and biographies, encompassing basic and applied research and current developments in educational aspects of these scientific areas.