
Unsupervised Classification of Chemical Compounds

Clustering chemical compounds of similar structure is important in the pharmaceutical industry. One way of describing the structure is the chemical 'fingerprint'. The fingerprint is a string of binary digits, and typical data sets consist of very large numbers of fingerprints; a suitable clustering procedure must take account of the properties of this method of coding, and must be able to handle large data sets. This paper describes the analysis of a set of fingerprint data. The analysis was based on an appropriate distance measure derived from the fingerprints, followed by metric scaling into a low-dimensional space. An approximation to metric scaling, suitable for very large data sets, was investigated. Cluster analysis using two programs, mclust and AutoClass-C, was carried out on the scaled data.
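
The first two stages of the pipeline described in the abstract (a distance derived from binary fingerprints, then metric scaling into a low-dimensional space) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the abstract does not name the distance measure, so the Tanimoto (Jaccard) distance is assumed here as a common choice for binary fingerprints, and the function names `tanimoto_distance` and `classical_mds` and the toy fingerprints are hypothetical.

```python
import numpy as np


def tanimoto_distance(fps):
    """Pairwise Tanimoto (Jaccard) distances between rows of a 0/1 array.

    Assumption: the paper's actual distance measure is not given in the
    abstract; Tanimoto is used here purely for illustration.
    """
    fps = np.asarray(fps, dtype=float)
    common = fps @ fps.T                       # bits set in both fingerprints
    counts = fps.sum(axis=1)                   # bits set in each fingerprint
    union = counts[:, None] + counts[None, :] - common
    # Two all-zero fingerprints have an empty union; treat them as identical.
    sim = np.divide(common, union, out=np.ones_like(common), where=union > 0)
    return 1.0 - sim


def classical_mds(dist, k=2):
    """Classical metric scaling of a distance matrix into k dimensions."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product matrix.
    b = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    top = np.clip(eigvals[order], 0.0, None)   # guard against negative values
    return eigvecs[:, order] * np.sqrt(top)


# Toy data (hypothetical): four 6-bit fingerprints forming two loose groups.
fingerprints = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
])

distances = tanimoto_distance(fingerprints)
coords = classical_mds(distances, k=2)
print(coords)
```

The rows of `coords` are the scaled points on which a clustering program would then be run; the paper used mclust and AutoClass-C for that step, which is not reproduced here. The sketch builds the full n-by-n distance matrix, which is impractical for very large numbers of fingerprints; that limitation is the motivation for the approximation to metric scaling that the abstract mentions.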

Keywords: Chemical fingerprint; Cluster analysis; Metric scaling; Rand index

Document Type: Original Article

DOI: http://dx.doi.org/10.1111/1467-9876.00146

Affiliations: University of Oxford, UK

Publication date: January 1, 1999
