Maximum Common Substructure Based Hierarchical Clustering
Clustering chemical structures is widely used in various phases of the drug discovery process. Applications range from clustering virtual hit sets of thousands of structures to clustering million-member compound libraries. Traditional clustering methods are based on similarity scores. These techniques are highly efficient from a purely computational point of view, but their results are often hard to interpret, even for experts.
A clustering technique based on the maximum common substructure (MCS) can produce a highly intuitive grouping of structures.