Assessing the Predictive Power of Unsupervised Visualization Techniques to Improve the Identification of GPCR-Focused Compound Libraries

publication · 9 years ago
by Modest von Korff, Kurt Hilpert (Actelion)
Principal component analysis (PCA) and self-organizing maps (SOMs) were compared as methods for clustering and visualizing the chemical space of a large and diverse data set. The data set comprised about 3000 G-protein-coupled receptor (GPCR) ligands for about 130 receptors and 3000 non-GPCR ligands from the World Drug Index. The molecules were described with a topological pharmacophore point histogram descriptor and a chemical fingerprint descriptor. To assess the predictive power of the clustering, a leave-multiple-out cross-validation with k-nearest-neighbor classification was performed. Both the classification tests and the visualization showed the SOM method to be clearly superior. The SOM correctly divided the data set into two main clusters, one for the GPCR ligands and the other for the non-GPCR ligands. Our results suggest that a continuous GPCR-ligand space exists.
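The validation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each molecule has already been projected to 2D map coordinates (by PCA or a SOM) and uses synthetic points in place of real ligand data. The function names, the fold count, and the value of k are hypothetical choices made for illustration; the abstract does not state them.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_multiple_out_accuracy(data, k=3, n_folds=5, seed=0):
    """Hold out each fold in turn and classify it from the remaining points.

    The fold count and k are assumptions, not values from the publication.
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(data)


def make_toy_map(n_per_class=50, seed=1):
    """Synthetic 2D coordinates standing in for projected molecules."""
    rng = random.Random(seed)
    gpcr = [((rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)), "GPCR")
            for _ in range(n_per_class)]
    other = [((rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)), "non-GPCR")
             for _ in range(n_per_class)]
    return gpcr + other


toy = make_toy_map()
accuracy = leave_multiple_out_accuracy(toy)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Running the same routine on the coordinates produced by two different projections gives one accuracy figure per projection, which is the kind of comparison the abstract reports for PCA and SOM.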