Drug research is often likened to searching for a needle in a haystack. Virtual screening is widely recognized as a valuable tool that effectively reduces the size of the 'haystack' by about one order of magnitude. In this presentation we propose a technique that can further improve the efficiency of the screening procedure.

We focus on topological pharmacophore similarity-based searching, where the only available information is a set of 2D structures of known actives. The pharmacophores of these structures are analysed by perceiving the pharmacophoric character of each individual atom. The resulting pharmacophore patterns are transformed into topological cross-correlation histograms. These histograms are molecular descriptors that represent the pharmacophoric character of structures in a mathematically tractable form. Proximity measures such as the Euclidean distance and the Tanimoto coefficient are applied to estimate the dissimilarity between two such descriptors. The canonical formulae of these measures are extended with weights and other parameters that bias their behavior when two compounds are compared. The parameters are optimized in an automated training process that uses a subset of the target library and a subset of the known active structures. The optimized measures are then passed on to an independent validation stage, which evaluates them by calculating the enrichment ratio achieved in the virtual screening process.
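The weighted measures and the enrichment ratio can be illustrated with a minimal sketch. The abstract does not give the parametrized formulae, so everything here is an assumption: one non-negative weight per histogram bin, one common way of generalizing the Euclidean distance and the Tanimoto coefficient, a compound scored by its smallest dissimilarity to any known active, and enrichment taken as the hit rate in the top-ranked fraction divided by the hit rate in the whole library. All function names are hypothetical.

```python
import math


def weighted_euclidean(a, b, w):
    """Euclidean distance with one (assumed) weight per histogram bin."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def weighted_tanimoto_dissimilarity(a, b, w):
    """1 - Tanimoto coefficient, with an (assumed) per-bin weight on each product."""
    ab = sum(wi * ai * bi for ai, bi, wi in zip(a, b, w))
    aa = sum(wi * ai * ai for ai, wi in zip(a, w))
    bb = sum(wi * bi * bi for bi, wi in zip(b, w))
    denom = aa + bb - ab
    if denom == 0:
        # Both weighted descriptors are empty: treat them as identical.
        return 0.0
    return 1.0 - ab / denom


def score_against_actives(descriptor, active_descriptors, w, dissimilarity):
    """Score a compound by its smallest dissimilarity to any known active."""
    return min(dissimilarity(descriptor, q, w) for q in active_descriptors)


def enrichment_ratio(scores, is_active, top_fraction):
    """Hit rate in the best-scoring fraction divided by the overall hit rate.

    Lower scores mean "more similar to the known actives".
    """
    n = len(scores)
    k = max(1, round(n * top_fraction))
    ranked = sorted(range(n), key=lambda i: scores[i])
    hits_in_top = sum(1 for i in ranked[:k] if is_active[i])
    total_hits = sum(1 for flag in is_active if flag)
    if total_hits == 0:
        raise ValueError("enrichment is undefined without any actives")
    return (hits_in_top / k) / (total_hits / n)
```

Under these assumptions, a training step would search for the weights `w` that maximize `enrichment_ratio` on the training subsets, and the validation stage would recompute the ratio on compounds that were not used for training.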

Optimized virtual screening can reduce the size of the 'haystack' by another order of magnitude (in some cases an even greater reduction is achieved), and it can also lead to scaffold hopping. The method is generic enough to be adapted to other molecular descriptors and metrics. The efficiency of the method and cross-validated results will be presented.

eCheminformatics 2003, November 10-15, 2003