Efficiency of a hierarchical protocol for high throughput structure-based virtual screening on GRID5000 cluster grid

publication · 10 years ago
by Leo Ghemtio, Emmanuel Jeannot, Bernard Maigret (Nancy Université, LORIA)
Most modern computational techniques in drug discovery place heavy demands on computing resources. Grid computing offers a powerful alternative way of running computationally intensive applications. One field of the drug innovation process that can benefit greatly from grid resources is high-throughput virtual screening, in which huge chemical compound libraries are docked into known protein-binding sites. A computational grid combines heterogeneous, geographically dispersed computer resources from multiple administrative domains and applies them to a common task that requires a great number of processing cycles or the processing of large amounts of data. This study details a screening campaign, run on the GRID5000 cluster grid infrastructure, in which a subset of ~600,000 “drug-like” molecules extracted from the ZINC database was screened against three structures of the liver-X receptor β (LXRβ). A funnel strategy was used for that purpose, starting with a fast but simple shape-matching procedure and finishing with more complex molecular dynamics simulations. Of the ~91 million three-dimensional conformations generated at the beginning of the funnel, 45 putative hits remained after the intermediate filtering steps. GRID5000 is a highly reconfigurable, controllable, and monitorable experimental cluster grid that connects nine geographically distributed sites in France and features more than 3,200 processors and 5,700 cores. To hide the complexity of the grid system from the user, GRID5000 was accessed through the virtual screening manager for grid computing (VSM-G) platform, which is dedicated to in silico screening and designed to provide maximum computing power by using grid resources efficiently. The whole screening process required around 82 days (78 days of pre-processing and 3.6 days for the docking funnel itself) and used 3,144 nodes over nine sites.
The use of grid infrastructures and a hierarchical filtering protocol enables us to evaluate the binding capabilities of millions of compounds on several conformations of a given target and to propose, at low cost, the most promising compounds for in vitro tests.
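The funnel strategy described above can be illustrated with a minimal Python sketch. This is not the VSM-G implementation: the function `run_funnel`, the toy scoring functions, and the keep fractions are all hypothetical and stand in for the real shape-matching, docking, and molecular dynamics stages, whose details are not given here. The sketch only shows the general idea of ranking candidates with a cheap filter first and passing a shrinking subset to more expensive ones.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A stage is (name, scoring function, fraction of candidates to keep).
# All three elements are illustrative assumptions, not values from the study.
Stage = Tuple[str, Callable[[str], float], float]


def run_funnel(candidates: Sequence[str], stages: Iterable[Stage]) -> List[str]:
    """Apply each stage in order, keeping only the best-scoring fraction."""
    survivors = list(candidates)
    for name, score, keep_fraction in stages:
        # Higher score is treated as better in this toy example.
        ranked = sorted(survivors, key=score, reverse=True)
        n_keep = max(1, int(len(ranked) * keep_fraction))
        survivors = ranked[:n_keep]
        print(f"{name}: kept {len(survivors)} candidates")
    return survivors


def cheap_shape_score(mol: str) -> float:
    """Toy stand-in for a fast, coarse filter."""
    return float(len(mol))


def costly_refinement_score(mol: str) -> float:
    """Toy stand-in for a slower, more accurate filter."""
    return float(sum(ord(c) for c in mol) % 97)


library = [f"mol{i}" for i in range(1000)]
hits = run_funnel(
    library,
    [
        ("shape matching", cheap_shape_score, 0.10),
        ("refinement", costly_refinement_score, 0.05),
    ],
)
```

Because the expensive scoring function only sees the survivors of the cheap one, most of the compute cost is spent on a small fraction of the library, which is the point of a hierarchical protocol.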