11th International Conference on Chemical Structures

May 27 – 31, 2018 · Noordwijkerhout, The Netherlands

Finding answers from chemical space extremely fast

A. Tarcsay, G. Imre, A. Volford (ChemAxon, Budapest, Hungary)

The complex nature of chemical graphs offers drug designers an immense source of variability for tackling optimization challenges along the project pathway towards candidates. The difficulty lies in exploring chemical space, whether through the chemical intuition of medicinal chemists or through enabling technologies such as cheminformatics tools. Real and virtual chemical spaces span a broad range of compound counts and hold vast potential to be exploited. An especially valuable subgroup is the one where measured data exist and are stored, most commonly in relational databases. In our study both types were evaluated: a very large compound collection and a medium-sized one with extensive assay data. As a read-out we used the search time, that is, the cost associated with finding an answer to a chemical question. In the first use case, the aim was to suggest novel analogues of known drugs using the largest publicly available enumerated compound collection, GDB-13, which contains 977M unique entries. This collection was screened with an ultra-fast similarity search technique, using a subset of marketed drugs as queries; an elapsed search time of ~4 sec was measured consistently on a commercially available server (EC2, r3.8xlarge) using a standard 1k fingerprint. The top 100 most similar compounds were cross-filtered against the database of exemplified structures from patents (SureChEMBL DB) to fetch novel moieties with a higher tendency to lie in freedom-to-operate space (Fig. 1). In the second part, search performance on the entire data set of ChEMBL DB was measured with three search types (duplicate, similarity and substructure) and with joined queries. These joined queries represent complex questions asked of data warehouses in the pharmaceutical industry, where performance is a key indicator due to massive load. The aim is to provide realistic speed statistics measured with a chemical cartridge extending Oracle and with the new-generation engine running on PostgreSQL.
A significant speed-up was measured using the new search engine, especially on combined queries, where a 100x speed-up was achieved and the median search time was in the range of ~100 milliseconds, falling below the recognition time limit.
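The similarity screen described in the first use case can be illustrated with a minimal sketch. It assumes fingerprints are held as Python integers used as bitsets and that Tanimoto is the similarity metric; the abstract does not name the metric or the fingerprint type, and the function names `tanimoto` and `top_k_similar` are hypothetical. It shows only the ranking logic, not the engine that reaches ~4 sec on 977M entries.

```python
import heapq
from typing import Dict, List, Tuple


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two bit fingerprints stored as Python ints."""
    common = bin(a & b).count("1")   # bits set in both fingerprints
    union = bin(a | b).count("1")    # bits set in at least one fingerprint
    return common / union if union else 0.0


def top_k_similar(query: int, library: Dict[str, int], k: int = 100) -> List[Tuple[str, float]]:
    """Return the k library entries most similar to the query, best first.

    `library` maps a compound identifier to its fingerprint (assumed layout).
    """
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy 6-bit fingerprints; real ones would be much longer.
library = {"a": 0b001111, "b": 0b000111, "c": 0b110000}
hits = top_k_similar(0b001111, library, k=2)
```

A linear scan like this costs one pass over the whole collection per query, which is why search time is a meaningful read-out at the scale of GDB-13.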

Figure 1. Example drug and its novel analogues identified from GDB-13. Tversky dissimilarity >0 rules out substructure match in SureChEMBL.
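The reasoning behind the Tversky criterion in the caption can be sketched as follows. The sketch assumes an asymmetric Tversky index with weights alpha = 1 and beta = 0, and a fingerprint in which every bit set for a substructure is also set for any molecule containing it; neither the weights nor the fingerprint type is stated in the abstract, and which structure plays the query role is also an assumption. Under these assumptions the index equals 1 whenever all query bits are present in the target, so a dissimilarity above 0 means at least one query bit is missing and a substructure match is impossible.

```python
def tversky(query: int, target: int, alpha: float = 1.0, beta: float = 0.0) -> float:
    """Tversky index of two bit fingerprints stored as Python ints."""
    common = bin(query & target).count("1")        # bits shared by both
    only_query = bin(query & ~target).count("1")   # bits set only in the query
    only_target = bin(target & ~query).count("1")  # bits set only in the target
    denom = common + alpha * only_query + beta * only_target
    return common / denom if denom else 0.0


def tversky_dissimilarity(query: int, target: int) -> float:
    """1 - Tversky index with alpha=1, beta=0 (assumed weights)."""
    return 1.0 - tversky(query, target)


def substructure_ruled_out(query: int, target: int) -> bool:
    """True when a query bit is missing from the target, so no match is possible."""
    return tversky_dissimilarity(query, target) > 0.0


# Toy fingerprints: every bit of 0b0110 is present in 0b1110 but not in 0b1010.
possible = substructure_ruled_out(0b0110, 0b1110)
excluded = substructure_ruled_out(0b0110, 0b1010)
```

A dissimilarity of 0 only means a match cannot be excluded by the fingerprint; confirming it would still require an exact graph comparison.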

About the International Conference on Chemical Structures:

The 11th International Conference on Chemical Structures (ICCS) will take place in 2018. It will continue a well-established conference series that began in 1973 as a workshop on Computer Representation and Manipulation of Chemical Information sponsored by the NATO Advanced Study Institute and thereafter was held under its new name every third year starting in 1987. The 2018 conference will build on the experience of the past successful editions to offer a strong scientific program that covers all aspects of cheminformatics and molecular modeling, including, for example, structure-activity relationships, virtual screening and modeling metabolite networks. Participants discuss research as well as relevant technological and algorithm developments in handling and visualization of chemical structure data, workflows for complex cheminformatic analysis and machine learning. The conference fosters cooperation among organizations and researchers involved in the increasingly interwoven fields of cheminformatics and bioinformatics and combines in-depth technical presentations with ample opportunities for one-on-one discussions with the presenters.

The 11th edition will be held, one year out of phase with the intended triennial frequency, from 27 to 31 May 2018 at the beautiful Conference Center in Noordwijkerhout, The Netherlands.

The conference is jointly supported by:

  • Division of Chemical Information of the American Chemical Society (ACS)
  • Chemical Structure Association Trust (CSA Trust)
  • Division of Chemical Information and Computer Science of the Chemical Society of Japan (CSJ)
  • Chemistry-Information-Computer Division of the German Chemical Society (GDCh)
  • Royal Netherlands Chemical Society (KNCV)
  • Chemical Information and Computer Applications Group of the Royal Society of Chemistry (RSC)
  • Swiss Chemical Society (SCS)