Calculator Plugins (logP logD pKa etc...) Presentation

Automation of building reliable models

Posted on 2021-09-13

The volume and velocity of bioactivity data available in public and in-house sources represent an immense opportunity for novel compound design. The ever wider array of targets with labelled data calls for efficient ways to build a large number of individual models. The pace of data growth also makes it possible to reach higher accuracy by continuously re-training existing models. Automatic re-training maximizes the applicability domain and minimizes the risk of a drop in accuracy as a project expands into novel chemical series. Recognizing these requirements, we launched a project to develop an automated solution for model building that relies on ChemAxon chemical toolkits and the Smile Java library.

Validating predictive power and reliability is a key factor in machine learning. To estimate the prediction error, we implemented and tested the conformal prediction framework. Applicability domain calculations based on similarity in chemical space and in descriptor space were introduced to support the assessment of the predicted values. We will present a summary of descriptor selection, machine learning algorithms (RF, SVR) and hyperparameter optimization for a bioactivity data set covering more than 150 ChEMBL targets. The data sets in this pool vary in size (from hundreds to thousands) and cover a large spectrum of pharmaceutically relevant targets. Our results showed a median Pearson correlation above 0.8 for these targets, measured on the test sets. Inhibition of the hERG ion channel is one of the most important safety-related off-target effects. Such liabilities should be recognized and filtered out early in drug design. As a case study, we present detailed results on hERG model development.
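
To make the conformal prediction step mentioned above more concrete, here is a minimal sketch of split (inductive) conformal regression in plain Java. It is an illustration under stated assumptions, not the implementation used in the project. The class and method names are hypothetical, the nonconformity score is assumed to be the absolute residual on a held-out calibration set, and the numbers are made-up toy values rather than ChEMBL data. Any regression model (for example an RF or SVR model) could supply the point predictions.

```java
import java.util.Arrays;

// Hypothetical illustration of split (inductive) conformal regression.
// Assumption: nonconformity score = |actual - predicted| on a calibration set
// that was not used to train the underlying model.
final class SplitConformalSketch {

    // Returns the interval half-width for miscoverage level alpha (e.g. 0.2
    // for a nominal 80% interval). Uses the ceil((n + 1) * (1 - alpha))-th
    // smallest calibration residual; if that rank exceeds n, the calibration
    // set is too small and the interval is unbounded.
    public static double intervalHalfWidth(double[] calibActual,
                                           double[] calibPredicted,
                                           double alpha) {
        if (calibActual.length != calibPredicted.length || calibActual.length == 0) {
            throw new IllegalArgumentException("calibration arrays must be non-empty and equal in length");
        }
        if (alpha <= 0.0 || alpha >= 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1)");
        }
        int n = calibActual.length;
        double[] scores = new double[n];
        for (int i = 0; i < n; i++) {
            scores[i] = Math.abs(calibActual[i] - calibPredicted[i]);
        }
        Arrays.sort(scores);
        int rank = (int) Math.ceil((n + 1) * (1.0 - alpha));
        if (rank > n) {
            return Double.POSITIVE_INFINITY;
        }
        return scores[rank - 1];
    }

    // Symmetric prediction interval around a point prediction.
    public static double[] interval(double pointPrediction, double halfWidth) {
        return new double[] {pointPrediction - halfWidth, pointPrediction + halfWidth};
    }

    public static void main(String[] args) {
        // Toy calibration data (made up): measured vs. predicted activity.
        double[] actual = {1.0, 2.0, 3.0, 4.0};
        double[] predicted = {1.5, 2.25, 2.0, 4.125};

        double halfWidth = intervalHalfWidth(actual, predicted, 0.5);
        double[] range = interval(6.0, halfWidth);
        System.out.println("half-width: " + halfWidth);
        System.out.println("interval: " + Arrays.toString(range));
    }
}
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha on average. Continuous re-training matters for this guarantee, because calibration residuals collected on older chemical series may no longer be representative once a project moves into new ones.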
