Cheminfo Stories 2021 Virtual UGM | Building a Structured Toxicology Knowledge Base from Unstructured Data

Posted by Daniel Mucs on 13 November 2021

Information gathering during toxicological risk assessments is usually a time-consuming manual process, especially when it must be done for a large number of substances at once. For toxicologists to perform even an initial high-throughput hazard screening step, substance (chemical) information must be linked with toxicological (biological) data in a well-structured manner. This can be automated to a certain degree using workflow tools such as KNIME; however, it requires area-specific expertise, such as cheminformatics, and even then the chemical information can be difficult to handle on a workflow-by-workflow basis. Access to historical data generated using these workflows, including an audit trail of how the data was generated, is also a key requirement. Ideally, all toxicologically relevant data should be accessible and manageable via an easy-to-use front end, supported by a robust back end handling the structured and unstructured chemical and biological data. Here we present our story of how we embarked on the journey of building a toxicology knowledge base with ChemAxon technology, starting from unstructured data.