Compound databases and project management with InstantJChem and JChem

presentation · 11 years ago
by Gert Thijs (Silicos)
Instant JChem

Silicos is a chemoinformatics-based biotechnology startup focusing on early-phase drug discovery. We have initiated several in-house drug discovery projects and also provide services in virtual screening and de novo design using our proprietary technologies, Spectrophores™ and Cosmos™. In this talk we would like to give some insight into the use of Instant JChem and JChem in our day-to-day research and project work.
To support our in-house development, we needed to set up a small but rapidly expanding compound registration system in which we could store both purchased and synthesized compounds. The first goal is to keep track of where compounds came from, how much of each we have, and in which state they are stored. More importantly, the system should also be easily extensible with biological data as the compounds are tested in the lab in the different projects. To facilitate maintenance of the database, we developed several small programs to upload new compounds and keep the data consistent.
For our virtual screening activities we created a large database of more than 6 million commercially available compounds, called Simosa. These compounds have all been processed in a standardized way to make them ready for deployment with our virtual screening tools on our cluster. In this setup we need easy access to the vendor data of a specific compound and a way to find structurally similar compounds. The main challenge here is to keep the database up to date with new vendor catalogs and to keep track of older and unavailable compounds, so that we can easily order compounds. Since the database is very large, it is important to provide the modelers with an interface where they can create subsets of compounds that can be handled more conveniently on their local laptops with Instant JChem.
To make these databases accessible, we combined the power of Instant JChem and JChem to create small command-line programs that can easily be called from our scripts to query the databases.