Chemistry and the communication and sharing of chemical ideas are facilitated by the availability of a universally understood lingua franca - the chemical structure diagram.
The number of potential chemical structures is estimated (1) at 10^60, and smaller interesting subsets are either commercially available as “on demand big chemical spaces”, ranging from Mcule (1.4 × 10^8) to eMolecules (7.0 × 10^12), or created by pharma companies, ranging from Eli Lilly (10^11) to GSK (10^26) (2). Many of these compounds are virtual, and their structures, if needed, will be drawn algorithmically, but that leaves a huge number of structures that already have been, or will need to be, drawn by a human.
So what would the ideal chemical drawing tool look like? Rather than list all the features and functionality in a perfect application, this blog post looks through the lens of some typical users of structure drawing tools, coupled with the tasks and workflows where they need to draw structures, to see how an ideal tool could meet or even exceed their needs.
We have chosen four typical users: Marco, a medicinal chemist in a pharma company; Carla, a computational chemist in pharma; Rex, a registrar and cheminformatician, also in pharma; and Ali, an associate professor in academia.
Medicinal Chemistry
Marco, the medicinal chemist, lives and breathes chemical structures, helping them evolve through the DMTA cycle from initial molecular designs, through synthesis planning, reagent acquisition and synthesis, and on to bioassay and test results visualization and analysis.
During the design phase he wants to quickly sketch ideas for likely structural motifs and substituents, so speed and ready availability of customizable template structures and shortcuts make things easier. Markush enumeration can create large numbers of structures, and reusable, frequently used scaffolds can be readily decorated with substituents in the design phase. At the same time, instant access to relevant structure-based properties (e.g. LogP, pKa, H-bond donors, 3D structure) and clean links to spreadsheet and other visualization tools can help with filtering and sorting of promising candidate structures.
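To give a feel for what Markush enumeration does in Marco's design phase, here is a minimal sketch in plain Python. It assumes a scaffold written as a SMILES string with named placeholders such as {R1}; the function name, the scaffold, and the substituent lists are all hypothetical, and the expansion is simple text substitution rather than anything chemistry-aware.

```python
from itertools import product


def enumerate_markush(scaffold, r_groups):
    """Expand a scaffold template into every combination of substituents.

    scaffold -- SMILES template with named placeholders, e.g. "{R1}" (assumed convention)
    r_groups -- mapping of placeholder name -> list of substituent SMILES fragments
    """
    names = sorted(r_groups)
    structures = []
    # The Cartesian product visits every combination of one choice per R-group.
    for combo in product(*(r_groups[name] for name in names)):
        structures.append(scaffold.format(**dict(zip(names, combo))))
    return structures


# Hypothetical example: a para-disubstituted benzene with two variable positions.
scaffold = "c1cc({R1})ccc1{R2}"
r_groups = {
    "R1": ["F", "Cl", "OC"],      # fluoro, chloro, methoxy
    "R2": ["C(=O)O", "C#N"],      # carboxylic acid, nitrile
}

library = enumerate_markush(scaffold, r_groups)
for smiles in library:
    print(smiles)
```

Three choices at one position and two at the other give six structures, and the count multiplies with every additional R-group. That growth is why enumeration is normally paired with property-based filtering. A real drawing tool would also validate each enumerated structure, which this sketch does not attempt.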
In the planning part of the make stage, Marco might augment his chemical synthesis knowledge by searching in large repositories of published chemical reactions and by using retrosynthetic analysis tools to look for possible synthetic routes. Some of these commercially available systems have their own embedded chemical drawing tools, and rather than learning a new interface and set of menus, Marco would prefer to use his favorite tool to draw reagents and products and seamlessly transfer them to the third party search system with no loss of chemical intelligence. He has the same favorite tool preference when he searches consolidated chemical supplier databases for starting materials, catalysts, etc. While exploring related sets of compounds, Marco needs a rich set of substructure search tools to home in on likely building blocks and reagents.
With reagents and a synthetic route in place, Marco moves to the execution step of the make stage. This involves entering the reaction details into an electronic lab notebook (ELN). As with the search systems, commercial ELN systems may have their own built-in chemical drawing tools, and again he would be more comfortable using his preferred tool seamlessly in the ELN.
Once the reaction is complete and the products have been characterized (aided by comparing actual and predicted NMR and MS spectra), Marco wants a clean and seamless link (with no redrawing) to the corporate registration system for the structures (including generating a correct IUPAC name), and an inventory system for the samples and batches. He can then order bioassays for the registered compounds in the test phase, and await the results.
In the analyze phase, Marco uses a variety of structure-activity relationship (SAR) tools to investigate which structural classes and variants show promising activity, and which are to be avoided. Tabular views of live structures, scaffolds, and substituents coupled with assay results and other structure-derived properties such as DMPK characteristics can be explored and visualized using R-group decomposition and substructure searching. Both require a rich set of chemical query feature drawing tools to detect the most promising lead candidates.
Computational Chemistry
Carla, the computational chemist, works alongside Marco in the design and analyze phases, suggesting and validating possible structural motifs to explore (or avoid) and conducting in-depth SAR studies on bioassay and DMPK results. She shares Marco’s requirements for speed of drawing, with ready access to customizable shortcuts and templates, a rich collection of substructure search capabilities (e.g. generic atom and bond types, homology groups), and links to property calculators, predictors (e.g. pKa, aqueous solubility, logP), and 3D structure generators.
As she works with a number of commercial and in-house developed drug design packages (e.g. QSAR, ligand- and structure-based design, docking, virtual screening, and scaffold-hopping) she would prefer to use her favorite drawing tool in all of these, rather than having to learn several different GUIs.
And as she often needs to move large files of real or putative structures from one package to another, she needs a drawing tool that can cleanly export and import all the commonly used structure file formats (e.g. molfile, SDfile, CDX, SMILES/SMARTS, InChI/InChIKey) with no loss of chemical intelligence.
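As a small illustration of what moving structure files between packages involves, the sketch below splits SDfile text into records using only the Python standard library. It relies on two conventions of the SDfile format: each record ends with a line containing "$$$$", and each data item starts with a header line such as "> <ID>". The function names, compound IDs, and field values are made up for the example, and the connection table itself is passed through untouched rather than interpreted.

```python
def _parse_record(lines):
    """Split one SDfile record into its molfile block and its data fields."""
    molblock, fields = [], {}
    i = 0
    # The connection table runs up to and including the "M  END" line.
    while i < len(lines):
        molblock.append(lines[i])
        i += 1
        if molblock[-1].startswith("M  END"):
            break
    name = None
    for line in lines[i:]:
        if line.startswith(">"):
            # Data header such as "> <ID>": take the text between the angle brackets.
            start, end = line.find("<"), line.rfind(">")
            name = line[start + 1:end] if 0 <= start < end else ""
            fields[name] = []
        elif line.strip() == "":
            name = None  # a blank line closes the current data item
        elif name is not None:
            fields[name].append(line)
    return "\n".join(molblock), {k: "\n".join(v) for k, v in fields.items()}


def read_sdf_records(text):
    """Return a list of (molblock, fields) tuples, one per "$$$$"-terminated record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    return records


# Hand-written two-record sample; IDs and field values are invented.
sample = "\n".join([
    "methane",
    "  hand-written example",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <ID>",
    "CMPD-0001",
    "",
    "> <Project>",
    "demo",
    "",
    "$$$$",
    "water",
    "  hand-written example",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <ID>",
    "CMPD-0002",
    "",
    "$$$$",
])

records = read_sdf_records(sample)
for molblock, fields in records:
    print(molblock.splitlines()[0], fields)
```

The sketch keeps the molfile block as opaque text. The "chemical intelligence" Carla cares about (stereo flags, charges, query features) lives inside that block, so a faithful round trip depends on the drawing tool reading and writing it completely.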
Registration and cheminformatics
Rex, the registrar and cheminformatician, is responsible for maintaining the corporate registry, and his key priority is to ensure that chemists like Marco can accurately and easily draw all the types and classes of chemical structures of interest to his organization (hypothetical, virtual, synthesized, or acquired) for representation in the registry.
This includes drawing small molecules, peptides, polymers, organometallics, biomolecules, and partial or unknown structures. It also extends to structural variations such as salts, solvates, stereoisomers, tautomers/mesomers, addition compounds, mixtures, and alternates, which need to be accurately and consistently represented.
Much as Rex trusts his chemists to draw excellent structures, he expects the drawing tool to step in as needed to check drawings and flag errors (e.g. incorrect valence, unbalanced charges, non-adherence to corporate business rules) that the chemists need to correct. Rex (and the chemists) also rely on the drawing program to generate correct IUPAC names for their compounds, and any other required descriptors such as InChI or InChIKey.
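The kind of checking Rex has in mind can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular drawing tool works: the valence table covers only a handful of element and charge combinations, hydrogens are assumed to be implicit unless drawn, and the data layout and function name are hypothetical.

```python
# Simplified assumption: maximum total bond order per (element, formal charge).
MAX_VALENCE = {
    ("H", 0): 1,
    ("C", 0): 4,
    ("N", 0): 3, ("N", 1): 4,
    ("O", 0): 2, ("O", -1): 1,
    ("F", 0): 1, ("Cl", 0): 1, ("Cl", -1): 0,
    ("Na", 1): 0,
}


def check_structure(atoms, bonds, expected_charge=0):
    """Return a list of human-readable issues found in a drawn structure.

    atoms -- list of (element, formal_charge) tuples
    bonds -- list of (atom_index, atom_index, bond_order) tuples
    """
    issues = []
    bond_order_sum = [0] * len(atoms)
    for i, j, order in bonds:
        bond_order_sum[i] += order
        bond_order_sum[j] += order

    for idx, (element, charge) in enumerate(atoms):
        limit = MAX_VALENCE.get((element, charge))
        if limit is None:
            issues.append(f"atom {idx} ({element}, charge {charge:+d}): no valence rule available")
        elif bond_order_sum[idx] > limit:
            issues.append(
                f"atom {idx} ({element}): bond order sum {bond_order_sum[idx]} exceeds {limit}"
            )

    net_charge = sum(charge for _, charge in atoms)
    if net_charge != expected_charge:
        issues.append(f"net charge {net_charge:+d}, expected {expected_charge:+d}")
    return issues


# A carbon drawn with five single bonds should be flagged.
texas_carbon_issues = check_structure(
    atoms=[("C", 0)] * 6,
    bonds=[(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (0, 5, 1)],
)

# Ethanol heavy atoms (C-C-O) with implicit hydrogens should pass.
ethanol_issues = check_structure(
    atoms=[("C", 0), ("C", 0), ("O", 0)],
    bonds=[(0, 1, 1), (1, 2, 1)],
)

# An ammonium cation drawn without its counter-ion leaves the charge unbalanced.
ammonium_issues = check_structure(
    atoms=[("N", 1), ("H", 0), ("H", 0), ("H", 0), ("H", 0)],
    bonds=[(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)],
)

print(texas_carbon_issues, ethanol_issues, ammonium_issues, sep="\n")
```

Corporate business rules could be added as further checks that append to the same list of issues, which lets the tool report every problem in one pass instead of stopping at the first.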
Rex is often tasked with helping his colleagues in the patent department as they create patent filings, so he needs sophisticated substructure search capabilities to search the registry and locate structures falling within a patent claim. These capabilities include variable chain and ring sizes, alternative substituent positions, alternative functional groups and heteroatoms, not-groups, and nested R-groups.
Education and publishing
Ali, the associate chemistry professor, needs an ideal drawing tool to quickly and accurately prepare a range of teaching materials such as presentations, tests, quizzes, and hand-outs. She shares many of Marco, Carla, and Rex’s requirements, but has additional pedagogical structure needs as she describes molecular shape, orbitals, charges, lone pairs and radicals, along with electron transfer arrows to illustrate and explain reaction mechanisms to her students. Coloring and other structure enhancements such as perspective views and adjustable bond length and thickness are also useful in teaching.
In addition to teaching, Ali writes publications describing her research, and needs a drawing tool that can create correct and esthetically pleasing structures, reactions, and reaction schemes of “publication quality.” This in turn requires adherence to stored journal styles, and fine control over orientation and layout. Equally important is complete and clean integration with other publication tools, especially MS Office. Robust cut/paste of structures between documents with complete retention of chemical intelligence for live editing is crucial, as is compatibility with other drawing tools’ formats.
Summary
If you asked a chemist to describe their ideal chemical drawing tool, you might get an answer that included some of the assorted features and functions described above. Their selection would likely be influenced by the nature of their job and how it involves chemical structures, and by the immediate tasks and workflow processes that require them to draw structures. We have cast a wide net in this post, with a representative variety of users, workflows, and tasks, to envisage an ideal chemical drawing application. How close does this vision come to yours?
(1) Acc. Chem. Res. 2015, 48, 3, 722–730
(2) Current Opinion in Structural Biology 2023, 80:102578
What is your ideal chemical drawing tool?
We are eager to understand what makes a drawing tool really stand out. Let us know by filling out this four-question anonymous survey!