Conclusion
Attendees and Social Events
Overview of ChemAxon by Alex Drijver
The ChemAxon Product Profile
Biomolecule Toolkit
Plexus Suite
Compliance Checker
ChemCurator
Markush technology
Marvin JS
Consultancy
Partner Session
Boehringer Ingelheim
Bristol-Myers Squibb
GlaxoSmithKline
Genentech
DuPont
Novartis
Merck
Cubist
Flately Discovery
Kyoto Constella Technologies
Chemalytics
Pearson Education


Conclusion

I will echo the conclusions of Wendy Warr in her report of the Budapest UGM, specifically that ChemAxon has emerged as the leading provider of chemical information software, both toolkits for programmers and programs for end-users. It achieved this pre-eminence with solid, reliable toolkits and applications, coupled with innovations geared to solving clients’ problems. This UGM also showed ChemAxon’s commitment to staying ahead of the changes in customers’ software and research environments by developing Plexus Suite, the web platform for intra- or inter-company collaborative research, and Marvin JS, which puts chemical drawing and searching capability on any device independent of operating system. Of course, these new developments build on the support of excellent toolkits; dedicated consultancy staff; and applications (too numerous to mention) that check chemical structures for errors or legal entanglements, or that find chemical structures embedded in documents written using the English alphabet or Chinese or Japanese scripts. ChemAxon’s Markush technology has found a place not only in patent preparation and searching, but also in applications that help medicinal chemistry teams follow their series and combinatorial chemists design their libraries. Although they were not discussed at this UGM, ChemAxon is noted for its excellent tools involving chemical reactions, including metabolic transformations, and its reliable chemical property predictions.

It was exciting to see clients or partners integrate ChemAxon technology into applications that were not anticipated in the original design of the tool. For example, Document to Structure and associated ChemAxon tools add chemical structure searching into natural language processing systems; Markush tools provide scientists with an overview of the chemical series investigated for a project; and Document to Database is used in several in-house systems to organize knowledge stored in documents of many types.

Attendees and Social Events

Seventy-seven non-ChemAxon people registered for the meeting. They came from 11 major pharmaceutical or chemical companies (often with more than one person from a company), 15 vendors, and 12 small biotech companies. To encourage customer feedback, the 26 ChemAxon attendees included all those responsible for a product.

The UGM was held in the Royal Sonesta Boston, located on the Charles River in Cambridge. Before the meeting there was a mixer on the patio overlooking the river and the lights of the city. After the first day the group adjourned to Flat Top Johnny’s, a trendy pool hall in 1 Kendall Square, which is a center for high-tech firms.

Overview of ChemAxon by Alex Drijver

(Presentation in our Library)

To open the UGM Alex Drijver presented the vision of ChemAxon entitled “From the Back End to the Mobile Device”: “Taking our world leading chemistry representation coupled with its enterprise grade platform (robust, scalable, proven) and delivering solutions to the scientist in a modern, flexible, configurable, intuitive user interface.”

Key elements of this vision are the best chemistry representation coupled with the industry-leading back end; adapting ChemAxon’s solutions to the changing needs of its customers; providing the infrastructure to enter data where it is generated including on a mobile device; and supporting the needs of collaboration between companies. This represents an evolution in the mindset of ChemAxon from toolkits to solutions. Alex emphasized that enhancements to ChemAxon capabilities will be guided by the needs of groups of customers.

Coupled with this vision of providing solutions is the recognition of the many markets that are served by ChemAxon technology: major life science, agricultural, fine chemical, and petrochemical companies; biotechs and academic institutes; CROs and service companies; publishing, databases, and learning; and integration partners.

By committing 30% of revenue to new products, ChemAxon has produced exciting developments to support this vision: Marvin JS, a chemical drawing browser component; the Plexus Suite, a comprehensive web-based chemical spreadsheet and scientist user interface; Document to Structure/Document to Database, which supports searching for and indexing structures in disparate or unstructured data sources; a Biologics editor and methods to standardize, search, index, and register biologics; enhancements to the Markush technology with a focus on extraction of Markush structures from documents; Compliance Checker to check for any legal entanglements on a proposed compound; support for open innovation with a full turnkey solution; and ChemCurator for IP analysis involving extracting, visualizing, and searching chemical information from patents.

Alex summarized his vision with a quote from a customer “looking to the future, I don’t know anyone else who I would trust my chemistry with … other than with ChemAxon”.

The ChemAxon Product Profile

(Presentation in our Library)

A set of talks further elaborated the vision for the evolution of the ChemAxon product profile. Doug Drake provided the overview, reminding attendees that the vision of ChemAxon is to “enable scientists to manage their chemical data via a modern and cost effective suite of informatics tools and applications that have been developed together with customers and partners.” Of the 140 staff, 75% are in product development. To support the changing needs of customers ChemAxon releases new products and three product updates every year. These new capabilities are designed in partnership with users.

ChemAxon sees the change in the industry landscape from big company silos to multi-organization collaborations. There is also a need for modular, configurable, powerful visualization; more facile integration with third party tools and databases; and shifting technology away from Java and Oracle.

A conceptual map of how ChemAxon products are used is shown in Table 1. Note that many of the products serve more than one function: the Plexus Suite encompasses all functions. Another example is Instant JChem: while it is designed for storage of chemical structures and associated data, it also provides tools for data analysis and reporting. On the other hand, Structure Checker/Standardizer is used only during storage and Compliance Checker only during analysis. Subsequent talks elaborated on how the products have evolved to better serve their various functions.



Table 1: ChemAxon Portfolio

Data creation, registration, and evaluation

András Strácz reviewed the current and new ChemAxon capabilities for data creation, registration, and evaluation. Drawing chemical and query structures and chemical reactions is a key element of data creation. The newly released Marvin JS provides a chemistry-aware chemical editor. It has enhanced support for R-group definition including homology groups and attachment points. It also supports the entry of chemical reactions with manual or automatic reaction mapping, electron flow, and lone pairs. Enhancements to the definition of query structures include the ability to specify atom or bond query properties. Structures can be entered into Marvin JS via the import dialog, by pasting from MarvinSketch or ChemDraw, or by simply pasting a structure file, InChI, CAS number, name, or SMILES directly onto the canvas. Lastly, ChemAxon introduced a full Markush editor that will be of use as a desktop application for combinatorial chemistry. The biomolecule editor in development will support the drawing and registration of complex biomolecules.

András then described how one can load data into the system. The three copy functions formerly in JChem for Excel have been reduced to one button that is aware of the type of data to be copied. It is simple to import data and structures from JChem for Excel or MarvinView into Instant JChem. The Document to Structure suite, which can be opened in any ChemAxon application, recognizes in documents the chemical structures identified by a name, InChI, SMILES, or corporate ID. The chemical recognition capabilities are extended with optical character recognition with error handling designed for chemical names and optical structure recognition. Chinese and Japanese Name to Structure are recent additions to the capabilities of Document to Structure. The application also processes Markush structures; the processing can be automatic or user-supervised.
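To make the idea of recognizing identifiers in free text concrete, here is a minimal sketch in Python. It is not how Document to Structure works internally, which the talk did not describe. It only shows the simplest case of spotting InChI strings, which have a fixed prefix, and corporate IDs. The `CMP-` followed by six digits format is a hypothetical corporate ID convention invented for this illustration.

```python
import re

# InChI strings start with "InChI=1S/" (standard) or "InChI=1/" (non-standard).
INCHI_PATTERN = re.compile(r"InChI=1S?/[^\s\"'<>]+")

# Hypothetical corporate ID format (an assumption for this sketch): CMP-123456.
CORPORATE_ID_PATTERN = re.compile(r"\bCMP-\d{6}\b")


def find_identifiers(text):
    """Return (kind, value) pairs in the order they appear in the text."""
    hits = []
    patterns = (("inchi", INCHI_PATTERN), ("corporate_id", CORPORATE_ID_PATTERN))
    for kind, pattern in patterns:
        for match in pattern.finditer(text):
            # Drop sentence punctuation that may trail the identifier.
            value = match.group().rstrip(".,;")
            hits.append((match.start(), kind, value))
    hits.sort()
    return [(kind, value) for _, kind, value in hits]


if __name__ == "__main__":
    sample = ("Ethanol, InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3, "
              "was registered as CMP-000123.")
    for kind, value in find_identifiers(sample):
        print(kind, value)
```

Names and SMILES are much harder to recognize than this, since they have no fixed prefix, which is presumably why a dedicated tool is needed for them.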

Data storage

Petr Hamernik then described how chemical structures are corrected and stored in JChem Base and JChem Cartridge. Structure Checker/Standardizer ensures that the structures entered represent a possible chemical compound and that the drawing conforms to pre-established rules or conventions. Special attention is paid to tautomer handling. These capabilities are also part of the Registration system. Registration can also be integrated with ELNs. There are also enhancements to Compliance Checker, an application that checks if a compound is covered by any regulatory documents.

The new application, Plexus Suite, is integrated with Instant JChem, which serves as an administration tool and provides an interface to JChem Base and Cartridge and to the Registration package, which in turn can reach Structure Checker/Standardizer. By integrating the Java and .NET APIs and web services, Plexus operates on all operating systems, accesses all ChemAxon databases, and provides the same results to the user, independent of the hardware used. Support for the PostgreSQL cartridge will be released soon.

Development of Instant JChem also continues. It can now access Microsoft SQL Server; administrator tools have been enhanced; and query features are now selectable and configurable. Row-level security has also been added. There are plans to enhance the user experience by reducing the number of clicks necessary for some common tasks and to provide better support for developers.

Analysis and Refinement

Miklós Szabó reviewed the ChemAxon tools for Analysis and Refinement. In addition to company customers, these tools are used in the Lilly Open Innovation in Drug Discovery, the European Lead Factory, and AstraZeneca Open Innovation. In particular, these three initiatives use ChemAxon tools for collaborative enumeration, filtering, novelty checking, and refinement of chemical libraries. The library analysis involves measuring the overlap with known compounds, for example those at […]. Reports can be delivered as MS Office documents, on the web portal, as files such as sdf or pdf, by email, or via SharePoint. A new water solubility plugin is available to use as a filter. A key capability for collaboration is the support of user privileges and roles.

In contrast to the design of libraries discussed in the previous paragraph, medicinal chemistry projects are typically iterative, with design cycles initiated when new biological test results are available. Again the structure filters, JChem for Excel, Instant JChem, Compliance Checker, and sometimes Metabolizer are useful. To provide more informative graphics, connections of Instant JChem to Viz and Spotfire visualization programs are now available. Text searching in Instant JChem now results in the query being highlighted in the hits. Also new in Instant JChem is the capability to sort or query calculated fields. In JChem for Office with a single click one can prepare an SAR table. As in Instant JChem, structure and property filtering in Excel are supported. Medicinal chemistry projects often use Metabolizer to prioritize compounds for synthesis. Metabolizer not only predicts sensitive sites, but also predicts metabolic transformations and detects possible metabolic pathways that might involve a compound. Lastly, medicinal chemistry design often considers the 3D properties of the molecules. For this purpose Screen3D is highly effective.

Reporting and Sharing Information

Aurora Costache discussed reporting and sharing of information. Information to be shared might be internal reports, scientific publications, patents, global data, knowledge, discussions, and collaboration.

MarvinSketch is useful for molecule and reaction reporting. Particularly relevant here: abbreviated group labels can be formatted, and display of M/P flags for atropisomers will soon be available (released in version 14.10.20).

Instant JChem provides support for global data management, collaboration, and data contained in reports and patents. Reports can be in the form of templates for the database, or exported .pdf or Microsoft Office files. Instant JChem now supports enhanced charting (for example, gathering data from two related tables), a palette of widgets to aid design of templates for data display, and rich text and HTML support. New usability features include filtering and sorting in project explorer, user-friendly names for lists and queries, and sorting and querying in calculated fields. In addition, it will provide the ability to query time periods, the ability for non-administrator users to create calculated fields, and new tools for administrators to manage test/production environments. When data is exported to Excel it will keep the formatting from Instant JChem.

JChem for Office provides support for knowledge sharing: accessing data present in, and creating, reports, scientific publications, and patents. One can populate a JChem for Excel file by searching a JChem database or providing a list of IDs. Data selected for importing from Instant JChem need not include chemical structures. The JChem for Excel capability to convert ISIS, Symyx, ChemDraw, and Accord Excel files has been sped up considerably. Structures in a JChem for Excel file can then be analyzed in an R-group table. One can save the file in a format that can be seen by Excel users who do not have JChem for Excel.

Support for publishing is provided by Marvin, Naming, and JChem for Office. Structures can be saved as images, IUPAC or common names, or structure drawings in journal-style formats.

Support for patent preparation is provided by Marvin and Naming. In particular, R-Group Decomposition leads to Markush structures and, for the exemplified structures, Structure to Name provides IUPAC names. The resulting patent document can be stored as .sdf or .mrv files or in a database.

Instant JChem supports sharing in large organizations by providing a URL to results or sharing forms to view database data. On the other hand, JChem for SharePoint provides chemical structures in lists, blogs, discussion boards, wikis, and live sessions shared by collaborators. Within JChem for SharePoint one can do a chemically aware search of databases, file shares, and report repositories. The Plexus Suite and JChem for SharePoint support global data management, knowledge sharing, discussions, and collaboration.

The Biomolecule Toolkit

(Presentation in our Library)

Roland Knispel reported on ChemAxon’s Biomolecule Toolkit, currently in development, which extends the JChem platform to biologics. It adds registration, search, and property predictions of complex biological molecules such as oligonucleotides, proteins, antibodies, and antibody-drug conjugates, including those that contain unnatural and chemically modified components. The toolkit provides a standardized representation of such molecules, supporting an all-atom description yet preserving sequence-based information.

A browser-based client uses the HELM editor and Marvin to define monomers and start the registration process. It recognizes sequence, FASTA, any format recognized by Marvin, and HELM/XHELM formats. The definition, editing, and rendering of biomolecules are facilitated by embedding the open source OpenHELM components. Structures can be viewed as full 3D, all atom, schema, or sequence. Sequences can contain unnatural residues and modifications. Schema provide for custom nonstandard representation. A canonical representation of the structures aids identifying duplicates that had been represented differently. The application also allows one to enumerate libraries of a starting sequence using rule-based sequence replacements.
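The rule-based enumeration mentioned above can be pictured with a small sketch. This is an illustration only, not the toolkit's API or algorithm. It assumes a sequence is a plain list of monomer names and a rule maps a monomer to its allowed replacements, and it generates every variant that differs from the starting sequence at exactly one position. The function name and the rule format are hypothetical.

```python
def enumerate_variants(sequence, rules):
    """Yield copies of `sequence` with exactly one rule-based replacement.

    sequence: list of monomer names, e.g. ["A", "C", "D", "K"]
    rules:    dict mapping a monomer to a list of allowed replacements
    """
    for index, monomer in enumerate(sequence):
        for replacement in rules.get(monomer, []):
            yield sequence[:index] + [replacement] + sequence[index + 1:]


if __name__ == "__main__":
    start = ["A", "C", "D", "K"]
    # "Orn" (ornithine) stands in for a non-natural residue.
    rules = {"D": ["E"], "K": ["R", "Orn"]}
    for variant in enumerate_variants(start, rules):
        print("-".join(variant))
```

Representing monomers as names rather than single characters leaves room for residues that have no one-letter code.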

The biomolecule registration database is linked to a JChem Base monomer database. The browser client also supports database querying for subsequences, chemical modifications, and non-natural residues, as well as substructures. The biomolecule toolkit is integrated with ChemAxon Registration, JChem Base, Plexus Suite, Instant JChem, and JChem for Office. The toolkit is available for early phase testing as a SOAP or REST web service API.

Plexus Suite


Plexus Suite Overview

Miklós Szabó introduced Plexus Suite, the ChemAxon Collaborative Research Platform. The design is based on interviews with customers as to the problems they have with ChemAxon and other software. They identified the following: that current apps are too complex and non-intuitive; it is difficult to access data from multiple sources; project data included in emails is lost; collaboration tools are lacking; data sharing is difficult; ELNs are difficult to search and maintain; and access to 3D tools is often limited. With these problems in mind, the goal of Plexus is to provide a simple and intuitive user interface with context sensitive menus and business and feature toggles.

Although cheminformatics is “solved”, it needs to be integrated into the user’s workflow. For example, medicinal chemistry projects typically involve a cycle of chemical synthesis, biological testing, analysis of the results, and enumeration and filtering of suggested follow-up compounds. Accordingly, the Plexus Suite modules include Plexus Design (enumerate, calculate, cluster); Plexus Registration, to register compounds and query those already registered; Plexus Analysis (to be released in early 2015; charting, pivot and aggregate, SAR); and Plexus Connect, which will allow one to view and query relational data.

Plexus Suite supports collaboration with role-based login and access to forms, row-level security, and the ability to share queries and lists. Integration with Microsoft Office is part of the design. (Presentation in our Library)

Plexus Connect

David Deng continued the discussion of Plexus Suite with more focus on Plexus Connect. The user logs in using a simple web form. From there one can search databases, reuse a saved query, etc., making it ideal for end-users. Instant JChem is the administrative tool for Plexus Suite.

Daniel Bonniot continued the Plexus Suite presentation by describing the relationship of Document to Database to Plexus Suite and how Plexus expands the utility of Document to Database. As the name implies, Document to Database processes documents of various types in various locations and formats and creates a database of the chemical structures and the context in which each structure is found. This database typically has forms for viewing the data and built-in security. Plexus Connect uses these forms and security to present the data and search capabilities to the end-user. (Presentation in our Library)

Plexus Design

Ágnes Peragovics continued the Plexus Suite presentation with more focus on Plexus Design. Designed molecules can be enumerated from a Markush structure or from a reaction. For Markush the user controls how many molecules will be enumerated. In reaction enumeration, the user can specify a database to search for appropriate starting materials or search for reactions to use with known reactant types. Plexus Design can be integrated with StarDrop or LiveDesign to further filter the designed molecules. (Presentation in our Library)
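As a rough picture of capped Markush enumeration, the sketch below treats the scaffold as a text template with named R-group slots and stops after a user-chosen number of molecules. It is a toy model under stated assumptions. Pasting SMILES fragments into a string works only for simple substituents like the ones shown, and the function names are hypothetical, not part of Plexus Design.

```python
from itertools import islice, product
from math import prod


def library_size(r_groups):
    """Number of molecules the Markush represents, computed without enumerating."""
    return prod(len(options) for options in r_groups.values())


def enumerate_markush(scaffold, r_groups, limit):
    """Return at most `limit` enumerated structures as SMILES-like strings."""
    positions = sorted(r_groups)
    combinations = product(*(r_groups[p] for p in positions))
    return [
        scaffold.format(**dict(zip(positions, combo)))
        for combo in islice(combinations, limit)
    ]


if __name__ == "__main__":
    # Ortho-disubstituted benzene with two variable positions (illustrative only).
    scaffold = "{R1}c1ccccc1{R2}"
    r_groups = {"R1": ["C", "CC", "F"], "R2": ["O", "N"]}
    print(library_size(r_groups))
    print(enumerate_markush(scaffold, r_groups, limit=4))
```

Because `itertools.product` is lazy, the cap is applied before any structure is built, so a very large Markush costs no more than the requested number of molecules.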

Plexus Analysis

She then discussed Plexus Analysis, still under development and to be launched in 2015. It provides property viewers for a single molecule to show how it compares with the properties of all the molecules that have been selected or enumerated. Various data analysis techniques will be developed in order to provide abilities for structure and activity relationship studies. (Presentation in our Library)

Plexus Registration

David Deng then discussed Plexus Registration. It is a key component of two workflows: analysis/design, enumeration, and synthesis; and biology, analysis/design, and filtering.

He then recapped the Plexus Suite presentations by pointing out that Plexus accesses great cheminformatics, extensive toolkits, and unique areas of ChemAxon expertise such as Markush and Asian language support for chemical names. It does this to extend chemistry to a new platform, one that needs no local installation and is independent of operating systems and Java. In addition, it can be used on other platforms such as touch devices. Because it is web-based, Plexus is easy to integrate with third party software and supports easy information sharing.

Plexus Suite development will continue with further Modules being released in 2015 and project management and synthetic chemistry farther down the pipeline. The priorities for development depend on user comments. (Presentation in our Library)

Compliance Checker

(Presentation in our Library)

Compliance Checker was developed together with the Pistoia Alliance to identify compounds that are subject to legislative rules concerning controlled substances. These may be illegal drugs or certain classes of prescription medications. Local, national, and international laws cover the production, import/export, supply, use, and possession of controlled substances. National laws of different countries often differ in their restrictions. Compliance Checker was designed to robustly support compliance regardless of geographic location while maintaining flexibility to adapt to changing legislation.

It is built on ChemAxon’s JChem Base using an application marketed in Japan since 2006 and used there by more than thirty companies. It currently covers regulations from North America, India, Japan, and most of the European countries. Compliance Checker performs a series of superstructure searches to identify controlled substances. These searches are finely tuned to include as hits those specified as “Any ester or ether of …” as well as “and not…” It can be used beyond controlled substances, for example to detect toxic or dangerous substances or to detect those that are subject to environmental protection laws.

The Compliance Checker Server can be accessed via a web client for a quick or batch check, a Windows client for batch processing of large numbers of compounds, a SOAP interface for access from external systems such as ELNs or Registration, and the command line for access to batch processing and incorporation into Pipeline Pilot or KNIME. All access the same check engine.


ChemCurator

(Presentation in our Library)

Árpád Figyelmesi and Daniel Bonniot de Ruisselet introduced ChemCurator, an application for chemical document curation and management. Knowing the relevant patent literature is an essential component of successful drug discovery. However, existing databases are often incomplete or of questionable quality; manual processing of patents of interest is slow and requires considerable expertise whereas automatic processing has yielded questionable results; and users want to be able to visualize and analyze the results of a patent search.

To solve these problems, ChemAxon has developed a number of data extraction tools. Name to Structure now recognizes names in English, Chinese, and Japanese. The names can be common, drug, or IUPAC. Japanese was added in 2014, following English in 2008 and Chinese in 2013. Comparing the overlap between the English and Chinese patents of the same compounds validated the Chinese Name to Structure.

The process starts with annotation of files in XML, PDF, or HTML format or direct import from Google Patents or IFI Claims. Optical structure recognition with third party tools (CLiDE or OSRA) is also supported. Markush structures and specific structures can be extracted from patents, journal articles, and company reports. The built-in Markush Editor and Markush validation are key to the success of extracting Markush structures from patents.

ChemCurator can exchange data with other ChemAxon products via an Integration Server using Instant JChem (IJC) schema for storing the data. The extracted chemical information is now available from Instant JChem and it is also accessible from Plexus. Because ChemCurator uses standard file formats it easily complements third party tools.

Markush technology

(Presentation in our Library)

Árpád Figyelmesi elaborated on the expanded capabilities for representing Markush structures. Already in place was the ability to handle R-groups, atom and bond lists, position variations, link nodes, repeating units, and homology groups. These technologies are used in Markush substructure, full structure, and duplicate searching; in Markush enumeration (random, subset, or related to a query); in R-group decomposition; and in Structure Checker. ChemAxon has recently added the ability to describe R-group bridging, for example the possibility that two R-groups could link together to form a ring.

A Markush editor was recently developed. It shows a tree view of the Markush; the scaffold and R-group definitions; and a preview of enumeration. Also new is the first version of Markush Composer that generates a Markush from a set of compounds. ChemAxon has also developed a tool to measure the overlap between two patents. It shows the percentage of overlap and the Overlapping Markush—all of this with no enumeration and no limitation on the number of compounds represented by a Markush. Coming soon will be Markush standardization, similarity search, and visualization. There are also plans to enhance the representation to include such definitions as “R1 contains at least one nitrogen atom”.
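Why overlap can be measured without enumeration is easiest to see in a deliberately simplified model. The talk did not describe ChemAxon's actual algorithm, so the sketch below assumes two Markush structures that share the same scaffold and the same R-group positions, with each position defined by a plain set of substituent labels. Under those assumptions the number of shared compounds is the product of the per-position intersection sizes. All names are hypothetical.

```python
from math import prod


def markush_size(r_groups):
    """Count the compounds represented, as a product of option counts."""
    return prod(len(options) for options in r_groups.values())


def overlap_size(markush_a, markush_b):
    """Count compounds covered by both Markush structures (same scaffold assumed)."""
    if markush_a.keys() != markush_b.keys():
        raise ValueError("this toy model needs identical R-group positions")
    return prod(len(markush_a[pos] & markush_b[pos]) for pos in markush_a)


def overlapping_markush(markush_a, markush_b):
    """Return the R-group definitions common to both structures."""
    return {pos: markush_a[pos] & markush_b[pos] for pos in markush_a}


if __name__ == "__main__":
    patent_a = {"R1": {"C", "CC", "F"}, "R2": {"O", "N"}}
    patent_b = {"R1": {"C", "F", "Cl"}, "R2": {"O", "N", "S"}}
    shared = overlap_size(patent_a, patent_b)
    print(shared, markush_size(patent_a), markush_size(patent_b))
    print(round(100 * shared / markush_size(patent_a), 1), "percent of patent A")
```

The cost depends on the number of R-group options, not on the number of compounds, which is what makes very large Markush structures tractable. Real patents rarely share an identical scaffold, so a production tool has to solve a much harder matching problem.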

Marvin JS

(Presentation in our Library)

Efi Hoffmann discussed Marvin JS, the result of the major effort to create a fast, smart, easy-to-use chemical drawing web component. It is part of ChemAxon’s commitment to balancing usability and power for creating chemistry on the web. She reminded the group that Marvin started as a Java application, then was incorporated into .NET as Marvin Beans as well as the Marvin Applet. Marvin JS is not a simple copy/paste of the features of Marvin. Rather, the redesign seeks to capitalize on the special power of JavaScript, in particular its adaptability for touch devices. The goal is to design Marvin JS so that it is fast, intuitive, and incorporates great chemistry, which is kept on a server. Marvin JS includes basic chemical drawing, editing, and IO; drawing of reactions including full support for drawing mechanisms and electron flow arrows; support for templates, abbreviated groups, query and Markush structures; and basic 3D viewing. One can drag and drop .mrv or .cdx structures into the drawing panel. Reaction mapping can be performed automatically or manually. In the 3D viewer, one can rotate, zoom, and clean. Clean is done on the server.


Consultancy

(Presentation in our Library)

Erin Bolstad described several consultancy projects, some simple and some more complex. The consulting team includes consultants and application scientists in the USA and Hungary, dedicated consulting developers in the Czech Republic and Argentina, plus help from the more than 100 ChemAxon developers. Their expertise goes beyond customization of ChemAxon software and integration of third party software to project management, solutions/packaging, software design and development, and toolkit development. Examples of completed projects include extensive support for migration to ChemAxon software: database migration, customized development, form and data-access design, and customized training. Other projects include customized developer and end-user training, customized thin clients, and integrating existing and in-house technologies.

For DuPont they implemented a Document to Database system. It trawls their stored documents and ELN for information. They designed a web application that supports complex queries and links back to the relevant ELN page.

The consultancy group updated the existing Novartis reaction database. This involved migrating from the legacy ISIS database and incorporating a live feed from the CambridgeSoft ELN. The resulting AJAX-style web application provides query and filtering. There is also a web services interface.

For GSK the group added the ability to view ChemAxon data and structures in Spotfire. It is planned that this will develop into a marketed product.

The group also developed a Customized Product Incubator, targeting small to medium-sized companies. It provides a simple compound registration system within Instant JChem. For chemistry registration it handles the usual complexities of batches and samples. To assist handling of biological assay results it provides aggregation at the database level.

The group also provides large-scale project management for global pharma. For example, the group was involved at GSK for more than three years, during which they managed the roll-out of Instant JChem as a global reporting tool, assisted and trained GSK personnel on the customization of Instant JChem, and created additional administrative software. At BMS they were involved for more than two years, during which they provided customization and development for global roll-out needs, training, migration of data, and consulting on data-mart integration. In addition, they developed a thin client.

As a final example, Erin discussed their role in providing customized components for large collaborative initiatives. For example, the Innovative Medicines Initiative, of which the European Lead Factory is a part, aims to boost pharmaceutical innovation in Europe by supporting collaborative research projects and building networks of experts. The European Lead Factory is an alliance of dozens of participating organizations that collaborate to identify novel leads. ChemAxon designed the applications for submitting, reviewing, and selecting library proposals. The user interface supports workflows for three different types of user, with different levels of access for the submitter, review committee member, and review committee chair. To submit a library, the scientist sketches the scaffold with Marvin JS, then either uploads an SDF of the whole library, uploads an MRV file with a Markush library, or picks R-groups and uses Markush Enumeration to generate the library. The scientist also enters the rationale and synthesis validation. The application then identifies related known compounds by doing a substructure search against a reference set of 12 million structures. It then filters the compounds against a set of SMARTS filters supplied by the EFPIA members and checks them with Compliance Checker. Lastly, it provides scatter graphs and histograms of calculated properties. At this point the scientist can hand over ownership of the library idea to the European Lead Factory. To prepare the library for evaluation, the application then generates a histogram of the most similar structure in the reference set for each compound in the suggested library. With this application ChemAxon reduced the time for such a comparison from days to less than half an hour. The Library Selection Committee uses all of this data to score each library on molecular properties, structural features, novelty, diversity potential, synthetic tractability, and innovation.
The committee chair is pleased with the web tool, stating: “The procedure for assessing and processing the proposals has been straightforward. The tool has …saved huge time for both library proposal submitters and members of the Library Selection Committee.”
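The nearest-neighbor comparison described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: fingerprints are represented as sets of "on" bit indices, similarity is the Tanimoto coefficient, and the function names are hypothetical. The report does not say which fingerprints or similarity measure the actual application uses, and a brute-force loop like this would not scale to a 12-million-structure reference set.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbor_similarity(library, reference):
    """For each library fingerprint, similarity to its most similar reference fingerprint."""
    return [max(tanimoto(fp, ref) for ref in reference) for fp in library]


def histogram(values, bins=10):
    """Count values in equal-width bins over [0, 1]; a value of 1.0 goes in the last bin."""
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    return [counts.get(i, 0) for i in range(bins)]


# Toy fingerprints (made-up bit indices, not real chemistry).
reference = [frozenset({1, 2, 3, 4}), frozenset({5, 6, 7})]
library = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 8}), frozenset({9})]

sims = nearest_neighbor_similarity(library, reference)
print(histogram(sims, bins=5))
```

A library whose compounds pile up in the high-similarity bins is close to known chemistry, whereas mass in the low-similarity bins suggests novelty, which is one of the criteria the committee scores.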

ChemAxon software is also used in Open Innovation Drug Discovery, an Eli Lilly-led effort, and open innovation at GSK.

In conclusion, the Consultancy Group provides product development, customization, and integration; workflow design and management; project management; and creative solutions to customers’ needs.

Partner Session


ChemAxon has many integration partners, some of which made short presentations.

Agilent Technologies · (Presentation in our Library)

Antoni Wandycz presented their OpenLab software and informatics system, which captures and analyzes data from their many analytical instruments; shares and protects content; and captures, analyzes, and shares information in an ELN. It supports sharing standardized workflows, tracking of analyses, and a built-in audit trail. In early 2015 they will introduce a version for mobile devices.

BSSN Software · (Presentation in our Library)

Frank Itschert also addressed the challenges of integrating analytical data. There is an emerging ASTM XML standard developed by a consortium of industry, academic, vendor, and government bodies. In the interim, BSSN has developed support for most of the open and vendor formats to allow users to view analytical data on the desktop, in a web browser, and on mobile devices. He showed preliminary views of spectra and chromatograms integrated into Instant JChem forms and tables and Plexus Suite.

Cambridge Semantics · (Presentation in our Library)

Richard Mallah described enabling chemically aware semantic searches. The Anzo Pharma Information Workbench uses Semantic Web technology to provide tools for companies to build customized information-driven solutions for specific business problems. By incorporating JChem Base into the Anzo Enterprise Server, Marvin JS users can perform chemical searches across disparate data sources. The hits are returned with their relationships to other data, for example, side effects of a drug, or histograms or pie charts of properties of molecules that match a search query.

Certara · (Presentation in our Library)

David Lowis focused on their D360 product, which he described as a self-service data analytics tool. Any scientist can construct a data view, including data visualizations and analyses, without knowing the specific data location or format and without knowing Oracle or web services. Queries and data views can be shared. D360 supports JChem Cartridge and web services as well as ChemAxon Registration and Property Calculators. For the desktop they support Marvin and JChem for Excel.

Chemical Inventory · (Presentation in our Library)

Dann Vestergaard reminded the audience of the value of an inventory of chemicals at a site; it saves time and money if a needed chemical is already available.

Core Informatics · (Presentation in our Library)

Jeff Noonan pointed out the accelerating trend of externalization of research. In addition, the rapidly expanding use of cloud services and tablets presents challenges for IT software. Core Informatics’ strategy is to design for mobile devices using a flat, touch-friendly user interface. The application is hosted in the cloud or on-premises and delivered on desktops, laptops, tablets, and smartphones. It integrates instruments, samples, assays, context, and visualizations. A key ingredient of its ELN is its use of ChemAxon’s suite of tools.

DeltaSoft · (Presentation in our Library)

Diana Soto presented ChemCart, their integrated suite of applications for Discovery Research. For 18 years the company has provided highly focused R&D informatics. Since 2005 they have integrated ChemAxon technology into ChemCart. ChemCart modules include reagent inventory, ELN, CRO collaboration, structure registration, sample tracking, bioassay support, and a browser of ChemCart for Excel to analyze structure-activity relationships.

KNIME · (Presentation in our Library)

Aaron Hart described the integration of JChem into KNIME’s graphical programming environment. Nodes representing JChem Cartridge/JChem Base, R-group decomposition, calculators, and Markush enumeration are available. Users can sketch structures into a workflow using Marvin JS. KNIME provides Bayesian modeling based on chemical hashed fingerprints. Their Distance Matrix Pair Extractor facilitates matched pairs analysis.

Lab-Ally · (Presentation in our Library)

Rob Day described their ELNs, which provide affordable data management integrated with ChemAxon tools. RSpace ELN is designed specifically for enterprise deployment at universities. Its user interface is designed to be simple, intuitive, and easy to use, and this simplicity drives ready adoption. It uses MarvinSketch for drawing chemical structures or search queries. CERF ELN provides a compliant ELN for managing data in secure research. For example, it supports, tracks, and versions chemistry files and data but uses the native application to view or edit them. It features advanced semantic metadata and search technologies and integrates Microsoft Excel and Word on Windows machines. Users can plan their work, document the findings, and fill in the data (including images), all in the CERF Notebook. Plugins display live, rotatable structures.

Linguamatics · (Presentation in our Library)

Susan LeBeau described their expertise in natural language text mining that extracts information and relationships from text. Their product I2E identifies entities and the relationships between them. For example, it can recognize that nimesulide inhibits COX2, which has Entrez Gene ID 5743. It uses ChemAxon’s Name to Structure to find chemicals in text and produces human-readable names for compounds via Structure to Name, thus ensuring consistent naming of the same structure. I2E uses ChemAxon tools to support substructure and similarity searching across millions of documents. It reports back information associated with the structures and can export this into JChem for Excel.

Optibrium · (Presentation in our Library)

Matt Segall presented StarDrop5™, software for small molecule design, optimization, and data analysis. The core philosophy of StarDrop is to discover molecules with the correct balance of properties, optimizing not only potency but also solubility, drug safety (via Derek Nexus), metabolic stability (via QM calculations), and absorption. He showed their new card display, which summarizes the important properties of individual molecules but can also form part of a multiple-card display to show relationships between molecules. It is integrated with Plexus.

Schrödinger · (Presentation in our Library)

Scott Becket presented an overview of LiveDesign™, a browser-based interactive platform that supports collaborative design within a team of computational and medicinal chemists. It incorporates ChemAxon’s Markush Enumeration, Plexus Suite, and Schrödinger graphics display and docking of small molecules into protein binding sites. The inventor and date of ideas are tracked, as are comments and annotations from the team.

Reports from Customers


Of course, the success of a software company depends on whether customers are satisfied with the products. Hence, customer reports are a key ingredient of a successful user group meeting.

Boehringer Ingelheim

(Presentation in our Library)

Zhenbin (Benjamin) Li presented an overview of BI’s project to replace their MDL system. Beginning in 2008 they compared cheminformatics tools from ChemAxon, Accelrys Accord, Symyx MDL, and CambridgeSoft. In 2011-2012 they piloted ChemAxon, and in 2013 they started migration of systems and applications to ChemAxon technology. Their selection criteria included the quality of the software, both reliability and extensibility; service, specifically consulting and customization; a favorable financial contract; and the company culture, including work ethic, stability, and familiarity with the global pharmaceutical industry. They used an evolutionary approach, migrating interdependent systems at the same time. Because the information system members had little exposure to ChemAxon technologies, an advisory committee of experts in chemistry informatics from different operational units provided service, consulting, training, and tutoring for IS developers. He listed a number of experiences, pain points, and opportunities for ChemAxon. His summary indicated that the migration was smooth, but that the effort to curate the data should not be underestimated, as it requires engagement of both IS and business groups.

Bristol-Myers Squibb

(Presentation in our Library)

Dana Vanderwall described their DARE Project, Data and Analytics for Research. The goal was to simplify the workflow for gathering and analyzing structure-activity data by using Instant JChem tied to data marts. For this they, too, used a phased approach. They started in 2012 with prototyping and finished by decommissioning legacy applications at the end of 2014. The system provides automated data management: when a user creates a new cell or table, it is automatically promoted into the Instant JChem schema and Instant JChem forms. The Instant JChem data marts access the IBM patent database, a record of high throughput mutagenesis and chiral separations, as well as drug safety and metabolite data. Currently there are 1,455 forms and grids, 288 saved queries, and 474 saved lists that access 211 data trees from 2,400 fields. There are generally 1,000 to 2,000 database connections a day. The migration has shifted the workload from querying and discovery to alerting and reporting when new data is added to the system. The changeover was attractive to some users, whereas others needed more encouragement. It was critical to maintain the familiar capabilities. He also commented on the need for the community to agree on standards for the description of biological assays.


GlaxoSmithKline

(Presentation in our Library)

Brett Heimenz reminded the audience that GSK started using ChemAxon tools across its enterprise systems to replace ISIS/Base in 2009, with rollout in mid-2011, using Instant JChem. A collaboration with ChemAxon produced their new registration system. Scientists can use a web app to fix compounds that do not autoregister. GSK uses JChem to add chemistry to the GSK enterprise search engine. They provide support for desktop analysis tools, including a JChem for Spotfire plug-in developed by ChemAxon Services. Recently they have moved more effort into Plexus and are partnering with Schrödinger on LiveDesign. They plan to implement Plexus Connect in 2015.


Genentech

(Presentation in our Library)

Kevin Clark described their use of Document to Database to create a database of project team documents, including reviews, candidate profiles, medicinal chemistry design discussions, as well as reports on screening, in vivo tests, diagnostics and biomarkers, computational chemistry and structural biology presentations and suggestions, and publications. The documents are stored on Google Drive with searching provided by SOLR. By adding Document to Database, they were able to add chemical structure searching. Chemical structures are extracted with ChemAxon’s naming technology (including structures associated with Gnumbers); by identifying ChemDraw, SymyxDraw, or MarvinSketch structures; or by optical structure recognition with OSRA, CLiDE, or Imago. Marvin JS is used to search the JChem Cartridge, which had been migrated from Accord. So far 79K out of 84K documents have been indexed, yielding over 150K structures, with 30K from Gnumbers.


DuPont

(Presentation in our Library)

John Kinney reported that originally their project collaborations used Lotus Notes tied to ISIS/Base for structure searching. When they moved to SharePoint, they switched to JChem for SharePoint: not only does it have a built-in trawler, it also supports simultaneous searching of text and chemical structures. They also use the AES SharePoint Bridge to access Pipeline Pilot protocols, including a grid-based view of structure ideas. There is now interactive reporting to support faster browsing of reports, live pop-up molecules, and tooltips, as well as information on the biological testing queue and results.


Novartis

(Presentation in our Library)

Gregory Landrum presented their use of ChemAxon Markush capabilities as a key component of their Compound Series and Favorites (CSF) project. CSF is an application that helps medicinal chemists annotate and track chemical series within their project. It has a web-based front end that calls out to many NIBR services; a CSF service layer that uses RDKit for structure validation; and an Oracle database with JChem and Markush extensions. A project overview presents summaries of all series, named by the project chemists, for that project. A drill-down to a series overview shows the annotated relevant individual structures as well as a series summary. A further drill-down provides information on the design of an individual compound; this is entered by the inventor of the molecule. It includes the reasoning behind the design, links to literature or biological data, notes, and tags from a controlled vocabulary. Markush features define the scaffolds for the series. Structure search returns both individual and Markush structures. CSF is integrated with Spotfire.


Merck

Zhengwei Peng summarized their application that generates a vast virtual chemical space using ChemAxon Markush technology to encode the potential content of combinatorial libraries. The virtual library (VL) is useful for hit expansion, lead hopping, and idea generation. It was developed as a collaboration between Merck and ChemAxon. It provides exact, substructure, or similarity searches. It is deployed as a service in a web or a rich GUI application and as a Pipeline Pilot (PLP) protocol. A Pipeline Pilot workflow is used to construct and update VL content using combinatorial reaction information developed in the wet lab and automated reactant mining and filtering of reactant databases. It is stored in a pre-enumerated form. Searches return the specified number of enumerated hits as well as the library, reaction, and building-block IDs. Their wish-list includes the usual performance enhancement, so that they can go beyond their pilot database of 10^10 to 10^12 structures, and a framework for customer-built similarity search methods. (Presentation in our Library)
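To make the scale argument concrete, the following Python sketch shows why a combinatorial library can be described and counted without writing out every product, and how a search might return only a requested number of enumerated hits together with their library, reaction, and building-block IDs. The data layout, IDs, and function names are all hypothetical; this is not the Merck/ChemAxon implementation, which relies on Markush encoding rather than plain lists.

```python
from itertools import islice, product
from math import prod

# Hypothetical library definition: one reaction with three reactant positions.
library = {
    "library_id": "LIB-001",
    "reaction_id": "RXN-001",
    "building_blocks": {
        "R1": ["BB-1", "BB-2", "BB-3"],
        "R2": ["BB-10", "BB-11"],
        "R3": ["BB-20", "BB-21", "BB-22", "BB-23"],
    },
}


def library_size(lib):
    """Number of products the library encodes, computed without enumerating them."""
    return prod(len(bbs) for bbs in lib["building_blocks"].values())


def first_hits(lib, limit):
    """Lazily enumerate at most `limit` products, each tagged with its provenance IDs."""
    positions = list(lib["building_blocks"])
    combos = product(*(lib["building_blocks"][p] for p in positions))
    for combo in islice(combos, limit):
        yield {
            "library_id": lib["library_id"],
            "reaction_id": lib["reaction_id"],
            "building_blocks": dict(zip(positions, combo)),
        }


print(library_size(library))
for hit in first_hits(library, 2):
    print(hit)
```

Because the size is a product of the building-block counts, three positions with 10,000 reagents each already describe 10^12 products, which is why storing the definition rather than the products matters.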

Peng also reminded the audience of the challenges and the need for solutions related to in-depth analysis of patent molecular spaces for drug discovery projects. Solving the problems would impact scientists and patent attorneys at pharma companies, patent offices, and patent content providers. Although the claimed chemical space covered by a patent can be encoded and searched today, the problem is that encoding these patents is slow and costly. Project chemists need to know if their lead compound is patentable and if the draft application Markush covers all the examples. They also need to know the competitive landscape in the particular area. Because there is no direct access to expert search tools for project chemists, these issues are typically addressed with an expert in patent searching and a patent attorney. Although ChemAxon has produced a prototype that uses Document to Structure to recognize structures in a patent and these structures can be then used to construct a Markush object, this is still a prototype. Peng made a plea for pharma companies to unite to support the goal of a patent application that serves medicinal chemists. (Presentation in our Library)


Cubist

(Presentation in our Library)

Xin Zhang presented their system that integrates drug design tools in a consistent visual interface. The DT workbench is a web-based application that provides computational models for use by medicinal chemists. It started as a prototype in 2008, with a substantial update in 2012 when they replaced ChemOffice with Marvin. Users interact with Marvin, which communicates with the Task Manager. The Task Manager in turn integrates Pipeline Pilot, KNIME, and script jobs. The first case study showed the use of their RS predictor, which uses a program developed by Prof. Breneman’s group to predict the site of metabolism. They modified the scripts to manage the MOE and Matlab licenses and multiple simultaneous users, to allow input from MarvinSketch, and to visualize the prediction results in MarvinView. Lead hopping is accomplished via a bioisostere database, with the option to view the suggestions overlaid on the lead structure.

Flatley Discovery

(Presentation in our Library)

Jinliang Sui from Flatley Discovery Lab, a small five-year-old company that focuses on treatments for cystic fibrosis, discussed their informatics environment. Since August 2010 they have screened more than a million compounds as singletons and more than 40,000 combinations. They have one compound that will start clinical trials at the end of the year and another recently added to advanced pre-clinical testing. In the early years of the company they used ChemFinder backed by a Microsoft Access database for registration, search, and compound profiling. They now use ChemCart based on an Oracle database for the same activities. For SAR analysis and visualization they use Vortex and Pipeline Pilot. The magnitude of their high throughput screening requires extensive IT support: “great informatics experts are equally important as great scientists”.

Kyoto Constella Technologies

(Presentation in our Library)

Edmund Taylo described CzeekD, an application for de novo compound design and optimization. The method uses a genetic algorithm to optimize structures built up from fragments generated by Recap rules on known compounds. It also supports structure substitution, addition/deletion, and bioisosteric replacements. The scoring function uses a computational model generated from compounds measured for the target activity. When the method was applied to the design of β2 adrenergic receptor ligands, 104 suggested compounds were prepared and 40 of these had IC50 values less than or equal to 30 μM. For the design of V1b receptor ligands, 24 of 105 compounds had IC50 values less than or equal to 30 μM. The method has also been used to design selective compounds.
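The general shape of a fragment-based genetic algorithm can be sketched in a few lines of Python. This is a toy illustration only, not the CzeekD method: the fragment labels are placeholders rather than Recap fragments, the scoring function is a made-up stand-in for the activity model, and only substitution and single-point crossover are shown (no addition/deletion or bioisosteric replacement).

```python
import random

# Hypothetical fragment labels; a real system would use Recap-derived fragments.
FRAGMENTS = ["aryl", "amide", "amine", "ether", "halide", "alcohol"]


def score(candidate):
    """Stand-in for the activity model: counts distinct fragments from a made-up preferred set."""
    preferred = {"aryl", "amide", "amine"}
    return len(preferred & set(candidate))


def mutate(candidate, rng):
    """Substitute one randomly chosen fragment."""
    child = list(candidate)
    child[rng.randrange(len(child))] = rng.choice(FRAGMENTS)
    return child


def crossover(a, b, rng):
    """Single-point crossover of two equal-length candidates."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(generations=30, pop_size=20, length=3, seed=0):
    """Evolve fragment lists, keeping the better half of the population each generation."""
    rng = random.Random(seed)
    population = [[rng.choice(FRAGMENTS) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)


best = evolve()
print(best, score(best))
```

Because the better half of each generation is carried over unchanged, the best score never decreases from one generation to the next; the quality of the result then depends entirely on how well the scoring model reflects the measured activity.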

Chemalytics / University of Missouri at Kansas City

(Presentation in our Library)

Gerald J. Wycoff described Chemalytics, a cloud-based system that delivers virtual screening to clients who don’t have this infrastructure. It uses residual processing power in the cloud to provide low-cost solutions for academic researchers. The key element is a job queueing model that allows lower-priority agents to use otherwise wasted cycles. They use ChemAxon tools for their structure database searching and visualization. Docking is done with AutoDock. Results are presented in a spreadsheet that is linked to Jmol for 3D visualization.
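A priority-based job queue of the kind mentioned above might look roughly like the following Python sketch. The class, job names, and priority numbers are hypothetical, and the report gives no details of the actual Chemalytics scheduler; the sketch only shows the idea that low-priority jobs run when nothing more urgent is waiting.

```python
import heapq
from itertools import count


class JobQueue:
    """Minimal priority queue: urgent jobs run first, low-priority jobs fill idle cycles."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps first-in, first-out order within a priority

    def submit(self, name, priority):
        """Add a job; a lower number means a higher priority."""
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_job(self):
        """Return the most urgent pending job, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = JobQueue()
queue.submit("low-priority docking batch A", priority=5)
queue.submit("high-priority docking job", priority=1)
queue.submit("low-priority docking batch B", priority=5)

order = [queue.next_job() for _ in range(4)]
print(order)
```

A worker that polls `next_job()` whenever it has spare capacity will pick up the low-priority batches only after the urgent work has been served.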


Pearson Education

(Presentation in our Library)

Zane Barlow and Margaret Trombley presented “Marvin JS Goes to College” in which they described their tutorial-homework platform for organic chemistry. The application provides specific directions to the student if a wrong answer is given. They are partnering with ChemAxon to develop a custom educational version of Marvin JS. It will separate out various tools to match specific answer types. For example, the tools for drawing mechanisms will be available only if the student is asked to draw a mechanism. This custom version of Marvin JS will be offered by Pearson beginning in January 2015.

