Chemical Structure Representation Toolkit
Canonicalization and correction of chemical structures
Summary
Adding new compounds to an existing chemical database always raises the issue of uniqueness and correctness.
When registering a new item, migrating legacy data, or uploading a set of compounds to a database, the new entries have to fulfill certain requirements. Chemical structures can be represented in many different ways. These differences affect not only the graphic appearance of the molecules but can also influence more fundamental details of the topology, making compound identification even more problematic. Structural mistakes and errors have to be addressed too.
Chemaxon's chemical structure representation toolkit has two major components: Standardizer, transforming chemical structures into customized, canonical representations; and Structure Checker, offering numerous checkers and fixers to search and correct various structural issues.
Features
Standardizer - canonicalizing chemical structures
Standardizer's main purpose is to transform chemical structures into representations that obey certain chemical business rules to avoid inconsistencies in a chemical database. The tool is typically used in workflows where new compounds are registered, or where structure-based virtual screening is performed. There are 40 predefined Standardizer actions available, covering the following issues, among others:
- Adding or removing explicit hydrogen atoms
- Neutralizing charged fragments or functional groups
- Recognizing and converting legacy representations of functional groups (like aliases)
- Removing certain fragments (like water and salt counterions)
- 2D cleaning and expanding abbreviated groups
- Unified representation of aromatic rings, tautomers and mesomers
Read more about Standardizer's actions. Besides the predefined rules, you can also implement your own.
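The idea of applying an ordered list of actions to a structure can be sketched in a few lines of Python. This is only a toy illustration that works on SMILES text, not Chemaxon's implementation or API: the strip list, the function names, and the two actions are assumptions made for the example, and a real standardizer operates on the molecular graph rather than on strings.

```python
from typing import Callable, List

# Hypothetical strip list: SMILES of components treated as solvent or counterions.
STRIP_FRAGMENTS = {"O", "[Na+]", "[K+]", "[Cl-]", "Cl"}

Action = Callable[[str], str]


def remove_fragments(smiles: str) -> str:
    """Drop dot-separated components that appear in the strip list.

    Assumes "." separates disconnected components, which ignores the rare
    case of ring-closure digits bonding atoms across a dot.
    """
    kept = [frag for frag in smiles.split(".") if frag not in STRIP_FRAGMENTS]
    # Leave the input unchanged if every component would be removed.
    return ".".join(kept) if kept else smiles


def sort_fragments(smiles: str) -> str:
    """Order the components deterministically: longest text first, then alphabetical."""
    return ".".join(sorted(smiles.split("."), key=lambda f: (-len(f), f)))


def standardize(smiles: str, actions: List[Action]) -> str:
    """Apply each action in order; the output of one is the input of the next."""
    for action in actions:
        smiles = action(smiles)
    return smiles


PIPELINE = [remove_fragments, sort_fragments]

# Sodium acetate written together with a water molecule.
print(standardize("[Na+].CC(=O)[O-].O", PIPELINE))  # prints CC(=O)[O-]
```

The acetate is still charged after these two steps; in a fuller pipeline a neutralizing action would follow, which is why the order of actions matters.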
Structure Checker - correcting errors in chemical structures
Structure Checker searches molecules for structural problems. When it finds an issue, it highlights the error in the structure and suggests a fix. The reported problem can be fixed automatically by the suggested built-in fixer or manually by the user. This tool is crucial in filtering out drawing errors and incorrect features when a new compound is registered. The system comprises more than 40 checkers, which detect issues such as:
- Invalid bond length
- Overlapping bonds or atoms
- Molecule charges
- Incorrect chiral flags
- Invalid valences
- OCR errors
- Unwanted substructures, via a substructure checker that can be configured to find and transform a given substructure defined by a SMARTS string
Read more about checkers. Structure Checker is highly customizable, so adding your own checker is also possible.
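To make the checker concept concrete, here is a minimal Python sketch of two checkers that report issues together with the atoms to highlight. It is an illustration under stated assumptions, not Structure Checker's API: the `Atom` and `Issue` classes, the simplified valence table, and the distance threshold are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist
from typing import Callable, List, Tuple


@dataclass
class Atom:
    element: str
    x: float
    y: float
    bond_order_sum: int  # sum of bond orders, implicit hydrogens included


@dataclass
class Issue:
    checker: str
    atom_indices: Tuple[int, ...]  # atoms a user interface could highlight
    message: str


# Simplified limits for neutral atoms only (an assumption for this sketch).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}
# Hypothetical threshold below which two atoms count as overlapping.
MIN_DISTANCE = 0.1


def check_valence(atoms: List[Atom]) -> List[Issue]:
    """Report atoms whose bond order sum exceeds the simplified limit."""
    issues = []
    for i, atom in enumerate(atoms):
        limit = MAX_VALENCE.get(atom.element)
        if limit is not None and atom.bond_order_sum > limit:
            issues.append(Issue(
                "valence", (i,),
                f"{atom.element} has valence {atom.bond_order_sum}, maximum is {limit}"))
    return issues


def check_overlapping_atoms(atoms: List[Atom]) -> List[Issue]:
    """Report pairs of atoms drawn closer together than MIN_DISTANCE."""
    issues = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        if dist((a.x, a.y), (b.x, b.y)) < MIN_DISTANCE:
            issues.append(Issue("overlapping_atoms", (i, j),
                                f"atoms {i} and {j} overlap"))
    return issues


CHECKERS: List[Callable[[List[Atom]], List[Issue]]] = [
    check_valence,
    check_overlapping_atoms,
]


def run_checkers(atoms: List[Atom]) -> List[Issue]:
    """Run every checker in turn and collect all reported issues."""
    issues: List[Issue] = []
    for checker in CHECKERS:
        issues.extend(checker(atoms))
    return issues


# A deliberately broken drawing: a five-valent carbon and two atoms on top of each other.
drawing = [Atom("C", 0.0, 0.0, 5), Atom("O", 1.5, 0.0, 2), Atom("H", 1.5, 0.05, 1)]
for issue in run_checkers(drawing):
    print(issue.checker, issue.atom_indices, issue.message)
```

Keeping each checker as an independent function that returns a list of issues makes it straightforward to add a new checker, or to pair each issue type with a fixer, without touching the others.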
Benefits
Customizability
Toolkit components can be easily tailored to unique business needs and regulations. Both Standardizer and Structure Checker have a full-featured application programming interface (API) in Java and in .NET, making this solution easy to integrate with in-house or third-party applications. (See the Standardizer Java API and .NET API, or the Structure Checker Java API and .NET API.) This means that, with some programming, you can add your own standardizer actions, checkers, and fixers to your system. Such custom solutions are usually required when specific atoms, functional groups, or patterns in the structures need to be removed or replaced with Standardizer, or when you have to handle chemical features that the built-in checkers and fixers in Structure Checker do not cover.
Read more about custom standardizer actions and how to implement them, and about implementing fixers.
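One common way to make such a system extensible is a registry in which built-in and user-written actions sit side by side and a configuration selects them by name. The Python sketch below shows that pattern only; it does not use or reproduce the Standardizer Java or .NET API. The registry, the decorator, and both action names are hypothetical, and the nitro rewrite is a purely textual replacement on SMILES, whereas a real action would work on the molecular graph.

```python
from typing import Callable, Dict, List

# Registry mapping an action name to a function that rewrites a SMILES string.
ACTIONS: Dict[str, Callable[[str], str]] = {}


def register_action(name: str):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        ACTIONS[name] = func
        return func
    return decorator


@register_action("remove_water")
def remove_water(smiles: str) -> str:
    """Stand-in for a built-in action: drop water ("O") components."""
    kept = [frag for frag in smiles.split(".") if frag != "O"]
    return ".".join(kept) if kept else smiles


@register_action("nitro_to_charge_separated")
def nitro_to_charge_separated(smiles: str) -> str:
    """Custom action: rewrite a pentavalent nitro group in its charge-separated form.

    Only the literal text "N(=O)=O" is matched; other ways of writing the
    same group are left untouched.
    """
    return smiles.replace("N(=O)=O", "[N+](=O)[O-]")


def run(smiles: str, action_names: List[str]) -> str:
    """Look up each configured action by name and apply it in order."""
    for name in action_names:
        smiles = ACTIONS[name](smiles)
    return smiles


config = ["remove_water", "nitro_to_charge_separated"]
print(run("CN(=O)=O.O", config))  # prints C[N+](=O)[O-]
```

Because the pipeline only refers to actions by name, a site-specific rule can be added by registering one more function and listing its name in the configuration.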
Accessibility
The Chemical Structure Representation Toolkit has individual applications for both Standardizer and Structure Checker. However, thanks to their extensive APIs, these tools are most often paired with other Chemaxon software:
- Both components play a key role in registration and chemical structure search, so Compound Registration and the JChem Engines include this functionality.
- Chemical database management relies on the representation toolkit on desktop (Instant JChem, JChem for Office) and online (Plexus Suite).
- Marvin Live and Chemicalize use the toolkit's capabilities too.
- Structure checking is an important feature within our Marvin chemical drawing tool.
- Both functionalities are available in workflow tools such as KNIME and Pipeline Pilot.
- The toolkit is also available from the command line (Standardizer CL and Structure Checker CL).
Related products
Marvin
Full-featured chemical editor for all platforms
Chemical Naming
Convert chemical names into structures
Markush Technology
Smart assistant for patent claim drafting and Markush analysis
Chemical Structure Representation
Standardization and correction of chemical structures
Chemicalize
Calculate properties instantly, search chemical data, and draw molecules online
Compound Registration
Normalize, check, validate and register chemical compounds
Reactor
High performance virtual synthesis engine
JChem for Office
Chemical structure handling, data analysis, visualization and reporting capabilities within MS Office
Design Hub
A single platform that connects scientific rationale, compound design and computational resources
JChem Engines
Search through tens of millions of chemical compounds and receive relevant query hits in seconds.
Calculators and Predictors
Execute high quality physico-chemical calculations and predictions.
Compliance Checker
Identify controlled substances with Compliance Checker and assign HS tariff codes with cHemTS - the easy way to comply with chemical regulations.
Discovery Tools
From clustering and diversity analysis for chemical libraries to 2D and 3D molecular screening
Instant JChem
Create, explore and share chemical data
ChemCurator
Computer-assisted chemical information extraction and analysis