Find chemistry in unstructured data

Discover the hidden chemical knowledge in your documents, regardless of whether they are stored on a local computer, on a network share or in the cloud (Google Drive, OneDrive, Dropbox, SharePoint Online, Office 365). Find faster, respond faster, decide faster. ChemLocator is a web-based search tool that can also run on personal desktop machines.

ChemLocator - Easy start

ChemLocator's wizard-based configuration ensures that a fully installed and configured application is ready for use within a few minutes. After download and installation, a Setup Wizard guides you through the initial steps, where you can, for example, locate the license or change the system resource control settings if necessary.

ChemLocator workflow

Chemical search in cloud drive

Most companies store large numbers of documents in the cloud, where chemical search can be challenging. ChemLocator can connect to Google Drive, OneDrive, Dropbox, SharePoint on-premises and SharePoint Online and make them chemically searchable without downloading anything from the cloud. You can connect your personal cloud repository, and if ChemLocator is running on a server, you can define business accounts as well.

Chemical search combined with free text and metadata conditions

Free text search solutions are widely available on the market, but only a few engines can combine free text search with chemical search. ChemLocator does this by relying on ChemAxon's robust JChem Engines and on Elasticsearch. In addition to free text criteria, the most important document metadata, such as author and creation and modification dates, is searchable as well.

To start a search, you can either draw a molecule or type a chemical name into the name-to-structure converter. ChemLocator recognizes SMILES, InChI, CAS numbers and chemical formulas. After defining the query, you can pick which kind of structure-based chemical search to run: substructure, similarity, full substructure, duplicate, superstructure and full fragment searches are available. The chemical search can be extended with a free text search, where you can add any text to narrow down your results.
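
How ChemLocator recognizes each input type is not described here. As an illustration of how one of the listed identifier types can be checked, the following minimal Python sketch validates the check digit of a CAS Registry Number. The function name `is_valid_cas` is hypothetical and is not part of ChemLocator.

```python
import re

# Hypothetical helper, not part of ChemLocator.
# A CAS Registry Number has the form NNNNNNN-NN-C: two to seven digits,
# two digits, and a single check digit, separated by hyphens.
_CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def is_valid_cas(text: str) -> bool:
    """Return True if text is a well-formed CAS number with a correct check digit."""
    match = _CAS_PATTERN.fullmatch(text.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit
    # of the body; the weighted sum modulo 10 must equal the check digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


if __name__ == "__main__":
    for candidate in ("7732-18-5", "58-08-2", "7732-18-4", "CCO"):
        print(candidate, is_valid_cas(candidate))
```

A check like this only confirms that a string is a plausible CAS number; it does not tell you which structure the number refers to.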

Result view of a combined chemical and free text search query in ChemLocator

Search based on chemical properties and terms

In some cases a plain chemical structure search is not enough, and you may also want to filter the data before searching. ChemLocator's chemical property filter helps you find structures that fall within a given range of molecular weight, logP value, atom count or bond count. You can achieve even more with the Chemical Terms based filters. ChemLocator has a set of pre-defined chemical terms, and you can also write your own expressions to run more advanced searches. After the original search is done, you can refine the hits by setting lower and upper bounds with easy-to-use sliders.

Filtering hits according to Lipinski's Rule of Five with Chemical Terms in ChemLocator
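
To make the Rule of Five filter mentioned in the caption concrete, here is a minimal Python sketch. It is not ChemLocator's Chemical Terms syntax; it assumes the four descriptors have already been computed by some other tool, and the class and function names are hypothetical. The common formulation of the rule allows at most one of the four limits to be exceeded, which is the default used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Precomputed descriptors for one structure (hypothetical container)."""
    name: str
    molecular_weight: float  # in daltons
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four Rule of Five limits are exceeded."""
    return sum([
        p.molecular_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(p: Properties, max_violations: int = 1) -> bool:
    """Accept a structure that exceeds at most max_violations limits."""
    return lipinski_violations(p) <= max_violations


if __name__ == "__main__":
    # Made-up descriptor values, for illustration only.
    hits = [
        Properties("hit-A", 180.2, 1.2, 1, 4),
        Properties("hit-B", 820.0, 6.3, 7, 12),
    ]
    for hit in hits:
        print(hit.name, lipinski_violations(hit), passes_rule_of_five(hit))
```

Setting `max_violations=0` gives the stricter variant in which every limit must be respected.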

Refine your search hits

At the start of a research project, it's not always clear exactly what information you are looking for. One solution is to start with more general query conditions, although this approach is likely to produce a rather large hit list. With refiners, ChemLocator can help you narrow down the hit list without executing new queries against the full database, because refiners operate only on the hits that are already available. ChemLocator offers various refiners. Some are content related, like structure refiners. Others are metadata refiners, like file type, author and creation date. Still others filter by document location, and there are many more.
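
The idea that refiners work on the existing hit list rather than on the whole database can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ChemLocator's implementation: the `Hit` record, the `refine` function and the individual refiner factories are all hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass(frozen=True)
class Hit:
    """One search hit with a few metadata fields (hypothetical record)."""
    path: str
    file_type: str
    author: str
    created: date


Refiner = Callable[[Hit], bool]


def by_file_type(*types: str) -> Refiner:
    """Metadata refiner: keep hits whose file type is one of the given types."""
    allowed = {t.lower() for t in types}
    return lambda hit: hit.file_type.lower() in allowed


def created_between(start: date, end: date) -> Refiner:
    """Metadata refiner: keep hits created within the inclusive date range."""
    return lambda hit: start <= hit.created <= end


def under_location(prefix: str) -> Refiner:
    """Location refiner: keep hits stored under the given path prefix."""
    return lambda hit: hit.path.startswith(prefix)


def refine(hits: Iterable[Hit], *refiners: Refiner) -> list[Hit]:
    """Keep hits accepted by every refiner; the original search is not rerun."""
    return [hit for hit in hits if all(r(hit) for r in refiners)]


if __name__ == "__main__":
    # Made-up hit list standing in for the result of an earlier query.
    hits = [
        Hit("/reports/2021/a.pdf", "pdf", "Kim", date(2021, 3, 1)),
        Hit("/reports/2019/b.docx", "docx", "Lee", date(2019, 7, 9)),
        Hit("/archive/c.pdf", "pdf", "Kim", date(2021, 5, 2)),
    ]
    refined = refine(hits, by_file_type("pdf"), under_location("/reports/"))
    print([hit.path for hit in refined])
```

Because each refiner is just a predicate over a hit, several can be combined freely, and the cost of refining depends on the size of the hit list rather than on the size of the indexed collection.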

Relevance sorting

Some substructure queries return hundreds of hits, and finding the most relevant ones can be difficult. The results can be ordered by atom count, but ChemLocator offers a more sophisticated solution called relevance sorting. It ensures that the hits most relevant to the query are shown first. Using the relevance sorting formula does not affect the search response time. See the most relevant hits first, without a performance penalty!

Options to refine search results in ChemLocator

Optical Character Recognition and Optical Structure Recognition

Most companies face the problem of searching for chemical information in legacy data that exists only as hard copies and printed papers. Some of this information has already been digitized, and more will be digitized soon. Scanned documents in JPG, PNG, GIF, TIFF, PDF or other graphical file formats are hard to search. ChemLocator supports a wide variety of optical structure recognition (OSR) applications, such as CLiDE and OSRA, and has a built-in character recognition solution. This way ChemLocator can index and search the chemical data in image files too.

Results of a search with optical structure recognition in ChemLocator

Compound ID connection

The use of registration systems and chemical databases has made compound IDs widespread. Identifiers are often used to refer to a specific molecule in reports, documents and other internal material. Any free text search engine is capable of finding the IDs in documents, but ChemLocator can find these identifiers even when a structure name or a synonym is given as the search query. Compound IDs can also be converted to structures during the query on ChemLocator's search screen.

Corporate identifiers