Extreme Search Speed-Ups in JChem PostgreSQL Cartridge 2.7

Posted on 2017-09-03

New index type: sortedchemindex
Massive speed-up in duplicate and similarity searches using sortedchemindex
Additional speed-ups in substructure search when only the top hits are retrieved
Up to 60-fold speed-up for typical joined queries

Results of duplicate (DUP) and similarity (SIM) search benchmarks in JChem Oracle Cartridge (JOC) version 17.3.6, JChem PostgreSQL Cartridge (JPC) 2.4, and JPC 2.7 (see technical details in footnote¹). All tables are indexed, and JPC 2.7 uses the new sortedchemindex type; with this index type, the most similar hits are returned first.

Note that for these search types, search speed is largely independent of the query.

Substructure search benchmarks are run on “rare” and “frequent”² query sets where the hits are ordered by relevance³.

When a query has many hits, it may be worth retrieving only the first ones (the top 500 in the benchmark).
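As a rough illustration of top-hit retrieval, the substructure query from footnote 4 could be capped at 500 rows. This is a minimal sketch, not the benchmark's actual query: the table `pbch_8m`, the column `mol`, and the `|<|` operator are taken from footnote 4, while `limit` is standard PostgreSQL. The relevance-ordering clause is left out because the post does not show its syntax.

```sql
-- Sketch only: table, column, and operator come from the benchmark query in footnote 4.
-- Returns at most 500 substructure hits for chlorobenzene instead of the full hit set.
-- No ORDER BY is shown; the post does not give the relevance-ordering syntax.
select *
from pbch_8m
where 'Clc1ccccc1' |<| mol
limit 500;
```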

Joined queries⁴ can also run faster, depending on the plan chosen by the PostgreSQL query planner.

Visit the demo site to try this out now.

Footnotes

1. Target set: 8M structures from PubChem. Query set: small fragments and drug-like molecules. Similarity search: only the 100 most similar structures were retrieved.
2. Rare: few hits. Frequent: many possible hits after the screening phase and many returned hits.
3. Starting from JPC 2.7, the result set can be ordered directly by the chemical structures, so the most relevant hits come first.
4. Benchmark queries:
   JOC: select count(*) from pbch_8m where jc_compare(mol, 'Clc1ccccc1', 't:s') = 1 and molweight < 120;
   JPC: select count(*) from pbch_8m where 'Clc1ccccc1' |<| mol and molweight < 120;
