What Structures are Claimed in Patents? - Use cases for patent literature MMP transformations

Posted by András Strácz on 26 04 2023

Medicinal chemistry transformations from patent literature

Designing and optimizing novel drugs requires both creativity and knowledge. The Matched Molecular Pairs (MMP) method [1] is one way to support this process. Commonly, MMP is used to connect structural changes of drug molecules to the corresponding changes in assay readouts (Figure 1).

Medicinal chemistry transformations from patent literature

Here, the MMP method was used to extract all synthetically available transformations described in the SureChEMBL patent database. This gives an overview of how often medicinal chemists have used certain transformations, irrespective of their optimization parameters (Table 1).
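As a rough illustration of what "counting transformations" means, the sketch below groups pre-fragmented molecules by their shared core and counts each substituent exchange once per core. The input records, the fragment strings, and the function name `count_transformations` are assumptions made for this example; they are not taken from the SureChEMBL pipeline, and a real workflow would generate the fragments with a cheminformatics toolkit.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical pre-fragmented input: (core with attachment point, substituent).
# The strings are illustrative only; nothing here parses or validates chemistry.
fragmented = [
    ("c1ccccc1[*:1]", "[*:1]Cl"),
    ("c1ccccc1[*:1]", "[*:1]F"),
    ("c1ccncc1[*:1]", "[*:1]Cl"),
    ("c1ccncc1[*:1]", "[*:1]F"),
    ("c1ccncc1[*:1]", "[*:1]C"),
]


def count_transformations(records):
    """Count how many distinct cores support each substituent exchange."""
    # Index step: collect the set of substituents seen on each core.
    by_core = defaultdict(set)
    for core, substituent in records:
        by_core[core].add(substituent)

    # Pairing step: two substituents on the same core form a matched pair.
    # Sorting fixes one direction per pair, so A>>B and B>>A are not
    # counted separately.
    counts = Counter()
    for substituents in by_core.values():
        for left, right in combinations(sorted(substituents), 2):
            counts[f"{left}>>{right}"] += 1
    return counts


transformation_counts = count_transformations(fragmented)
for transformation, n in transformation_counts.most_common():
    print(n, transformation)
```

In this toy data set the chlorine-to-fluorine exchange is supported by two cores, while the two methyl exchanges occur on one core each. The same kind of ranking, applied at patent-database scale, is what the occurrence counts in this post summarize.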


SureChEMBL is a publicly available, large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature by an automated text- and image-mining pipeline and are updated daily.


Data points used to create MMP set

Years of deposited patent applications

600 MB of text

1.35 M patent applications

20 M exemplified compounds

1.4 M unique transformations

Transformation size (#atoms)

> 1000 transformations with > 1000 examples

Transformations with > 50 examples

Table 1. Data behind extracted transformations




Figure 2. Number of occurrences of the top 300 transformations (blue bars) in small molecule drug discovery projects, with a few selected examples highlighted (orange bars), extracted from SureChEMBL


Use cases for patent literature MMP transformations

The MMP transformations from SureChEMBL can be used in different ways to create analogues of a seed compound:

  1. Based on the most common transformations [2]: automatic creation of compounds that are “expected” to be made in a project – making sure you don’t forget any.

  2. Based on the least common transformations: creation of analogues that are “unexpected” – compounds a medicinal chemist would not immediately think of, but that could increase the chance of creating novel compounds.

These analogues can then be filtered through any additional virtual screening cascade prior to selection for synthesis (Figure 3).
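The two use cases and the follow-up filtering step can be sketched in a few lines. Everything below is a hypothetical illustration: the `Transformation` records, their counts, the `design_set` helper, and the string representation of the seed are assumptions for this example, not the format of the downloadable transformation set or the Design Hub API.

```python
from typing import Callable, List, NamedTuple, Tuple


class Transformation(NamedTuple):
    before: str   # substituent present in the seed
    after: str    # substituent in the proposed analogue
    count: int    # how often the exchange was observed


# Hypothetical frequency table; the counts are invented for the example.
TRANSFORMATIONS = [
    Transformation("Cl", "F", 5200),
    Transformation("Cl", "Br", 3100),
    Transformation("Cl", "OC", 900),
    Transformation("Cl", "C#N", 40),
    Transformation("F", "Cl", 5200),
]


def design_set(
    core: str,
    substituent: str,
    transformations: List[Transformation],
    n: int,
    most_common: bool = True,
    keep: Callable[[str], bool] = lambda new_substituent: True,
) -> List[Tuple[str, str]]:
    """Propose up to n analogues of a seed given as (core, substituent)."""
    # Only transformations that start from the seed's substituent apply.
    applicable = [t for t in transformations if t.before == substituent]
    # Rank by frequency: descending for "expected", ascending for "unexpected".
    ranked = sorted(applicable, key=lambda t: t.count, reverse=most_common)
    analogues = [(core, t.after) for t in ranked[:n]]
    # Stand-in for a downstream virtual screening cascade.
    return [a for a in analogues if keep(a[1])]


expected = design_set("phenyl", "Cl", TRANSFORMATIONS, n=2)
unexpected = design_set("phenyl", "Cl", TRANSFORMATIONS, n=2, most_common=False)
filtered = design_set(
    "phenyl", "Cl", TRANSFORMATIONS, n=2, most_common=False,
    keep=lambda s: "#" not in s,  # toy filter: drop triple-bond substituents
)
print(expected, unexpected, filtered)
```

The `keep` predicate is where any property calculation or docking score threshold would plug in. Keeping it separate from the ranking lets the same transformation table serve both the "expected" and the "unexpected" design sets.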


Search, filter, and analyze chemical structures from in-house compound libraries or external databases and incorporate them into your MMP designs using Design Hub. Learn how to manage complex projects with ease, while collaborating with CROs.



Figure 3. Example workflows applying SureChEMBL MMP transformations to create Design Sets


Download the 500 most common transformations from SureChEMBL

[1] Hussain and Rea, Journal of Chemical Information and Modeling 2010 50 (3), 339-348