Challenges in chemical literature mining

presentation · 8 years ago
by Shashikala G, Anirban Mudi, Lokanath Khamari, Jignesh Bhate, Vidyendra Sadanandan (Molecular Connections)
Mined chemical information is used to validate research results, design new synthetic routes, and file patents. However, finding the chemical information of interest among millions of chemistry articles and patents is a major problem. Several text mining approaches have been applied to chemical information in the literature, but none of them is fully accurate. The lack of a universal standard for chemical structure representation and chemical nomenclature is a significant challenge in mining chemical entities. Some commercial and academic efforts have addressed various problems in chemical information extraction with partial success, using a multifaceted approach to recognize these diverse representations and nomenclatures. In this presentation we review the challenges in chemical literature mining and examine a combination of automated and manual approaches to extracting high-quality chemical information. We also discuss how manually curated data can be used to improve automated chemical literature mining techniques.
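To make the last point concrete, here is a minimal sketch of how curated data might feed back into an automated step. It is not the presenters' method: it assumes a simple dictionary-based tagger, and the lexicon entries and the names `LEXICON`, `find_mentions`, and `add_curated` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical curated lexicon: lowercase surface form -> canonical name.
LEXICON = {
    "aspirin": "acetylsalicylic acid",
    "acetylsalicylic acid": "acetylsalicylic acid",
    "ethanol": "ethanol",
}


def find_mentions(text, lexicon):
    """Return (start, end, surface, canonical) for each lexicon hit in text."""
    if not lexicon:
        return []
    # Try longer names first so multiword names win over their substrings.
    names = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Lookarounds keep a name from matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)", re.IGNORECASE)
    return [
        (m.start(), m.end(), m.group(0), lexicon[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


def add_curated(lexicon, surface, canonical):
    """Return a new lexicon extended with one manually curated synonym."""
    updated = dict(lexicon)
    updated[surface.lower()] = canonical
    return updated


text = "Aspirin was dissolved in EtOH."
before = find_mentions(text, LEXICON)  # the abbreviation "EtOH" is missed
after = find_mentions(text, add_curated(LEXICON, "EtOH", "ethanol"))
```

In this sketch the automated pass misses the abbreviation until a curator records it as a synonym, after which the same pass recognizes it. A real system would need far more than exact dictionary lookup, since systematic names, formulas, and structure drawings do not reduce to a fixed list of strings.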