Advances in text mining for chemical information: an update on the ChiKEL project.
Automated name-to-structure software can convert novel systematic names found in free text to chemical structures, enabling comprehensive substructure or similarity searches. Last year we reported on work to improve name-to-structure conversion for scientific articles and patents, and showed an initial integration with the Linguamatics I2E text mining platform. In this talk we will describe the latest integration and how it has been applied at scale to USPTO, WIPO and EPO full-text patents. In addition to providing the latest evaluation results, we will show examples of the benefits of combining chemical structure search with linguistic processing. For example, information can be linked across different parts of a patent, such as between the definition of a compound and a table of results for that compound.