Pharma R&D Today

Ideas and Insight supporting all stages of Drug Discovery & Development

Extracting Relevant Chemical Information from Patents with Machine Learning

Posted on October 5th, 2020 in Chemistry

New chemical compounds and reactions are often first disclosed in patents, with little fanfare. Years may pass after a patent is filed before these compounds appear in scholarly journals, and even then only a small share of them are published at all. As a result, these compounds can easily remain unknown to researchers who would be very interested in them.

Text mining is one potentially useful way to bring this chemical information to light. Unfortunately, most text mining approaches do not take into account how relevant a compound is to the patent in which it appears. As a result, too much irrelevant data is extracted, which slows down and complicates the search process.

However, technologies such as machine learning (ML) and natural language processing (NLP) have enabled the development of models that can overcome this problem by extracting only the relevant compounds, making patent resources much more helpful to researchers.

Saber Akhondi, a principal NLP scientist at Elsevier, will explore this topic in a webinar on October 7 titled "Using machine learning to extract chemical information from patents." Topics will include chemical information extraction, the unique challenges of patent mining in the chemical domain, and how to create a quality training set for machine learning in chemistry.

If you’d like to learn more and attend this webinar, register here.
