Pharma R&D Today
Ideas and Insight supporting all stages of Drug Discovery & Development
Extracting Relevant Chemical Information from Patents with Machine Learning
Posted on October 5th, 2020 by Xuanyan Xu in Chemistry
New chemical compounds and reactions are often introduced to the world through patents, and with little fanfare. It may be years after a patent is filed before these compounds appear in scholarly journals, and only a small share of them are ever published at all. As a result, these compounds can easily remain unknown to researchers who would be very interested in them.
Text mining is one potentially useful way to bring this important chemical information to light, but most text mining approaches do not take into account how relevant a compound is to the patent in which it appears. As a result, too much irrelevant data is extracted, which slows down and complicates the search process.
However, advanced technologies like machine learning (ML) and natural language processing (NLP) have enabled the development of models that can overcome this problem and ensure the extraction of only the relevant compounds – thus making patent resources much more helpful to researchers.
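To make the idea of relevance filtering concrete, here is a minimal Python sketch. It is not the model discussed in the webinar: it swaps the ML model for a simple hand-weighted rule, and the `Mention` fields, section labels, weights, and threshold are all assumptions made up for illustration. It only shows the general shape of the task, which is to score each extracted compound mention and keep the ones that look central to the patent.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    name: str     # compound name as extracted from the patent text
    section: str  # hypothetical section label, e.g. "claims" or "background"
    count: int    # number of times the name occurs in the patent


# Hypothetical hand-set weights. A trained model would learn a scoring
# function from labeled examples instead of using fixed numbers like these.
SECTION_WEIGHT = {"claims": 2.0, "examples": 1.5, "background": 0.2}


def relevance_score(mention: Mention) -> float:
    """Weight the mention by its section and by a capped occurrence count."""
    weight = SECTION_WEIGHT.get(mention.section, 0.5)
    return weight * min(mention.count, 5)


def filter_relevant(mentions, threshold=3.0):
    """Return the names of mentions whose score reaches the threshold."""
    return [m.name for m in mentions if relevance_score(m) >= threshold]


mentions = [
    Mention("compound A", "claims", 4),     # claimed compound, mentioned often
    Mention("ethanol", "background", 12),   # common solvent, background only
    Mention("compound B", "examples", 2),   # appears in worked examples
]

print(filter_relevant(mentions))
```

In this toy input the solvent is dropped even though it is mentioned most often, because frequency alone is a poor signal of relevance. A learned model would replace the fixed weights with patterns derived from a curated training set.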
Saber Akhondi, a principal NLP scientist at Elsevier, will be diving into this topic in a webinar on October 7 titled "Using machine learning to extract chemical information from patents." He will discuss chemical information extraction, the unique challenges of patent mining in the chemical domain, and how to create a quality training set for machine learning in chemistry.
If you’d like to learn more and attend this webinar, register here.
Sr. Marketing Manager, Life Sciences Audience at Elsevier