Pharma R&D Today

Ideas and Insight supporting all stages of Drug Discovery & Development

Where and How to Find Co-Occurrences in Disease Biology Research

Posted on July 29th, 2016 in Chemistry


In research, a co-occurrence refers to two linked concepts appearing within one sentence. In disease biology, a good example would be a sentence with facts that link a protein to a disease, biological functions or other proteins. Finding such information is essential to developing an understanding of biological pathways in both healthy and pathological states.

As I’m sure you can imagine, reading through all of the available literature isn’t a feasible approach. The amount of literature and data is growing constantly. Fortunately, the development of advanced text-mining software is helping researchers to embrace this ever-expanding pool of literature.

Text-mining software uses intelligent search algorithms to retrieve information from text. The best-in-class solutions can be programmed to retrieve sentences with co-occurrences along with the essential contextual information about the source document, enabling researchers to decide whether they need to investigate further. Since these tools can work with full-text articles, not just abstracts, it is possible to find almost all, or even all, co-occurrences in a pool of millions of articles.
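To make the idea concrete, here is a minimal sketch of sentence-level co-occurrence retrieval in Python. It is not how any particular commercial tool works. The term lists, the function names and the naive sentence splitter are all assumptions made for illustration; a real solution would rely on curated ontologies and proper linguistic processing.

```python
import re
from itertools import product

# Hypothetical term lists for illustration only; a real tool would
# draw on curated vocabularies with synonyms and identifiers.
PROTEINS = {"TP53", "BRCA1"}
DISEASES = {"breast cancer", "glioma"}


def split_sentences(text):
    """Naive splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def contains_term(sentence, term):
    """Case-insensitive whole-word match of a term in a sentence."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, sentence, re.IGNORECASE) is not None


def find_cooccurrences(doc_id, text):
    """Return one record per sentence that mentions both a protein and a disease.

    Each record keeps the sentence and the source document identifier,
    so a reader can decide whether to look at the full document.
    """
    hits = []
    for sentence in split_sentences(text):
        for protein, disease in product(sorted(PROTEINS), sorted(DISEASES)):
            if contains_term(sentence, protein) and contains_term(sentence, disease):
                hits.append({
                    "doc_id": doc_id,
                    "protein": protein,
                    "disease": disease,
                    "sentence": sentence,
                })
    return hits


# Made-up sample text, used only to exercise the sketch.
sample = (
    "BRCA1 mutations raise the risk of breast cancer. "
    "TP53 is a tumour suppressor. "
    "Glioma was not studied."
)
for hit in find_cooccurrences("doc-1", sample):
    print(hit["doc_id"], hit["protein"], hit["disease"], "|", hit["sentence"])
```

Only the first sentence of the sample mentions a protein and a disease together, so it is the only one reported. The other two sentences each mention one concept on its own and are skipped.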

This capability to work with full text is very important, as shown by a recent study comparing retrieval of co-occurrences from abstracts with retrieval from full-text literature. The study encompassed 23 million PubMed abstracts and 2.5 million full-text articles. Of the co-occurrences found, 57% were found only in full text, while just 19% were found exclusively in abstracts.
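For readers wondering how shares like these might be derived, the following Python sketch compares two sets of co-occurring pairs. It is an assumption about the general approach, not the study's actual method, and the function name and toy pairs are hypothetical and unrelated to the study's data.

```python
def compare_sources(abstract_pairs, fulltext_pairs):
    """Split all distinct co-occurrence pairs by where they were found.

    Both arguments are sets of (concept_a, concept_b) tuples. The result
    gives each group's share of the combined set of pairs.
    """
    total = len(abstract_pairs | fulltext_pairs)
    if total == 0:
        return {"full_text_only": 0.0, "abstract_only": 0.0, "both": 0.0}
    return {
        "full_text_only": len(fulltext_pairs - abstract_pairs) / total,
        "abstract_only": len(abstract_pairs - fulltext_pairs) / total,
        "both": len(abstract_pairs & fulltext_pairs) / total,
    }


# Toy pairs, invented purely to exercise the function.
abstract_pairs = {("A", "x"), ("B", "y")}
fulltext_pairs = {("B", "y"), ("C", "z"), ("D", "w")}

shares = compare_sources(abstract_pairs, fulltext_pairs)
for group, share in shares.items():
    print(f"{group}: {share:.0%}")
```

Because the three groups do not overlap and together cover every pair, their shares always add up to 100%.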

It is clear that disease biology researchers, and indeed all scientists involved in drug development, need excellent text-mining solutions to master the volume of literature and data that is now available. Those critical co-occurrences are no longer hidden away in the volume of literature. They’re just an easily programmable search away.

To learn more about how text-mining solutions are helping researchers deal with big data and vast pools of literature, see our white paper From Big Data to Drug Targets.


All opinions shared in this post are the author’s own.
