Pharma R&D Today

Ideas and Insight supporting all stages of Drug Discovery & Development


Big Data and Pharmaceutical Drug Discovery

Posted on August 4th, 2016 in Pharma R&D


One of the most frequently heard buzzwords today is “big data”: the application of large volumes of data to extract insights and understand patterns, driving informed decision-making. The term is everywhere, from TV commercials for IBM Watson to conferences where big data is presented as the basis for predictive modeling in every sector, including pharma.

Although the term “big data” is relatively new, some industries have been using the underlying approach for over a decade. Marketing departments continuously harness demographic and socioeconomic data to target prospective customers. Banks use it to detect credit card fraud. Energy companies collect data to understand and predict consumption, helping allocate grid capacity efficiently. And today, in 2016, the growing volume of data being collected offers even greater potential. What better place to make a foray into big data than big pharma?

Problem and Opportunity

It’s no secret that the pharmaceutical research and development sector needs more tools that can reduce discovery and development costs. According to one study, the average cost of bringing a drug to market is as high as $2.6 billion (1). Contrast this with the software industry, where the average cost of bringing a mobile app to market is less than one million dollars (2).

So how does one cut costs? A company must first identify what is driving costs up in the first place. A marketing firm looking to save money would invest its resources in the target market most likely to convert, rather than in the entire population. Translating the same concept to drug discovery, it is in a company’s best interest to invest only in compounds likely to succeed in Phase I through Phase IV clinical trials; over the long term, doing so could recover R&D investments that would otherwise be lost to late-stage clinical failures.
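As a toy illustration of this portfolio logic, the sketch below compares the average spend per approved drug with and without an early predictive filter. Every number here is hypothetical and chosen only to make the arithmetic visible; none comes from the studies cited in this post.

```python
# Toy illustration: expected cost per approved drug, with and without
# an early predictive filter. All figures are hypothetical.

def expected_cost_per_approval(cost_per_candidate, success_rate):
    """Average spend per approved drug, given a per-candidate cost
    and the fraction of funded candidates that reach approval."""
    return cost_per_candidate / success_rate

# Without filtering: assume each funded candidate costs $50M and
# roughly 10% of funded candidates reach approval.
baseline = expected_cost_per_approval(cost_per_candidate=50e6, success_rate=0.10)

# With a predictive filter that doubles the hit rate among funded candidates.
filtered = expected_cost_per_approval(cost_per_candidate=50e6, success_rate=0.20)

print(f"Baseline: ${baseline / 1e6:.0f}M per approval")
print(f"Filtered: ${filtered / 1e6:.0f}M per approval")
```

With these invented inputs, halving the failure rate halves the effective cost per approval, which is the intuition behind investing only in likely winners.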

So how can a pharmaceutical company accomplish this and predict the future success of its NCEs or NBEs? One widespread method is physiologically based pharmacokinetic (PBPK) modeling, which uses advanced mathematical models and simulations to predict how a compound will behave in the body. Another useful and comparatively simple approach is to collect and apply historical data from throughout the drug development process: pre-clinical studies, clinical trials, and post-marketing surveillance, including pharmacovigilance. That data can then be used to build predictive models that estimate the final outcome (FDA approval, patient outcomes) from a variety of independent variables.
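A minimal sketch of the historical-data idea: fit a classifier that maps early-stage development features to a final outcome (approved or not). The feature names and data below are invented for illustration; a real model would be trained on curated pre-clinical, clinical, and post-marketing records, not random numbers.

```python
# Minimal sketch: a classifier from early-stage features to approval outcome.
# Features and outcomes here are synthetic stand-ins for historical records.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical features per compound:
# [scaled trial sample size, genomic MOA rationale (0/1), toxicity flag (0/1)]
X = rng.random((200, 3))

# Synthetic outcome, loosely correlated with the features so the model
# has something to learn.
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] - 0.7 * X[:, 2]
     + 0.3 * rng.standard_normal(200) > 0.4).astype(int)

model = LogisticRegression().fit(X, y)

# Estimated approval probability for a new candidate, per the fitted model.
candidate = np.array([[0.9, 1.0, 0.0]])
print(f"Predicted approval probability: {model.predict_proba(candidate)[0, 1]:.2f}")
```

The output is a probability per candidate, which is exactly the kind of score a development team could use to rank or triage a pipeline.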

Some pharmaceutical companies and researchers are already beginning to do this. In one such study, the Tufts Center for the Study of Drug Development and Janssen Research & Development used an algorithm to identify key early-stage factors associated with the post-marketing success of drugs. The variables found most predictive included study sample size, a genomic-molecular basis for the mechanism of action (MOA), anti-tumor activity, and randomized study data (3, 4). Using these high predictive scores, the algorithm correctly identified successful drug candidates 92% of the time. Separately, researchers from Google and Stanford University released a paper highlighting the potential of deep learning, a branch of machine learning, to identify high-value compounds in drug discovery by analyzing large quantities of biological and clinical data (5). Although this research is still in its preliminary stages, it is something to watch.
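The predictive-score idea can be sketched as a weighted combination of early-stage factors with a short-listing cutoff. The factor names mirror those reported in the Tufts/Janssen study, but the weights, cutoff, and candidate data below are invented for illustration and are not the study's actual model.

```python
# Toy version of a "predictive score": combine early-stage factors into a
# single number and short-list candidates above a cutoff. Weights and data
# are invented; only the factor names come from the cited study.

WEIGHTS = {
    "sample_size_scaled": 0.35,   # larger, well-powered studies
    "genomic_moa": 0.30,          # genomically defined mechanism of action
    "antitumor_activity": 0.20,   # early anti-tumor signal
    "randomized_data": 0.15,      # randomized study design
}

def predictive_score(candidate):
    """Weighted sum of a candidate's factor values (each in [0, 1])."""
    return sum(WEIGHTS[k] * candidate[k] for k in WEIGHTS)

candidates = {
    "NCE-001": {"sample_size_scaled": 0.9, "genomic_moa": 1,
                "antitumor_activity": 1, "randomized_data": 1},
    "NCE-002": {"sample_size_scaled": 0.3, "genomic_moa": 0,
                "antitumor_activity": 1, "randomized_data": 0},
}

CUTOFF = 0.6  # hypothetical short-listing threshold
shortlist = [name for name, c in candidates.items()
             if predictive_score(c) >= CUTOFF]
print(shortlist)  # ['NCE-001'] with these invented weights
```

The real study fit its weights to historical outcomes rather than choosing them by hand, but the short-listing mechanics are the same.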

Because pharmaceutical data is siloed, and because it is so valuable, pharmaceutical companies are beginning to recognize the importance of purchasing analytic services from outside providers to drive their drug discovery processes. In 2014, a Silicon Valley-based company called twoXAR began offering analytic solutions to pharmaceutical companies, identifying potential drug candidates through the analysis of large biomedical datasets. The software company Schrodinger provides predictive insights on compounds by harnessing computational models. And companies such as Cyclica offer predictive analytics software that uses large datasets of proteomic target-ligand information to predict how compounds will act in the body.

It’s clear that big data has arrived in big pharma. But big data in pharma will likely not be a one-size-fits-all proposition. Many types of applications are available or on the horizon, and it will be up to each company to determine which ones best meet its pipeline and business needs. Ultimately, the hope is that these in-silico models, built on all of the information flowing into these large databases, can short-list NCEs and NBEs for further clinical development and predict which compounds will survive safety and efficacy screens. That, in turn, should significantly cut the sunk costs of drug discovery efforts.


All opinions shared in this post are the author’s own.

If you would like to find out more about resolving challenges with data quality and integration, find out how Professional Services can help enable innovative discoveries and more informed decisions.
