Pharma R&D Today



Mathematical modeling of the emergence and spread of new pathogens: Insight for SARS-CoV-2 and other similar viruses

Posted on April 24th, 2020 in COVID-19

Knowing whether and how rapidly an emerging pathogen will spread through a population enables public health officials to make well-informed decisions to protect the public. Mathematical modeling can give them a means to predict pathogen spread, but modeling previously unknown pathogens, such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is challenging.

Typically, mathematical modeling requires researchers to acquire at least one dataset with the relevant data points to develop the model and another, similar dataset to validate it. For emerging diseases like novel coronavirus disease 2019 (COVID-19), for which we did not have a readily available diagnostic kit to distinguish SARS-CoV-2–positive cases from negative cases, the validity and completeness of the data are often problematic.

Mathematical modeling strategies

Designing a model requires researchers to make assumptions about which parameters influence pathogen transmission the most, and thus which variables should be included in the model and inferred from data [1,2]. To make modeling easier, researchers may assume that all people are equally susceptible to the pathogen and that the population mixes uniformly, along with other simplifications such as an infinite population size (to avoid modeling births and deaths). However, many researchers favor more realistic approaches and value the addition of more parameters.
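To make these simplifying assumptions concrete, here is a minimal sketch of the classic SIR (susceptible, infected, recovered) compartmental model, which assumes equal susceptibility, uniform mixing, and no births or deaths. The function name, the parameter values (beta, gamma, the initial infected fraction), and the use of simple Euler time steps are all illustrative assumptions and are not taken from any of the studies cited here.

```python
def simulate_sir(beta, gamma, i0=1e-4, days=200, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    s, i, r are fractions of the population, so no population size is
    needed. beta is the transmission rate and gamma the recovery rate
    (both per day). All default values are hypothetical.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        # Uniform mixing: new infections are proportional to s * i.
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Hypothetical rates chosen so that R0 = beta / gamma = 3.
history = simulate_sir(beta=0.6, gamma=0.2)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
final_r = history[-1][3]
print(f"peak at day {peak_time:.1f}, {peak_infected:.1%} infected at once")
print(f"fraction ever infected by day 200: {final_r:.1%}")
```

Adding age structure or contact networks would mean splitting each compartment into subgroups with their own rates, which is where the extra parameters come from.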

Modelers may incorporate into their model variables for population age structure and growth, different infection susceptibilities based on age (or another factor), and social contact patterns. However, the more parameters included, the more mathematically complex the model becomes. Complex models might be more realistic, but they are often not better. With larger numbers of variables, missing or inaccurate data points have more influence over model results, and longer time periods are needed to compute outcomes. In addition, some variables that seem important have little influence over model results.

Models developed for pathogens during their emergence are often inaccurate [3]. Typically, when incomplete datasets are used to develop models, multiple different models appear capable of fitting the existing data points but predict different outcomes.
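A toy illustration of this problem, not drawn from reference [3]: the sketch below compares an exponential growth curve with two logistic curves that share the same starting case count and growth rate. All numbers (10 initial cases, a growth rate of 0.2 per day, the two capacities) are hypothetical. Over the first two weeks the curves are nearly indistinguishable, yet their projections for day 60 differ by orders of magnitude.

```python
import math


def exponential_cases(t, c0, r):
    """Cumulative cases under unlimited exponential growth."""
    return c0 * math.exp(r * t)


def logistic_cases(t, c0, r, capacity):
    """Cumulative cases under logistic growth that levels off at capacity."""
    return capacity / (1.0 + (capacity / c0 - 1.0) * math.exp(-r * t))


C0, R = 10.0, 0.2  # hypothetical starting cases and daily growth rate
models = {
    "exponential": lambda t: exponential_cases(t, C0, R),
    "logistic (capacity 1e4)": lambda t: logistic_cases(t, C0, R, 1e4),
    "logistic (capacity 1e5)": lambda t: logistic_cases(t, C0, R, 1e5),
}


def relative_spread(t):
    """Gap between the highest and lowest model value at time t."""
    values = [model(t) for model in models.values()]
    return (max(values) - min(values)) / min(values)


early_spread = max(relative_spread(t) for t in range(15))
late_spread = relative_spread(60)
print(f"largest disagreement in days 0-14: {early_spread:.1%}")
for name, model in models.items():
    print(f"{name}: {model(60):,.0f} cases projected at day 60")
```

With noisy, incomplete early data, a disagreement of a few percent is well within measurement error, so the data alone cannot say which curve is right.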

COVID-19 findings

For the COVID-19 pandemic, mathematical modeling has been used to estimate several quantities related to pathogen spread, such as the basic reproductive number for SARS-CoV-2 (R0, the number of secondary infections caused by one infection in a completely susceptible population; estimated at 2.8-4.0) [4] and the percentage of infected people who are asymptomatic (~17.9%) [5]. Some modeling studies have shown that population control measures decreased pathogen spread [6,7], and Kucharski et al. found that four independent introductions of SARS-CoV-2 into environments mimicking Wuhan, China would give a >50% chance of the virus becoming established in that population [7]. As of March 29, 2020, no studies published in scientific journals have presented predictions of the extent of pathogen spread globally.
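To show how the number of introduction events relates to the chance of establishment, here is a minimal branching-process sketch. It is not the method used in [7]. It assumes each case infects a negative-binomially distributed number of others, with mean R0 and a dispersion parameter k, and that introductions are independent. The function names and the values of R0 and k in the usage loop are hypothetical.

```python
def extinction_probability(r0, k, tol=1e-12, max_iter=100_000):
    """Chance that the chain started by one case dies out on its own.

    Solves q = G(q) by fixed-point iteration from q = 0, where
    G(q) = (1 + r0 * (1 - q) / k) ** (-k) is the generating function of
    a negative binomial offspring distribution (mean r0, dispersion k).
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 + r0 * (1.0 - q) / k) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def establishment_probability(r0, k, introductions):
    """Chance that at least one of several independent introductions takes off."""
    return 1.0 - extinction_probability(r0, k) ** introductions


# Hypothetical values: R0 = 2.5 with strongly overdispersed transmission.
for n in range(1, 7):
    p = establishment_probability(r0=2.5, k=0.3, introductions=n)
    print(f"{n} introduction(s): {p:.0%} chance of establishment")
```

Smaller values of k mean that most cases infect nobody while a few infect many, so a single introduction often fizzles out and several are needed before establishment becomes likely.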

Modeling pandemic spread

In future efforts to model the pandemic spread of SARS-CoV-2, I would suggest testing a model designed for another RNA virus (or a virus with similar mutation potential) that was tracked by an established surveillance system (yielding a relatively valid dataset) and that caused a respiratory disease (similar transmission capability) in a population that was arguably 100% susceptible: perhaps the H5N1 or H1N1 pandemic strains. One could take a model designed to predict the pandemic spread of a somewhat similar pathogen and plug in another dataset. If the model predicts COVID-19 spread, perhaps we can use it for the next respiratory disease pandemic.


1. Funk S, King AA. Choices and trade-offs in inference with infectious disease models. Epidemics. 2019 Dec 20;30:100383. doi: 10.1016/j.epidem.2019.100383. [Epub ahead of print]

2. Siettos CI, Russo L. Mathematical modeling of infectious disease dynamics. Virulence. 2013 May 15;4(4):295-306. doi: 10.4161/viru.24041. Epub 2013 Apr 3.

3. Pellis L, Cauchemez S, Ferguson NM, Fraser C. Systematic selection between age and household structure for models aimed at emerging epidemic predictions. Nat Commun. 2020 Feb 14;11(1):906. doi: 10.1038/s41467-019-14229-4.

4. Zhou T, Liu Q, Yang Z, Liao J, Yang K, Bai W, Lu X, Zhang W. Preliminary prediction of the basic reproduction number of the Wuhan novel coronavirus 2019-nCoV. J Evid Based Med. 2020 Feb;13(1):3-7. doi: 10.1111/jebm.12376.

5. Mizumoto K, Kagaya K, Zarebski A, Chowell G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Euro Surveill. 2020 Mar;25(10). doi: 10.2807/1560-7917.ES.2020.25.10.2000180.

6. Kraemer MUG, Yang CH, Gutierrez B, et al; Open COVID-19 Data Working Group. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science. 2020 Mar 25. pii: eabb4218. doi: 10.1126/science.abb4218. [Epub ahead of print]

7. Kucharski AJ, Russell TW, Diamond C, et al; Centre for Mathematical Modelling of Infectious Diseases COVID-19 working group. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect Dis. 2020 Mar 11. pii: S1473-3099(20)30144-4. doi: 10.1016/S1473-3099(20)30144-4. [Epub ahead of print] Erratum in: Lancet Infect Dis. 2020 Mar 25.
