AI Modeling in the Time of COVID-19

Posted: 2020-04-14

Author: Admin

Companies of all sizes are facing unprecedented uncertainty and challenges due to the global impacts of COVID-19. It has delivered a major systemic shock to the economy: steep market declines, a substantial increase in volatility across all major markets, federal interest rate cuts and government stimulus, and record levels of unemployment claims. Supply chains and distribution networks have been significantly disrupted, as employees and customers are forced to stay at home.

In a time with such radical upheaval, are current forecasting and customer behavior models still accurate? Let’s examine the effectiveness of the models, determine whether they can account for the new normal, and embrace the best practices to improve the accuracy of prediction.

Models are supposed to work at all points throughout the business cycle—not only in good times, but also in a crisis. Unfortunately, this did not happen during the financial crisis of 2007-08.

One model that was widely used to price and manage the risk of Collateralized Debt Obligations (CDOs) and other securitization products was known as the Gaussian copula function. While this model is mathematically “beautiful,” it’s fatally flawed, and the failure to heed warnings about its limitations contributed to a colossal bubble in the securitization market in 2007-08, leading to losses of trillions of dollars.

Regarding subprime loans—which were at the epicenter of the crisis—only a few years of data were available, and there weren’t sufficient data quality controls in place (an unfortunate example of “garbage in, garbage out”). In general, models were not sufficiently tested through-the-cycle. After the crisis, policymakers and regulators pursued a series of regulatory reforms aimed at managing model risk and forecasting losses in stressful times, increasing required levels of equity capital in order to enhance the loss-absorbing capacity of the banks.

Companies need models that look back far enough to include historical catastrophic events (which may influence risk posture, liquidity, supply chain, or customer behavior) and that capture the current and anticipated impacts on their business of company-specific, industry-specific, and macroeconomic factors. Models that don’t capture these exogenous factors at a sufficient level of granularity will not forecast accurately. Companies that lack internal data spanning several business cycles can draw on relevant external datasets (from third-party vendors, data consortiums, or economic data published by regulators; for example FRED/GeoFRED, ABI, ORX, and ORIC International) to simulate historical shocks that resemble the current situation, and then retrain their models. Capturing dependence relationships with greater precision, taking into account the idiosyncrasies of multivariate distributions (especially their behavior in the extreme tails), helps improve model performance. The best models incorporate as much history as possible while giving appropriately greater weight to recent history to capture current trends.
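
One simple way to keep the full history while still emphasizing recent data is to give each observation a weight that decays with its age. The sketch below is only an illustration: the function names, the 24-month half-life, and the loss rates are all hypothetical, and the right decay rate depends on the business and the model.

```python
def recency_weights(ages_in_months, half_life_months=24.0):
    """Exponential-decay weights: an observation loses half its weight
    for every half-life of age. Age 0 is the most recent observation."""
    return [0.5 ** (age / half_life_months) for age in ages_in_months]


def weighted_mean(values, weights):
    """Weighted average of values; weights do not need to sum to one."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical loss rates, oldest first. The 0.09 observation stands in
# for a past crisis that we want to keep in the history.
loss_rates = [0.02, 0.02, 0.09, 0.03, 0.02, 0.05]
ages = [120, 96, 72, 48, 24, 0]  # months before the latest observation

weights = recency_weights(ages, half_life_months=24.0)
print("weights:", [round(w, 4) for w in weights])
print("unweighted mean:", sum(loss_rates) / len(loss_rates))
print("recency-weighted mean:", weighted_mean(loss_rates, weights))
```

The same weights could be passed as per-sample weights when a model is retrained, so that the crisis period still informs the fit without dominating it.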

While extreme events such as COVID-19 are rare and unpredictable, they still play a critical role in forecasting and customer behavior models. Typically, models fail to adequately capture the risk of extreme events; humans tend to focus on what is reasonably probable at the expense of what is remotely possible (i.e., the tail of the distribution). Training and test datasets containing, for example, 99% favorable and only 1% adverse events are heavily imbalanced, which biases a model toward the majority class. Because tail events are sparse, the data samples must be made sufficiently representative, for instance with data augmentation techniques, to minimize this bias.
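
As a minimal sketch of one such augmentation technique, the adverse class can be oversampled by copying existing adverse rows and adding a small amount of random noise. The function name, the noise scale, and the synthetic 99-to-1 dataset below are assumptions made for illustration; they are not a recommendation for any particular dataset.

```python
import random


def oversample_minority(rows, labels, minority_label=1, jitter=0.01, seed=0):
    """Return copies of rows/labels with the minority class augmented
    until it matches the size of the rest of the data.

    Each new minority row is a randomly chosen existing minority row with
    small Gaussian noise added to every feature.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    if not minority:
        raise ValueError("no minority-class rows to augment")
    majority_count = sum(1 for y in labels if y != minority_label)

    new_rows, new_labels = list(rows), list(labels)
    for _ in range(max(0, majority_count - len(minority))):
        base = rng.choice(minority)
        new_rows.append([x + rng.gauss(0.0, jitter) for x in base])
        new_labels.append(minority_label)
    return new_rows, new_labels


# Synthetic example: 99 favorable rows (label 0) and 1 adverse row (label 1).
rows = [[0.1 * i, 1.0] for i in range(99)] + [[50.0, 9.0]]
labels = [0] * 99 + [1]

aug_rows, aug_labels = oversample_minority(rows, labels)
print(aug_labels.count(0), aug_labels.count(1))  # prints: 99 99
```

Augmentation of this kind is normally applied to the training data only; the test set should keep its original class mix so that evaluation reflects real conditions.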

Thus, accurate characterization of extreme events is critical for understanding the trends and potential impact of such events. AI/ML models may need enhancements to account for the possibility of significant change and to avoid potential algorithmic bias. An extreme value distribution can be embedded within the modeling framework to capture the tail of the observed distribution and help detect Black Swan events such as COVID-19.
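
To make the idea of modeling the tail concrete, here is a rough sketch that uses the Hill estimator, one simple tool from extreme value theory, to estimate how heavy the tail of a loss distribution is and to extrapolate the probability of a very large loss. It is an illustration chosen for brevity, not a full extreme value model: the simulated Pareto losses, the choice of k, and the function names are all assumptions.

```python
import math
import random


def hill_tail_index(samples, k):
    """Hill estimate of the tail index alpha from the k largest samples.
    A smaller alpha means a heavier tail."""
    ordered = sorted(samples, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must be between 1 and len(samples) - 1")
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the k+1 largest samples must be positive")
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def tail_exceedance_probability(samples, k, level):
    """Extrapolated P(X > level) for a level above the (k+1)-th largest
    observation, assuming a power-law tail beyond that threshold."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]
    alpha = hill_tail_index(samples, k)
    return (k / len(samples)) * (level / threshold) ** (-alpha)


# Simulated heavy-tailed losses with a known tail index of 3 (an assumption
# made so the estimate can be checked against the truth).
rng = random.Random(42)
true_alpha = 3.0
losses = [rng.paretovariate(true_alpha) for _ in range(50_000)]

alpha_hat = hill_tail_index(losses, k=1_000)
p_extreme = tail_exceedance_probability(losses, k=1_000, level=20.0)
print("estimated tail index:", alpha_hat)
print("estimated P(loss > 20):", p_extreme, "true value:", 20.0 ** -true_alpha)
```

On real data the estimate can be sensitive to k, so it is usually worth checking how stable it is across a range of values before relying on it.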

How then can we ensure our models accurately account for unprecedented events? Here are some best practices that companies can adopt for models that work—even in times of crisis:

  • Maintain a library of forward-looking scenarios, based on historical and hypothetical events that are exceptional but plausible, to predict revenue, losses, delinquency, borrower spend behavior, etc.
  • Ensure these scenarios reflect macroeconomic/financial conditions that are tailored specifically to stress the organization’s key vulnerabilities and idiosyncratic risks.
  • Retrain models with the most recent data available to capture the trends.
  • Adjust assumptions to the specific scenarios; model assumptions should be reasonable, supportable, reviewed at regular intervals, and documented in a well-thought-out process.
  • Proactively identify data drift and model decay, and recalibrate models in a timely manner.
  • Conduct frequent outcome analysis and back-testing as soon as the ground truth becomes available.
  • Keep the humans in the loop, adjusting the model-based estimates based on careful review and expert judgment to ensure the results make sense.
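
As one hedged example of how the data drift check above might be automated, the sketch below computes a Population Stability Index (PSI) between a reference sample of a feature and a recent sample. The function name, the equal-width binning, and the simulated data are assumptions for illustration; the cut-offs often quoted for PSI (roughly 0.1 and 0.25) are rules of thumb rather than fixed standards.

```python
import math
import random


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a recent sample of one numeric
    feature, using equal-width bins over the reference range."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the end bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Simulated feature: training-period values versus a shifted recent period.
rng = random.Random(7)
reference = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
recent = [rng.gauss(1.0, 1.0) for _ in range(5_000)]

psi_same = population_stability_index(reference, reference)
psi_shift = population_stability_index(reference, recent)
print("PSI, reference vs itself:", psi_same)
print("PSI, reference vs shifted sample:", psi_shift)
```

A check like this can run on a schedule for each key input, with large values prompting a human review and, where warranted, recalibration.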


For models to remain useful over time, organizations must continue to rigorously assess data, maintain the conceptual soundness of the methodology, continuously monitor model performance, and ensure a strong understanding of the context. With these best practices, models become more effective and credible, and they keep working even in uncertain times like these.

About the Author:

Raj Gangavarapu is Head of Data Science at diwo, your intelligent advisor to turn AI into action. He has two decades of leadership experience helping companies to solve complex business problems by leveraging data and analytics. He is a speaker at various academic and industry conferences on data science, risk, Artificial Intelligence (AI) and Machine Learning (ML).