April 9, 2020
An R package for causal inference using Bayesian structural time-series models
What does the package do?
This R package implements an approach to estimating the causal effect of a designed intervention on a time series. For example, how many additional daily clicks were generated by an advertising campaign? Answering a question like this can be difficult when a randomized experiment is not available.
How does it work?
Given a response time series (e.g., clicks) and a set of control time series (e.g., clicks in non-affected markets or clicks on other sites), the package constructs a Bayesian structural time-series model. This model is then used to predict the counterfactual, i.e., how the response metric would have evolved after the intervention if the intervention had never occurred. The estimated causal effect is the difference between the observed response and this predicted counterfactual. For a quick overview, watch the tutorial video. For details, see: Brodersen et al., Annals of Applied Statistics (2015).
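The counterfactual logic can be illustrated in base R. The sketch below is a deliberate simplification and not the package's method: ordinary least squares stands in for the Bayesian structural time-series model, and the data, the variable names, and the simulated effect of +10 are all assumptions made for illustration.

```r
# Conceptual sketch only: a plain linear regression stands in for the
# Bayesian structural time-series model that the package actually fits.
set.seed(1)
n <- 100
x1 <- 100 + cumsum(rnorm(n))   # simulated control series, unaffected by the intervention
y <- 1.2 * x1 + rnorm(n)       # simulated response series
pre <- 1:70                    # pre-intervention time points
post <- 71:100                 # post-intervention time points
y[post] <- y[post] + 10        # simulated intervention effect of +10

dat <- data.frame(y = y, x1 = x1)

# Learn the response/control relationship on the pre-period only.
fit <- lm(y ~ x1, data = dat[pre, ])

# Predict what the response would have been without the intervention.
counterfactual <- predict(fit, newdata = dat[post, ])

# Effect estimate: observed response minus predicted counterfactual.
pointwise.effect <- dat$y[post] - counterfactual
average.effect <- mean(pointwise.effect)
print(average.effect)
```

Unlike this sketch, the package returns a full posterior distribution over the counterfactual, so effect estimates come with uncertainty intervals rather than a single number.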
What assumptions does the model make?
As with all non-experimental approaches to causal inference, valid conclusions require strong assumptions. In the case of CausalImpact, we assume that there is a set of control time series that were themselves not affected by the intervention. If they were, we might under- or overestimate the true effect, or we might falsely conclude that there was an effect even though in reality there wasn’t. The model also assumes that the relationship between covariates and treated time series, as established during the pre-period, remains stable throughout the post-period (see model.args$dynamic.regression for a way of relaxing this assumption). Finally, it’s important to be aware of the priors that are part of the model (see model.args$prior.level.sd in particular).
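As a minimal sketch of how these two settings might be passed, the example below supplies both through the model.args argument of CausalImpact(). The simulated data and the chosen values are assumptions for illustration, not recommendations, and the behavior of dynamic.regression may depend on the installed versions of CausalImpact and bsts.

```r
library(CausalImpact)

# Simulated data (an assumption for illustration): the response goes in the
# first column, control series in the remaining columns.
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10    # simulated effect of +10 in the post-period
data <- cbind(y, x1)

impact <- CausalImpact(
  data,
  pre.period = c(1, 70),
  post.period = c(71, 100),
  model.args = list(
    # Allow regression coefficients to vary over time, relaxing the
    # assumption of a stable covariate/response relationship.
    dynamic.regression = TRUE,
    # Prior standard deviation of the local level, in units of the data's
    # standard deviation; larger values permit a more volatile level.
    prior.level.sd = 0.1
  )
)
```

Comparing the results obtained under different prior.level.sd values is one way to check how sensitive the conclusions are to the prior.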
How is the package structured?
The package is designed to make counterfactual inference as easy as fitting a regression model, but much more powerful, provided the assumptions above are met. The package has a single entry point, the function CausalImpact(). Given a response time series and a set of control time series, the function constructs a time-series model, performs posterior inference on the counterfactual, and returns a CausalImpact object. The results can be summarized in terms of a table, a verbal description, or a plot.
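The workflow described above can be sketched as follows. The simulated data and period boundaries are assumptions chosen for illustration; in practice the data argument would hold your own response and control series.

```r
library(CausalImpact)

# Simulated data (an assumption for illustration): response in the first
# column, one control series in the second.
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10    # simulated effect of +10 in the post-period
data <- cbind(y, x1)

# Single entry point: fit the model and infer the counterfactual.
impact <- CausalImpact(data, pre.period = c(1, 70), post.period = c(71, 100))

summary(impact)              # results as a table
summary(impact, "report")    # results as a verbal description
plot(impact)                 # results as a plot
```

The returned object also exposes its components directly, for example impact$summary for the summary table and impact$series for the pointwise estimates.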