For quite some time, there have been various attempts to investigate whether Lockdowns work or not. And as far as I know, the jury is still out on that question, at least judging by the heated debate in media and social media….
Despite almost all of the world having implemented more or less strict Lockdowns, many countries for very long durations and at a very high cost, no one has yet been able to definitively *prove* that Lockdowns are either effective or not, despite us now having massive amounts of data.
Personally, my belief is that Lockdowns rank very low among the factors that determine the outcome of COVID – there are lots of other, IMO much more important, confounding factors in play, most of them hidden and not part of the Lockdown policies & measures at all, as explained here.
Several months ago, I published a very simplistic Linear Regression of deaths on level of Lockdown, clearly stating in the post that the model was only a quick & crude hack and that no deeper conclusions should be drawn from it. Unfortunately, the graph – not the full post – was spread quite widely on social media, without any of the disclaimers I had posted about its simplicity and limitations, nor what I stated about the dangers of causal inference in general from data where there very likely is a huge amount of confounding, as this post explains.
The previous model was simplistic, and thus not suitable for drawing any conclusions, for several reasons. Firstly, it used averages for the Oxford Index and for the number of deaths. Averages always lose information, and in a growing time series the loss is even worse.
Second, there was a more general problem with the model: the Oxford Index is not a count or ‘metric’ variable, for which the ‘gap’ between subsequent values stays the same over the entire domain, but rather an Ordered Category. That is, even though the index is defined on the range 0..100, nothing says that the ‘gaps’ between values are equally large over the entire range. For instance, is going from 1 to 2 the same change as going from 99 to 100…? Nor is there anything saying that e.g. doubling some index value makes the stringency twice as hard. A traditional, ‘metric’ predictor assumes that moving from one level to the next has the same marginal effect on the outcome, which very plausibly is not true for the Oxford Index. Thus, it’s reasonable to expect the Oxford Index to exhibit a “ketchup effect”: first there comes nothing, then there comes almost nothing, then suddenly all of it comes at once…
So, the Oxford Index is better modeled as an Ordered Categorical Variable, where the ‘gaps’ between values are allowed to be different, allowing for the marginal effect to vary between levels of the index.
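To make the distinction concrete, here is a toy sketch (all numbers invented, purely for illustration) contrasting a metric predictor, where every +1 step carries the same effect, with an ordered-categorical one, where each step gets its own increment:

```python
import numpy as np

# A metric predictor gives every +1 step in the index the same effect b.
b = 0.1

def metric_effect(level):
    return b * level

# An ordered-categorical predictor learns one increment per step
# (9 increments for 10 levels, 0..9); the effect at a level is the sum
# of all increments below it, so the 'gaps' may differ across the range.
# These deltas are made up, not fitted values.
deltas = np.array([0.01, 0.01, 0.02, 0.02, 0.03, 0.05, 0.06, 0.30, 0.50])

def ordered_effect(level):
    return deltas[:level].sum()

# Under the metric predictor, step 1 -> 2 and step 8 -> 9 are identical;
# under the ordered-categorical one, the top step can dwarf the early ones.
```

This is exactly the “ketchup effect” shape: the early steps barely move the outcome, while the last steps carry most of it.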
As a reminder, here is the now-infamous graph:
So, a few days ago, I decided to take another stab at the problem, this time trying to address the shortcomings of the earlier model.
In particular, the new model looks at every single day of data since early March (about 25,000 data points to date), for all countries having passed a given threshold of cases or deaths. It also uses ‘partial pooling’ of the data, so even though the data is conditioned on country (and day into the epidemic), there is some sharing of information between the different categories, so-called “shrinkage”, enabled by a hierarchical Bayesian model. Furthermore, in its final step, the model converts the Oxford Index into an Ordered Category (with 10 levels), to address the problem of non-metric values.
So, let’s have a look at this new model and what it thinks about the association between Oxford Index and daily increment of deaths per million:
Let’s first look at the data, without any attempt to an analysis:
On the X-axis, we have the Oxford Index standardized, meaning that zero represents the average value of the dataset, and each unit is a standard deviation.
On the Y-axis we have daily increments of deaths per million, also standardized. There is HUGE variability in the dataset – the Y-values extend to more than 13 standard deviations above the mean – but I’ve cut the Y-axis for clarity.
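The standardization on both axes is the usual one (shown here with made-up raw values, just to make the scaling explicit):

```python
import numpy as np

# Hypothetical raw values, standing in for the real dataset.
oxford = np.array([10.0, 35.0, 50.0, 72.0, 88.0])   # raw Oxford Index
deaths = np.array([0.1, 0.5, 1.2, 3.0, 8.0])        # raw deaths/million/day

def standardize(x):
    """Center on the dataset mean and scale by the standard deviation."""
    return (x - x.mean()) / x.std()

oxford_std = standardize(oxford)
deaths_std = standardize(deaths)
# After this, 0 means 'dataset average' and each unit is one standard deviation.
```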
From this graph, it’s very hard to detect any reliable “signal” or pattern; it mostly looks like noise. But before trying to find one, let’s get a sense of the variability/spread of the data by looking at the distributions of deaths per million, per category:
The spread in the data is HUGE: category 5 reaches almost 15 standard deviations above the mean, and 7 below it! So there’s a lot of noise in this dataset!
We can also see that most data points reside in predictor category 7, which corresponds to an Oxford Index value of about 2/3 of a standard deviation above the mean.
For everything that follows, I’ve tried to take into account the fact that if a Government implements a new Lockdown Level at day n, then the effect of it, in terms of (hopefully) fewer deaths, will not be immediately visible. So deaths have been shifted forward 21 days in everything that follows. Not sure that’s the right amount of time, but I’ll go with it for now.
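The 21-day shift amounts to pairing each day’s index value with the deaths observed three weeks later. A minimal sketch, with an invented series for one country:

```python
import numpy as np

LAG_DAYS = 21  # assumed policy-to-effect delay; the post treats this as a guess

# Hypothetical daily series for one country, aligned by calendar day.
rng = np.random.default_rng(0)
index_series = rng.uniform(0, 100, size=100)            # daily Oxford Index
deaths_series = rng.poisson(5, size=100).astype(float)  # daily deaths

# Pair the index on day n with the deaths observed on day n + 21,
# dropping the tail where no lagged outcome exists yet.
x = index_series[:-LAG_DAYS]
y = deaths_series[LAG_DAYS:]
```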
Let’s first run a standard Bayesian Regression, with a conventional, ‘metric’ predictor, conditioned on both country and day of epidemic, and let’s make the model hierarchical, thus enabling it to learn from all the data, not just from each individual country, thereby enabling “Shrinkage” and hopefully avoiding over-fitting to the sample:
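This is not the actual hierarchical model, but the “shrinkage” it produces can be illustrated with a minimal numeric toy: per-country estimates get pulled toward the overall mean in proportion to how little data each country has (all slopes, sample sizes, and the prior strength below are invented):

```python
import numpy as np

# Invented no-pooling slope estimates and per-country sample sizes.
raw_slopes = np.array([0.8, -0.3, 0.1, 1.5, -0.9])
n_obs = np.array([10, 200, 50, 5, 80])

grand_mean = np.average(raw_slopes, weights=n_obs)

# In a real hierarchical model this strength is learned from the data;
# here it is a fixed, assumed constant just to show the mechanism.
prior_strength = 50.0
weight = n_obs / (n_obs + prior_strength)
pooled = weight * raw_slopes + (1 - weight) * grand_mean
# Countries with little data are pulled strongly toward the grand mean;
# data-rich countries mostly keep their own estimate.
```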
Here we have the same data as in the first graph, but with a regression overlay. The regression line is almost flat, with a mean slope just barely positive, but with uncertainty about whether it’s positive or negative.
Thus far it looks very similar to the “infamous” simplistic graph above.
Next, let’s bin the values of Oxford Index into 10 equidistant bins and see what that looks like:
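“Equidistant” here means 10 equal-width bins over the observed range of the standardized index. A sketch of the binning (with random stand-in data):

```python
import numpy as np

rng = np.random.default_rng(1)
oxford_std = rng.normal(size=1000)  # stand-in for the real standardized index

# 10 equal-width bins need 11 edges spanning the observed range.
edges = np.linspace(oxford_std.min(), oxford_std.max(), 11)
# digitize returns 1..11 over this range; shift and clip to labels 0..9.
bins = np.clip(np.digitize(oxford_std, edges) - 1, 0, 9)
```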
Still a pretty much flat line, but in the data itself we can start to see something interesting: data points with an Oxford Index near the mean seem to perform worse than either those with very low values or those with extremely high values on the index…! The difference between bins is very small, but it is there all right.
Let’s select a set of countries with very high Oxford Index bins, and look at how many days they’ve spent at each level:
All of these 15 countries have spent the vast majority of their time at very high levels of Lockdown, in the highest 3 bins. If the previous chart gives us anything to look for, it is to examine countries that have spent a lot of time at extreme lockdown levels, and these 15 fulfill that criterion. So let’s run a hierarchical regression on them:
All but 3 of these countries with extremely high Oxford Index values do indeed have slightly downward slopes, meaning that the association between Oxford Index and daily increment of deaths per million is negative; that is, a higher value of the Oxford Index is associated with lower daily death increments (but again, remember: correlation is not causation, at least not without a plausible causal model!)
For brevity I will not post the graphs, but suffice it to say that for all other sets of index values the dominant slope is positive; only selecting exactly the topmost extreme bins of index values results in an overall negative association.
So, if you Lockdown extremely hard, you might actually see deaths decrease slightly, at least for the moment. However, even among those extreme-lockdown countries there is uncertainty about the direction of the effect of Lockdown.
On average, however, for all 100+ countries in this dataset, the expected effect of Lockdowns is positive, meaning that a higher level of Lockdown is associated with a higher number of deaths!
One way to understand this is to imagine Locking down so hard that there is nil/zip/zero/nada social interaction at all: no one gets within a mile of any other person, nor touches anything that anyone else has touched in the past few weeks. Of course the virus will take a break in such circumstances… until these extreme measures are relaxed again, as they surely must be sooner or later (or else we are all going to starve to death).
Finally, let’s take a look at what the model thinks about the outcome and the different Oxford Index bins when changing them into an Ordered Category, and then have a look at each category’s individual importance / impact on the outcome.
Let me briefly explain the gist of the model: the basic idea of Ordered Categories has already been outlined above, but a bit more detail on how the model uses those categories might be helpful. Basically, the model sees each step between category levels as an increment of Oxford Index stringency. So the highest of the 0..9 categories, nr 9, can be seen as the cumulative sum of the stringency of all lower categories; the next-highest category is the cumulative sum of those that came before it, and so on.
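In code, this cumulative-sum construction looks roughly as follows (the deltas here are illustrative numbers chosen to mimic the fitted pattern, not actual model output; only the -0.15 for bE comes from the post):

```python
import numpy as np

# An overall effect bE is split across 9 increments (deltas) that sum
# to 1; the effect of being at category k is bE times the cumulative
# sum of the first k deltas. Deltas below are invented for illustration.
bE = -0.15
deltas = np.array([0.02, 0.02, 0.02, 0.01, 0.01, 0.02, 0.30, 0.30, 0.30])

phi = np.concatenate([[0.0], np.cumsum(deltas)])  # phi[k] for k = 0..9
effect = bE * phi  # per-category contribution to the linear predictor
# In this toy example the top 3 increments carry 90% of the total
# effect, mirroring the pattern the fitted model shows below.
```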
The graph below shows which categories contribute most to the overall outcome:
Above we have the 9 ‘deltas’, that is, the increments between the 10 Categories. Each plot shows the probability distribution for the effect on the overall outcome that stems from moving from the previous category to the next.
In the top left we see the overall result, ‘bE’ (E for effect), which, after the change from a ‘metric’ predictor to an ordered-categorical one, is now slightly but consistently negative, as opposed to all the previous models with a metric predictor, where the slope was more or less flat, with lots of uncertainty.
Next follows the 9 increments in “severity”. What’s interesting to notice here is that the first 7 increments have a very low effect on the overall outcome, 1-2%, while the last 3 together represent about 90% of the total impact…!
So, the interesting finding here is that almost all the effect of a Lockdown comes from the extreme top levels of a binned Oxford Index, that is, these three extreme levels have a very large marginal effect on the outcome (deaths), while the first 7 have almost none…!
It’s important to remember that the absolute effect is nevertheless still very small, as indicated by the mean of -0.15 for the bE parameter.
And that’s exactly what we saw above when we selected 15 countries that each reside in the 3 highest extreme index categories, 7..9: all of a sudden most of the slopes were slightly negative, while for the bulk of the data, residing at less extreme severity levels, the slopes are slightly positive or flat.
We can compare the regression predictions from the ‘metric’ vs the ordered category models, here using the means for the predicted slopes:
From this graph it’s clear that the ordered category version of the model is able to ‘pick up’ an important signal from the data, a signal that the metric predictor fails to identify.
Let’s finally look at the uncertainty stemming from using the two different predictor types, metric vs ordered categorical:
So, we see that for the metric predictor, there’s a lot of uncertainty, not just about the magnitude of the slope, but fundamentally, about its direction, whether it’s positive or negative, while for the ordered category predictor, the slope is clearly negative.
So, what if anything have we learned from this….?
On the technical side of things, we’ve seen that ordered-categorical variables must be dealt with carefully: treating them as ‘metric’ predictors can hide a signal that is actually there in the data.
Wrt the bigger question of whether Lockdowns do anything to mitigate the virus…:
Although it might seem that the data supports that *extremely* high levels of lockdown might slightly reduce deaths, we can also see that the absolute effect is very small, and my fear is that the outcome is also very temporary – the virus will take a break, and return later, when the lockdown, as it sooner or later must be, is relaxed, thus prolonging the epidemic. We humans have never before been able to ‘control’ a virus (people die every winter from the seasonal flu; HIV, which had its outbreak sometime in the late ’80s, still does not have a vaccine, etc.) and I severely doubt we will be able to do so now, with COVID. The virus will do its thing, whatever we humans might think about that. We’d better learn to live with the virus, as we have done with all other viruses in history.
Meanwhile, during a hard lockdown, there is a lot of destruction of societies, economies, individuals and ultimately lives going on, and I fear that those countries that have elected going for hard lockdowns, will eventually end up in a much worse state overall than those that settled for less severe measures for coping with the virus.
A final word of warning: causality does not live within the statistical model. That is, any statistical model, even though it shows an association between the predictor and the outcome, says nothing about causality. A statistical model can provide support for a (separate) causal model, if you have such a model, or it can refute the causal model. The important thing to always remember is that “correlation does not imply causation”. So, in the above example, if we accept that the association between Oxford Index and deaths is negative, without a plausible causal model we cannot say whether deaths being lower when the index is higher is a true effect of the change in predictor – it might be the other way around, since there is no direction in a statistical model. And most importantly, the outcome might very well be caused by a factor that’s not included in the model at all, such as any of the factors in the first chart of this post….