SWEDEN : COVID DELTA is finally here – 42% increase in weekly “Cases”!

Look at that headline: scary, isn’t it? A 42% increase in cases from the previous week…!

Let’s have a look:

weekly change [truncated y-axis for clarity]

Look at the top plot depicting “cases”: notice how the latest two weeks are both above 1, i.e. the weekly case count is higher than in the preceding week, by 23% and 42%, respectively. Surely a clear sign that the dreaded DELTA is now here…?

Perhaps not… again, it’s important to distinguish between “Relative Shark & Absolute Penguin”, that is, looking at both relative as well as absolute numbers:

Below are the weekly counts of cases, ICUs and deaths:

Covid event counts

I don’t know about you, but to me, there’s a world of difference in conclusions between the “42% increase!” type of headline I’m sure we are now going to see in media, and the conclusions to draw from the bottom graph.
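To make the relative-vs-absolute point concrete, here is a minimal sketch. The weekly counts are made-up numbers chosen for illustration (they are not the Swedish data plotted above); only the arithmetic matters.

```python
# Hypothetical weekly case counts, NOT the actual Swedish figures.
weekly_cases = [40000, 12000, 1500, 1845, 2620]


def weekly_changes(counts):
    """Return (ratio, absolute difference) for each week versus the preceding week."""
    return [(cur / prev, cur - prev) for prev, cur in zip(counts, counts[1:])]


changes = weekly_changes(weekly_cases)
for week, (ratio, diff) in enumerate(changes, start=2):
    pct = (ratio - 1) * 100
    print(f"week {week}: ratio {ratio:.2f} ({pct:+.0f}%), absolute change {diff:+d}")
```

A ratio above 1 produces the scary percentage, while the absolute change next to it shows how small the underlying numbers are compared with the earlier peak.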

Posted in Data Analytics | 2 Comments

Ledarskap (Leadership)

Spot-on ! (Google translation to English below)

“Many of us have experienced a really bad boss. It is only with us humans that a bad leader can become a boss at all. We humans can deceive each other. In a pack of dogs, the leader is the one best suited for leadership. A dog that cannot lead does not become a leader. It is not possible in a dog’s world. The reason is simple and enviable; Dogs can not lie, manipulate and exaggerate or withhold things that may be appropriate to lie low with. A successful leader deserves the trust of the group. It can take time, just like dogs work. They prove themselves in a group, in their flock.”

Posted in Data Analytics | Leave a comment

Does covid cause brain damage? – Sebastian Rushworth M.D.

A recent study purports to show that covid causes brain damage, even in those who only had mild symptoms. Here’s why the study is rubbish.
— Read on sebastianrushworth.com/2021/07/26/does-covid-cause-brain-damage/

Posted in Data Analytics | 2 Comments

SWEDEN : COVID Cases, Tests & Cases per Test

Correlation coefficients:

Posted in Data Analytics | 1 Comment

SWEDEN : Seasonal Deaths 19/20 & 20/21 YTD – Expected, Observed, Excess

Season defined as October – September.

Deaths == All Cause Deaths.
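A minimal sketch of the two definitions above: the October – September season label, and excess as observed minus expected. The function names are my own, and the sketch takes the expected count as a given input, since how the baseline is computed is not stated here.

```python
from datetime import date


def season_of(d: date) -> str:
    """Label a date with its October-September season, e.g. '19/20'."""
    start_year = d.year if d.month >= 10 else d.year - 1
    return f"{start_year % 100:02d}/{(start_year + 1) % 100:02d}"


def excess(observed: float, expected: float) -> float:
    """Excess deaths: observed all-cause deaths minus the expected baseline."""
    return observed - expected


print(season_of(date(2019, 10, 1)))  # first day of season 19/20
print(season_of(date(2021, 7, 1)))   # falls in season 20/21
```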

age group cumulative excess since start of season 19/20
total monthly excess over all age groups
total excess since start of season 19/20 to YTD
Posted in Data Analytics | 2 Comments

A bit more serious attempt to estimate COVID Vaccine Efficacy

A few days ago, I did a quick & dirty Bayesian estimate on Covid Vaccine Efficacy, based on Israeli data, given in a Twitter post (see details on the data in the link above).

As stated in the earlier post, there are several caveats with that analysis, the most fundamental two are:

  • Since I don’t know Hebrew, I’ve not been able to verify the data and its source (although the Twitter post does reference the source, health.gov.il). So, in what follows, I’m assuming that the numbers as given in the Twitter post are accurate.
  • The given data is only for the period of June 27th to July 3rd, meaning that we have very little data to work with, covering only a very short period. Thus, even when assuming the given numbers are accurate, there’s a lot of uncertainty in any estimates based on that small amount of data and that short period. How much uncertainty…? It is exactly in this type of situation – a limited amount of data – that Bayesian Inference can help us get a handle on what that uncertainty looks like and how much of it there is, since it provides not only point estimates (combined with p-values that I’ve never really understood), but full probability distributions.

The model I’m using in this analysis is a bit more complex than the previous one : the current model is hierarchical, using partial pooling of vax status & age group based data, the idea being that regardless of age, there is _some_ commonality between the age groups in terms of vaccine efficacy. The result of this is something called “shrinkage“, where the individual age group based estimates are “pulled” a bit towards the overall mean parameter value.

Warm-up: a quick look at the Pfizer Trial Vaccine Efficacy Data

(For those of you not very interested in the technicalities of Bayesian analysis, feel free to skip this section and jump straight to the section on Israeli data below)

First, let’s apply a very simple Bayesian model (*fully* pooled, like standard A/B-testing) to the Pfizer trial data, from this Lancet article, the one where I believe the 95% Efficacy numbers originate from:

My simplistic model reports an Efficacy rate of 93%, with an 89% credible interval ranging from 0.90 to 0.96. That is, very close to the numbers reported in the official trial data.
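For readers who want to try something similar, here is a minimal sketch of an estimate of this kind. It is not the PyMC model behind the numbers above: it swaps in a conjugate Beta-Binomial posterior sampled with plain Monte Carlo, and it reports an equal-tailed 89% interval rather than an HDI. The case counts and group sizes are assumptions (the widely reported Pfizer/BioNTech phase 3 figures), since the inputs are not listed in this post, and efficacy is taken as 1 minus the ratio of the two incidence rates.

```python
import random

random.seed(1)  # fixed seed so repeated runs give the same draws

# ASSUMED inputs: widely reported Pfizer/BioNTech phase 3 counts, not taken from this post.
CASES_VAX, N_VAX = 8, 18198
CASES_CTRL, N_CTRL = 162, 18325


def efficacy_draws(cases_vax, n_vax, cases_ctrl, n_ctrl, n_draws=20000):
    """Sorted Monte Carlo draws of efficacy = 1 - p_vax / p_ctrl, flat Beta(1, 1) priors."""
    draws = []
    for _ in range(n_draws):
        p_vax = random.betavariate(1 + cases_vax, 1 + n_vax - cases_vax)
        p_ctrl = random.betavariate(1 + cases_ctrl, 1 + n_ctrl - cases_ctrl)
        draws.append(1 - p_vax / p_ctrl)
    return sorted(draws)


draws = efficacy_draws(CASES_VAX, N_VAX, CASES_CTRL, N_CTRL)
mean_eff = sum(draws) / len(draws)
lo = draws[int(0.055 * len(draws))]  # lower bound of the equal-tailed 89% interval
hi = draws[int(0.945 * len(draws))]  # upper bound of the equal-tailed 89% interval
print(f"efficacy mean {mean_eff:.3f}, 89% interval {lo:.3f} to {hi:.3f}")
```

The two groups are estimated independently here, which is the A/B-test style setup described above.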

We can also run a partially pooled model on the Pfizer data for comparison. Basically, the difference is that in the partially pooled model below, the two incidence rates, test group vs control group, are not *fully* independent, instead they experience “shrinkage“, that is, a bit of ‘pull’ from each other, the less data, the more pull. Both incidence rates are therefore pulled down a bit, the group with less data pulled proportionally more.

The canonical analogy (credit to Richard McElreath) for explaining how partially pooled models work is as follows:

Imagine you are a Martian on your first visit to Earth, and you are interested in waiting times at cafés. Before your first visit, you have absolutely no idea how long you’d typically have to wait for your cuppa, so your expectation of the waiting time covers a huge range. Let’s say you visit your first café in Stockholm, and you have to wait 5 min. Now you have a bit more information on what the average waiting time might be, so you update your prior. Next, you visit a few more cafés in Stockholm and, by Bayesian updating, arrive at an expected waiting time of 3 min. Then you travel to Oslo, and study café waiting times there. The question is: should you now reset your expectation to your initial, wide one, since you have no a priori info on the waiting times in cafés in Oslo (your experience is exclusively from Stockholm)? That is, should you forget all about what you learned about café waiting times in Stockholm, now that you are in Oslo? Are you subject to retrograde amnesia all of a sudden, because you moved from Stockholm to Oslo?

If you believe that the waiting times in Stockholm and Oslo are totally independent, i.e. there’s no value in knowing the average waiting time in Stockholm when you are in Oslo, then you should not pool at all, that is, treat Oslo cafés as a totally different species from Stockholm cafés. (This is the model I call “fully pooled” in this post; the more common name for it is “unpooled” or “no pooling”, while “complete pooling” usually means treating all cafés as one and the same.) On the other hand, if you believe that waiting times at Starbucks Stockholm and Starbucks Oslo are probably not that dissimilar, then you want a partially pooled model, that is, one where some of what you learned in Stockholm carries over to Oslo: the data from Stockholm has some influence on your Oslo prediction.
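The café analogy can be sketched numerically. This is a textbook normal-normal shrinkage formula with known spreads, not the hierarchical model used in this post, and every number (waiting times, spreads, visit counts) is hypothetical. The Stockholm average stands in for the overall mean that a real hierarchical model would estimate from all cities.

```python
def partial_pool(group_mean, n_obs, grand_mean, sigma_within, tau_between):
    """Precision-weighted compromise between a group's own mean and the grand mean.

    sigma_within: assumed spread of waiting times between visits within a city.
    tau_between:  assumed spread of average waiting times between cities.
    """
    data_precision = n_obs / sigma_within ** 2
    prior_precision = 1 / tau_between ** 2
    weight = data_precision / (data_precision + prior_precision)
    return weight * group_mean + (1 - weight) * grand_mean


# Hypothetical numbers: 3 min learned in Stockholm, 6 min observed on average in Oslo.
stockholm_mean = 3.0
oslo_after_1_visit = partial_pool(6.0, 1, stockholm_mean, sigma_within=2.0, tau_between=1.0)
oslo_after_20_visits = partial_pool(6.0, 20, stockholm_mean, sigma_within=2.0, tau_between=1.0)
print(oslo_after_1_visit, oslo_after_20_visits)
```

With a single Oslo visit the estimate stays close to what was learned in Stockholm; with twenty visits the Oslo data dominates. The less data, the more pull.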

So, below the Pfizer data modeled with a partial pooling model:

Using the partial pooling model, the efficacy is a smidgen higher: 94% instead of 93%.

We can see the shrinkage clearly by looking at the descriptive stats for the two models (fully pooled vs. partially pooled):

The above table shows the two models, “Full Pool” vs “Partial Pool”, and three descriptive stats for each: mean, HDI low, HDI high (for an 89% Credible interval). We have two parameters of interest: alpha[0], which is the incidence rate for the control group (non-vaccinated), and alpha[1], which is the incidence rate for the test group (vaccinated). These params are on a log-odds scale.

The two p_alpha parameters are derived from the log-odds alphas, and show the incidence on a normal 0-1 probability scale for the two cohorts. Efficacy is on normal probability scale, and the parameter of our main interest. Finally, alpha_bar is the log-odds hyperparam for alpha. alpha_bar makes the model hierarchical, partially pooled and enables shrinkage, and p_alpha_bar is its representation on a normal probability scale.
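The mapping between the log-odds scale and the probability scale can be sketched as follows. The function names and the example incidence rates are my own, chosen only for illustration, and efficacy is assumed to be the standard 1 minus the ratio of the two incidence rates.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Map a log-odds value to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-log_odds))


def logit(p: float) -> float:
    """Map a probability (strictly between 0 and 1) to log-odds."""
    return math.log(p / (1 - p))


def efficacy(alpha_control: float, alpha_vaccinated: float) -> float:
    """Efficacy on the probability scale, from the two log-odds incidence parameters."""
    return 1 - inv_logit(alpha_vaccinated) / inv_logit(alpha_control)


# Hypothetical incidence rates: 1% in the control group, 0.1% in the vaccinated group.
alpha_0 = logit(0.01)
alpha_1 = logit(0.001)
print(alpha_0, alpha_1, efficacy(alpha_0, alpha_1))
```

Small probabilities correspond to strongly negative log-odds, which is why the alpha values in a table like the one above look nothing like the incidence rates they represent.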

If we focus on the two p_alphas, we can see that both of them get a tiny bit smaller when going from the fully pooled to the partially pooled model. That is, both of them are pulled towards a common mean (p_alpha_bar). This is shrinkage in action, enabled by the hyperparam alpha_bar, a common “ancestor” to both alphas.

Finally, for this rather technical section, we can take a look at a forest plot showing the difference between the fully pooled (red) and partially-pooled (orange) models :

For a good intro to hierarchical models, pooling and shrinkage, take a look at this blog post.

Now, let’s look at the Israeli data:

For reference, here’s a dataframe with the data from that Twitter post, combined with point estimates on incidence and efficacy (assuming a population size for each age group) :

Let’s next run the new, hierarchical model on this data, and first look at the distributions for incidence rates, non-vaxed (green) vs vaxed (red):

A couple of observations to make:

  • notice that the credible intervals (≈ confidence intervals, for the Frequentists among us 🙂) are much wider for the non-vaccinated than for the vaccinated. Why…? Because there’s much more data on the vaccinated, since most Israelis are now vaccinated, thus the model is more certain about the vaccinated.
  • notice that for the cohort with the least amount of data, the non-vaxed 80-89 year olds, the incidence rate (as given by the round marker) has been pulled up from its calculated value of 0.03 in the dataframe above, to 0.07. Shrinkage in action!

Next, let’s look at the Vaccine Efficacy distributions, per age group:

Again, a couple of things to notice:

  • Vaccine Efficacy calculated on this (very limited!) data is far from the 95% of the pre-release trials.
  • The uncertainty is huge: the credible interval for all age groups except 80-89 spans zero.
  • Age group 80-89 has been pulled upwards by shrinkage quite a lot. With the very low calculated incidence rates for this group, combined with shrinkage (the non-vax incidence rate being pulled up from 0.03 to 0.07, which is still way lower than the rates for the rest of the non-vax groups), Efficacy for this group ends up at about 50%, albeit with a wide credible interval.

We can also look at a forest plot of the same data:


So, what conclusions regarding Vaccine Efficacy would I draw from this analysis…?

The short answer is: almost none. First, because I don’t have a clue regarding the quality of the data. Secondly, because there’s very little data. And thirdly, because the analysis makes it clear that there’s a lot of uncertainty in the results. But that last point is actually a valuable finding : I’ve seen a lot of Twitter posts referencing this dataset and drawing full blown conclusions from it, but with a bit of Bayesian analysis, we can now appreciate that before jumping to any conclusions, we need more data, and we need to confirm data quality.

Posted in A/B Testing, Bayes, Data Analytics, MCMC, PYMC | Leave a comment

SWEDEN All Cause Mortality Update incl. June 2021

Monthly age group mortality timeline, 2015 – June 2021:

2021 cumulative daily age group mortality:

Monthly cumulative age group deaths, 2015-2021 : observed vs. expected:

Monthly cumulative age group excess deaths, 2015-2021 :

Total cumulative monthly observed vs expected age group deaths:

Monthly age group Excess deaths:

Cumulative monthly age group Excess Deaths:

Cumulative monthly total Excess Deaths:

Posted in Data Analytics | 2 Comments

COVID Vaccine Efficacy – uncertainty reigns

Do you know the vaccine efficacy numbers cited for COVID vaccines, the ones coming from the pre-release trials…? I’ve seen numbers in the 65-95% ball park, as e.g. in this Lancet article which is IMO better understood by reading this blog post.

A few days ago I noticed the below data from Israel, on Covid cases among vaccinated vs non-vaccinated:

Simply by eye-balling the top table, we see that most COVID cases now occur among the vaccinated. By a huge margin. For each age group.

That doesn’t look good…! But there are some caveats. First, look at the column “Percent of Population Vaccinated”: the vast majority of people in Israel are now vaccinated, so unless we expect the vaccines to be 100% effective in blocking the virus (which is in fact what I believed vaccines to be, before the COVID vaccines came into the picture…), chances are that most of the infected will indeed come from the larger cohort (those vaccinated). Secondly, the table does not provide the group sizes, so we can’t calculate the incidence rates for the vaccinated vs the non-vaccinated. Thirdly, the absolute number of cases (relative to the unknown group sizes) is very small, so the uncertainty with so few data points on cases, particularly for the non-vaccinated, is huge.

But we can still do a back-of-the-envelope type of calculation on vaccine efficacy – as long as we keep in mind the uncertainty coming from the small numbers – by assuming the group population size, and I’m going to be lazy and set it to 1M per age group.

With a bit of arithmetic, we get to:

incidence & efficacy based on assumed population size

Efficacy is given by the rightmost column. It’s way below the 65-95% range given e.g. by the Lancet article mentioned above. However, let’s not forget that this was calculated with very little data, so we should be very careful drawing any conclusions from this data, until we have more data on cases.
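The back-of-the-envelope arithmetic can be sketched as below. The 1M population per age group is the assumption made in this post; the 23 vaccinated and 2 non-vaccinated cases are the counts quoted in this post for the 80-89 group; the 90% vaccination share is a hypothetical placeholder, since the table values are not reproduced in the text. The function name is my own.

```python
import math

ASSUMED_GROUP_POP = 1_000_000  # assumption from the post: 1M people per age group


def point_efficacy(cases_vax, cases_nonvax, share_vaccinated, pop=ASSUMED_GROUP_POP):
    """Point estimate of efficacy = 1 - incidence_vax / incidence_nonvax."""
    if not 0 < share_vaccinated < 1:
        raise ValueError("share_vaccinated must be strictly between 0 and 1")
    incidence_vax = cases_vax / (pop * share_vaccinated)
    incidence_nonvax = cases_nonvax / (pop * (1 - share_vaccinated))
    if incidence_nonvax == 0:
        # No cases among the non-vaccinated: the ratio blows up (or is undefined).
        return -math.inf if cases_vax > 0 else math.nan
    return 1 - incidence_vax / incidence_nonvax


# 23 and 2 are case counts quoted in the post; the 0.90 share is HYPOTHETICAL.
eff_80_89 = point_efficacy(23, 2, share_vaccinated=0.90)
print(f"point estimate of efficacy: {eff_80_89:.2f}")
```

Note that the assumed population size cancels out in the ratio: it changes the incidence levels but not the efficacy, which depends only on the case counts and the vaccinated share.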

One way to see how certain / uncertain these numbers are can be obtained by running a Bayesian analysis, to obtain not only point estimates (as above), but full probability distributions for the efficacy rates for the various age groups.

I did a quick & dirty version of such an analysis for 7 of the 8 age groups above (the 90+ group gets an infinite negative efficacy since there are 0 cases in that group within the non-vaccinated):

One way to understand the uncertainty is to look at the 89% Credible Interval, given by the black horizontal bar: it crosses zero for all age groups, meaning that there’s probability mass on both sides of zero, i.e. the vaccine efficacy could be positive or negative. And looking at the last graph, the one for 80-89 year olds, where we have only 23 cases for vaccinated and 2 for non-vaccinated, the credible interval is almost perfectly balanced around zero, meaning that we should probably not put much trust in the efficacy number for that age group.

Nevertheless, it seems that vaccine efficacy in Israel now, when most Israelis are vaccinated, does not reach even close to the 65-95% range given by the pre-release testing. Far from it.

UPDATE: here are similar findings based on UK data https://lockdownsceptics.org/2021/07/17/infections-in-the-vaccinated-overtake-those-in-the-unvaccinated-for-the-first-time-but-the-graph-mysteriously-disappears-from-the-zoe-app-report/

Posted in Data Analytics | 2 Comments

Winter viruses and COVID-19 could push NHS to breaking point, warns new report | Imperial News | Imperial College London

WINTER VIRUSES – COVID-19, influenza, and the respiratory virus Respiratory Syncytial Virus (RSV), could push the NHS to breaking point this winter, a new report says.
— Read on www.imperial.ac.uk/news/226493/winter-viruses-covid-19-could-push-nhs/amp/

“Between 15,000 and 60,000 people could die from influenza this winter according to new modelling for the report, though the planned widespread flu vaccination should help to reduce this risk.”

The next wave of Lockdowns is surely coming up soon…. Caused not by COVID but by Our Old Companion: Seasonal Flu.

Posted in Data Analytics | Leave a comment

SWEDEN : COVID Events Feb 2020 – July 11, 2021 (Delta variant now at 40%)

per week of year
impact per age grp

Finally, those interested might want to run Google Translate on Folkhälsomyndigheten’s latest (July 8th) report on the prevalence of the dreaded DELTA variant in Sweden.

Line chart showing the share of cases that are alpha, beta, gamma, delta and other variants, per week. The chart also shows the total number of cases. Note that the underlying data is incomplete for weeks 24 and 25 and will be adjusted retrospectively as further results are reported.
DELTA Variant (DataSource : https://www.folkhalsomyndigheten.se/nyheter-och-press/nyhetsarkiv/2021/juli/deltavarianten-fortsatter-att-oka-sin-andel-i-sverige/)
Posted in Data Analytics | Leave a comment