Scientific gambling – How to identify potentially profitable odds/plays?

In all sports gambling, success or failure is determined by a number of factors, luck not being the least of them, since in any sport there are loads of “Unknown Unknowns”, which we could also call “Uncertainty”. And then there is randomness.

If we compare sports betting with casino-type betting, in the casino we can talk about “Risk” instead of “Uncertainty”, since for casino games such as roulette we can mathematically calculate all the probabilities involved. In sports betting, however, no matter how much data we have, we have no way to mathematically determine the exact odds, because there are many Unknown Unknowns involved in each game. Thus, sports betting is a truly probabilistic endeavor, fully dominated by uncertainty, as opposed to calculable risk.

But despite the presence of the Unknown Unknowns, we can do the best we can with the data we have available, such as rankings and historical results. This is what my Bayesian Inference Engine does: it calculates the most likely outcomes for any game, given the rankings and historical results (as well as the statistical model I’ve given it).

So, let’s take a concrete example of how I decide which games to bet on:

First, I execute my Bayesian Engine on the rankings and historical game data that I’ve gathered: about 1100 world championship games from 2000 to date. Basically, the engine runs millions of simulations, trying to come up with model parameters that generate the same results as in the historical games. Once these parameters are known, they are used to predict the outcomes of new games. Such a run can take anything from a few hours to days (or even weeks), depending on how many games the model is predicting, how much data there is, and how “deep” I’ve told it to go in its analysis. A typical run covering each day’s 4-6 games takes about 10 hours of CPU time on my machine.
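The engine's code is not shown in this post (the post's tags suggest it is built with PyMC), so the following is only a toy stand-in for the idea of "simulating until the parameters reproduce the historical results". It is a minimal sketch in plain Python, assuming a one-parameter logistic model, a random-walk Metropolis sampler, and made-up game data; it ignores draws entirely, and the names `GAMES`, `win_probability`, `log_posterior` and `metropolis` are all hypothetical.

```python
import math
import random

# Made-up toy data: (ranking gap in favour of team A, did team A win?).
GAMES = [(5, True), (3, True), (1, False), (8, True), (2, True),
         (-4, False), (-1, True), (6, True), (-3, False), (4, False)]


def win_probability(beta, rank_gap):
    """Logistic model: a larger ranking gap means a likelier win for team A."""
    return 1.0 / (1.0 + math.exp(-beta * rank_gap))


def log_posterior(beta, games, prior_sd=5.0):
    """Log of Normal(0, prior_sd) prior times Bernoulli likelihood (up to a constant)."""
    total = -0.5 * (beta / prior_sd) ** 2
    for rank_gap, a_won in games:
        p = min(max(win_probability(beta, rank_gap), 1e-12), 1.0 - 1e-12)
        total += math.log(p if a_won else 1.0 - p)
    return total


def metropolis(games, n_steps=20000, step_sd=0.1, seed=1):
    """Random-walk Metropolis sampler for beta; returns post-burn-in samples."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_posterior(beta, games)
    samples = []
    for _ in range(n_steps):
        proposal = beta + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, games)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
        samples.append(beta)
    return samples[n_steps // 4:]  # discard the first quarter as burn-in


samples = metropolis(GAMES)
beta_mean = sum(samples) / len(samples)
# Predicted win probability for a new game where team A is ranked 5 places higher,
# averaged over the sampled parameter values.
p_new = sum(win_probability(b, 5) for b in samples) / len(samples)
print(f"beta ~ {beta_mean:.2f}, P(A wins | gap=5) ~ {p_new:.2f}")
```

A real model of this kind would have many more parameters and three outcomes per game, which is where run times of hours or days come from; the sampling loop itself has the same shape.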

When the Bayesian inference in the above step is done (it typically runs overnight), it’s time to analyze the predictions and compare them with those of a commercial betting shop, in my case Unibet, trying to identify games where my program has found an advantage in the odds setting.
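A bookmaker quotes odds rather than probabilities, so the comparison needs a conversion step. The sketch below shows one common way to do it, assuming decimal odds and assuming the bookmaker's margin is spread proportionally over the outcomes; how Unibet actually builds its odds is not known here. The odds and model probabilities are invented for illustration, and the function names are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    1/odds overstates each probability because of the bookmaker's margin
    (the "overround"); dividing by the total is one simple way to remove it.
    """
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    overround = sum(raw.values())
    return {outcome: p / overround for outcome, p in raw.items()}


def edges(model_probs, book_probs):
    """Model probability minus bookmaker probability, per outcome."""
    return {outcome: model_probs[outcome] - book_probs[outcome]
            for outcome in model_probs}


# Hypothetical numbers for a single game, NOT actual Unibet odds or model output.
book_odds = {"home": 1.25, "draw": 6.00, "away": 10.00}
model_probs = {"home": 0.70, "draw": 0.18, "away": 0.12}

book_probs = implied_probabilities(book_odds)
diff = edges(model_probs, book_probs)

# Outcomes on the "plus side" are the candidates worth a closer look.
candidates = {o: round(d, 3) for o, d in diff.items() if d > 0}
print(candidates)
```

With these made-up inputs the favorite comes out on the minus side, while the draw and the underdog come out on the plus side.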

The first thing I do is to check for any games where Unibet’s probabilities and my predictions differ:


Instead of looking at the massive amount of numbers in rows and columns, the above graph gives me a quick overview of where, according to my program, there might be an odds setting that is advantageous to me: basically, I look for the differently colored sub-bars and compare their sizes. For instance, I can see that for the game RUS-FRA, my program has put the probability of a draw quite a bit higher than Unibet has, so that might be a candidate for placing a bet. Let’s have a closer look:


The above graph shows the differences in probabilities between Unibet’s predictions and mine. The ones that are potentially interesting are those on the plus side, i.e. where my program has found a higher probability than Unibet. RUS-FRA is interesting, as are SWE-BLR and SUI-AUT. Let’s dig deeper, using RUS-FRA as our example:


Here we can see that, given my program’s prediction of the outcome of RUS-FRA versus the prediction on which Unibet bases its odds, there is an opportunity to make money (IFF MY PROGRAM’S PREDICTION HAPPENS TO BE CORRECT AND THE SMALL MIRACLE OF FRA WINNING HAPPENS!): my program puts a higher probability on a draw than Unibet does, and so, as my prediction sees that game, Unibet has set the odds for a draw higher than they should be.

Of course, RUS is still a huge favorite to win, for Unibet as well as for my program, with 8/10 vs 7/10 wins for Russia, respectively. But that small difference is exploitable as a high risk/high reward gamble, since the odds are set for 8/10 wins, not the 7/10 wins my program predicts.
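To see why a gap between 8/10 and 7/10 can matter, here is a small expected-value calculation. It is a sketch under simplifying assumptions: the two figures are taken as exact probabilities, the bet is a hypothetical single bet on "Russia does not win", and the bookmaker's margin is ignored, so real odds would be somewhat less generous. The function name `expected_value` is made up for this example.

```python
def expected_value(p_win, decimal_odds, stake=1.0):
    """Expected profit of a bet: win (odds - 1) * stake with probability
    p_win, lose the stake otherwise."""
    return p_win * (decimal_odds - 1.0) * stake - (1.0 - p_win) * stake


# From the text: Unibet's odds reflect 8/10 for Russia, the program says 7/10.
p_book_not_rus = 1.0 - 0.8
p_model_not_rus = 1.0 - 0.7

# Assumption: margin-free odds on a hypothetical "Russia does not win" bet.
fair_odds = 1.0 / p_book_not_rus  # about 5.0

ev_if_book_right = expected_value(p_book_not_rus, fair_odds)    # about 0
ev_if_model_right = expected_value(p_model_not_rus, fair_odds)  # about +0.5 per unit staked
print(round(ev_if_book_right, 3), round(ev_if_model_right, 3))
```

Even with a positive expected value, such a bet still loses about 7 times out of 10 if the program is right, which is what makes it high risk/high reward.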

Let’s zoom in a bit closer to see why my program might put a somewhat higher probability on a draw than Unibet does:


Russia and France have met 4 times at world championship level since 2000, and Russia has won all but one, the 2013 game. Now, I have no insight into how Unibet’s odds compiler weighs this type of “anomaly”, but that single loss could be the difference in predictions.
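How much a single loss can move an estimate when there are only four games is easy to illustrate with a Beta-Binomial update. This is not the model used by the engine (which also uses rankings and all the other historical games); it is a deliberately simple sketch assuming a uniform Beta(1, 1) prior and a win/no-win outcome, with a hypothetical function name.

```python
from fractions import Fraction


def posterior_mean_win_rate(wins, games, prior_a=1, prior_b=1):
    """Posterior mean of a win probability under a Beta(prior_a, prior_b)
    prior after observing `wins` in `games` (conjugate Beta-Binomial update)."""
    return Fraction(prior_a + wins, prior_a + prior_b + games)


# From the text: 4 meetings, Russia won all but one.
with_loss = posterior_mean_win_rate(wins=3, games=4)

# Counterfactual: the same 4 meetings without the 2013 loss.
clean_sweep = posterior_mean_win_rate(wins=4, games=4)

print(with_loss, clean_sweep)  # 2/3 5/6
```

On head-to-head data alone, one loss in four games shifts the estimated win rate from 5/6 to 2/3, so how heavily a model weighs that small sample can plausibly account for a visible difference in predictions.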

Anyway, it’s interesting enough for me to put some money on that game. Yes, the likelihood of winning the bet is fairly small (after all, my program thinks that RUS will beat FRA in more than 7 out of 10 games), but the upside is quite large, and thus worth the calculated risk. As stated above, sports betting has loads of Unknown Unknowns, and who knows, perhaps I’m getting lucky….? 🙂

About swdevperestroika

High tech industry veteran, avid hacker reluctantly transformed to mgmt consultant.
This entry was posted in Bayes, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Python, Statistics.
