Scientific Gambling – how do betting shops make money…?

Betting shops are commercial businesses: like any other business, they want to, and must, make money in order to survive. Take a casino as an example: they make money – in the long run – by setting the odds just a tiny bit in their favor; the typical “house advantage” in games like Roulette is 5.26% (American Roulette) and 2.70% (European Roulette). The house advantage tells us the relative amount a player is expected to lose on each play, in the long run. Thus, making a bet of $1, you should expect to lose about 5 or 3 cents, respectively, each time. Over time, these tiny wins for the casino accumulate to quite a lot of money. I have no idea what daily turnover the typical Las Vegas casino has, but let’s say $10,000,000. 5% of that is $500,000 – not bad for spinning a few wheels…
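Those house-advantage figures can be checked directly from the wheel layout; a minimal sketch for a straight-up (single-number) bet paying 35:1:

```python
def house_edge(pockets: int, payout: int = 35) -> float:
    """Expected loss per unit staked on a straight-up (single-number) bet."""
    p_win = 1 / pockets
    ev = p_win * payout - (1 - p_win)   # expected net result of one play
    return -ev                          # flip sign: positive = house advantage

# American wheel: 38 pockets (0, 00, 1-36); European wheel: 37 pockets
print(f"American (38 pockets): {house_edge(38):.2%}")  # 5.26%
print(f"European (37 pockets): {house_edge(37):.2%}")  # 2.70%
```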

The cool thing about games like Roulette is that they are based on “known unknowns”, where the risk is fully understood mathematically, i.e. all the probabilities involved are fully known.

In sports betting, on the other hand, the probabilities are not known, since we are dealing not with risk but with uncertainty, i.e. with Unknown Unknowns. So how do betting shops like Unibet, Svenska Spel and others make money on sports betting…?

Easy: just apply a markup to the probabilities/odds. Below is an illustration, from a simulation run on my computer:

The blue line shows the cumulative returns given “fair odds”, i.e. odds that are a direct reflection of the probability of the game outcome: for instance, if the probabilities for a given game are believed to be 1/3 each for WIN, DRAW and LOSS, then each of these outcomes would have odds set to 3 (decimal), or 2/1 (fractional), that is, you’d get $3 back for your $1 stake if you happened to win. As can be seen from the graph, after a million games the blue line ends up a little bit above 0, meaning that the player – i.e. you, in this case – would leave the casino or betting shop with a small gain.

The red line shows what happens to expected returns when I apply a tiny markup to the probabilities/odds, as all betting shops and casinos do: the line falls almost monotonically into negative territory, i.e. the casino/betting shop is constantly accumulating wins. That’s the “house advantage” at play. In this simulation run I’ve set the house advantage, or markup, to a ridiculously low value; regardless, the result is clear: the house is making money.
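For the curious, a toy version of such a simulation – my own sketch, not the actual code behind the graphs, with an assumed markup of 1% – looks like this:

```python
import random

def cumulative_returns(n_games: int, markup: float, seed: int = 1) -> float:
    """Bettor's total profit after n_games one-unit bets on one of three
    equally likely outcomes, at fair odds 3.0 shaved by `markup`."""
    rng = random.Random(seed)
    odds = 3.0 / (1.0 + markup)        # the bookmaker's (marked-up) payout
    bank = 0.0
    for _ in range(n_games):
        bank -= 1.0                     # place the stake
        if rng.random() < 1 / 3:        # our outcome hits with p = 1/3
            bank += odds                # collect the payout
    return bank

print(cumulative_returns(1_000_000, markup=0.00))  # "blue line": hovers near 0
print(cumulative_returns(1_000_000, markup=0.01))  # "red line": clearly negative
```

With fair odds the expected profit per bet is exactly 0, so the final balance is pure noise; with a markup the expectation per bet is (1/3)·odds − 1 &lt; 0, and over a million bets that tiny per-bet loss dominates the noise.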

Anyone wanting to take a guess on the house advantage set in this example…? 🙂

But remember that in sports betting we are dealing with Unknown Unknowns. That means that, to safeguard against potentially huge losses due to all the uncertainties involved in sports game outcomes, the betting shops need to apply a fairly hefty markup to their odds; otherwise they would run a clear risk of going bankrupt whenever they set the odds way too high.

“Prediction is difficult, particularly about the future”.



Posted in Bayes, Data Analytics, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Pystan, Python, Simulation, Statistics

Scientific gambling – How to identify potentially profitable odds/plays?

In all sports gambling, success or failure is determined by a number of factors, luck not being the least of them, since in any sport there are loads of “Unknown Unknowns“, which we could also call “Uncertainty”. And then there is randomness.

If we compare sports betting with casino-type betting, in casino betting we can talk about “Risk” instead of “Uncertainty”, since for casino games like roulette we can mathematically calculate all the probabilities involved. In sports betting, however, no matter how much data we have, we have no way to mathematically derive the exact odds, because there are many Unknown Unknowns involved in each game. Thus, sports betting is a truly probabilistic endeavor, fully dominated by uncertainty, as opposed to calculable risk.

But, despite the presence of the Unknown Unknowns, we can attempt to do the best we can, by utilizing data that we have available, such as rankings, historical results etc. This is what my Bayesian Inference Engine does, basically calculating the most likely outcomes for any game, given the rankings and historical results (as well as the statistical model I’ve given it).

So, let’s take a concrete example of how I decide upon which games to bet on:

First, I execute my Bayesian Engine on the rankings and historical game data that I’ve gathered, about 1100 world championship games from 2000 to date. Basically, what the engine does is run millions of simulations, trying to come up with model parameters that generate the same results as the historical games. Once these parameters are known, they are used to predict the outcomes of new games. Such a run can take anything from a few hours to days (or even weeks), depending on how many games the model is predicting, how much data there is, and how “deep” I’ve told it to go in its analysis. A typical run covering each day’s 4-6 games takes about 10 hours of CPU time on my machine.
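I’m not showing the engine’s code here, but the core idea – propose model parameters, keep the ones that explain the historical results – can be sketched with a toy Metropolis sampler on made-up data (the real engine uses PYMC and a far richer model; the 7-of-10 record and the Normal(0, 1) prior below are my own illustrative assumptions):

```python
import math
import random

def metropolis_win_prob(wins: int, games: int, n_samples: int = 20_000,
                        seed: int = 42) -> float:
    """Toy Metropolis sampler: infer p(home win) from a head-to-head record.
    Model: strength difference d, p = logistic(d), prior d ~ Normal(0, 1)."""
    rng = random.Random(seed)

    def log_post(d: float) -> float:
        p = 1.0 / (1.0 + math.exp(-d))
        # binomial log-likelihood + Normal(0,1) log-prior (constants dropped)
        return wins * math.log(p) + (games - wins) * math.log(1.0 - p) - d * d / 2.0

    d, samples = 0.0, []
    for _ in range(n_samples):
        proposal = d + rng.gauss(0.0, 0.5)           # random-walk proposal
        if math.log(rng.random() + 1e-300) < log_post(proposal) - log_post(d):
            d = proposal                             # accept the move
        samples.append(1.0 / (1.0 + math.exp(-d)))   # record implied p(win)
    burn = n_samples // 4                            # discard warm-up samples
    post = samples[burn:]
    return sum(post) / len(post)

# Hypothetical head-to-head record: the home team won 7 of 10 meetings
print(f"posterior mean p(home win) ≈ {metropolis_win_prob(7, 10):.2f}")
```

Note how the posterior mean lands a bit below the raw 7/10: with only ten games of data, the prior still pulls the estimate toward 50%.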

When the Bayesian Inference in the above step is done – typically run overnight – it’s time to analyze the predictions and compare them with those of a commercial betting shop, in my case Unibet, trying to identify games where my program has found an advantage in the odds setting.

The first thing I do, is to check for any games where the probabilities between Unibet’s predictions and my predictions differ:


Instead of wading through the massive amount of numbers in rows and columns, the graph above gives me a quick overview of where, according to my program, there might be an odds setting that is advantageous to me: basically, I look for the differently colored sub-bars and compare their sizes. For instance, I can see that for the game RUS-FRA my program has identified the probability of a draw as quite a bit higher than Unibet has; thus, that might be a candidate for placing a bet. Let’s have a closer look:


The graph above shows the differences in probabilities between Unibet’s predictions and mine. The ones that are potentially interesting are the ones on the plus side, i.e. where my program has found a higher probability than Unibet. RUS-FRA is interesting, as well as SWE-BLR and SUI-AUT. Let’s dig deeper, using RUS-FRA as our example:


Here we can see that, given my program’s prediction of the outcome of RUS-FRA vs. the prediction on which Unibet bases its odds, there is an opportunity here to make money – IFF MY PROGRAM’S PREDICTION HAPPENS TO BE CORRECT AND THE SMALL MIRACLE OF FRA HOLDING RUS TO A DRAW HAPPENS! – since, according to my program, the probability of a draw is higher than Unibet believes, and therefore Unibet has set the odds for a draw higher than they should be, as my prediction sees the outcome of that game.

Of course, RUS is still a huge favorite to win, for Unibet as well as for my program, with 8/10 vs 7/10 wins for Russia, respectively, but that small difference is exploitable, as a high risk/high reward gamble, since the odds are set for 8/10 wins, not 7/10 wins, as my program predicts.

Let’s zoom in a bit closer to see why it might be that my program puts a bit higher probability to a draw than does Unibet:


Russia and France have met 4 times at world championship level since 2000, and Russia have won all but one, the 2013 game. Now, I have no insight into how Unibet’s odds compiler ranks this type of “anomaly”, but that single loss could be the difference in predictions.

Anyways, it’s interesting enough for me to put some money on that game. Yes, the likelihood of winning the bet is fairly small – after all, my program thinks that RUS will beat FRA in more than 7 out of 10 games – but the upside is quite large, and thus worth the calculated risk. As stated above, sports betting has loads of Unknown Unknowns, and who knows, perhaps I’m getting lucky….? 🙂

Posted in Bayes, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Python, Statistics

Scientific Gambling on Ice Hockey Worlds – identifying potentially exploitable games

One of the most difficult aspects of dealing with lots of data is presenting the information obtained from various computations in a clear and meaningful way. For instance, in order to identify games where there is a potentially exploitable gambling advantage between the odds given by Unibet and the probabilities obtained from my Bayesian engine, I must compute all predicted game outcomes (WIN/DRAW/LOSS) for both Unibet and my system, and then identify potentially exploitable differences. The graph below is a new attempt to consolidate all this information into a single graph.
Each game has potentially 4 bars. The 3 leftmost are predictions from my system, where the third is the average prediction of the two statistical models I use. The rightmost bar (where it exists) is Unibet’s odds converted to probabilities (taking into account the markup that all betting shops place on their odds).
So, with this graph, the basic process for identifying potentially exploitable games is to compare each of the colored sub-bars from my program with the corresponding Unibet bar. Those bets where any of my sub-bars are taller than Unibet’s are the high risk/high reward games that might be exploitable.
[There are two reasons not all games have all four bars: either there are no previous historical games between the two teams, as for FIN-KOR, or Unibet have not yet published their odds, as for FRA-BLR and DEN-USA.]
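The markup-adjusted conversion of Unibet’s odds to probabilities can be sketched as follows: take the reciprocals of the decimal odds, then renormalize so the three outcome probabilities sum to 1 (assuming, for simplicity, that the bookmaker spreads its markup evenly across the outcomes):

```python
def implied_probs(odds_win, odds_draw, odds_loss):
    """Convert 1X2 decimal odds into outcome probabilities, removing the
    bookmaker's overround (the amount by which the raw reciprocals exceed 1)."""
    raw = [1 / o for o in (odds_win, odds_draw, odds_loss)]
    overround = sum(raw)                     # > 1.0 for any real bookmaker
    return [r / overround for r in raw], overround - 1.0

# Unibet's GER-DEN odds from the May 4th games: 1.94 / 4.25 / 3.20
probs, markup = implied_probs(1.94, 4.25, 3.20)
print([round(p, 3) for p in probs], f"markup ≈ {markup:.1%}")
```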
Posted in Bayes, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Python, Statistics

Scientific Betting on Ice Hockey Worlds now on Facebook

Scientific Gambling on Ice Hockey World Championships 2018


Posted in Bayes, development, Gambling, HOCKEY-2018

Scientific Gambling on Hockey Worlds – Expected profits from games of day 1 & 2

An Expected Value-calculation gives the expected gains from my bets on the games played during the first two days of the tournament as follows:

          OUTCOME  U_ODDS       U_P         P   P_DELTA  EV_PER_UNIT
CZE  SVK     DRAW    5.20  0.192308  0.243738  0.051430     0.267438
     SVK     LOSS    5.80  0.172414  0.169599 -0.002814    -0.016324
GER  DEN     DRAW    4.25  0.235294  0.274173  0.038879     0.165237
     DEN     LOSS    3.20  0.312500  0.341512  0.029012     0.092838
NOR  LAT     DRAW    4.00  0.250000  0.291737  0.041737     0.166949
RUS  FRA     DRAW   10.00  0.100000  0.184241  0.084241     0.842407
SUI  AUT     DRAW    5.50  0.181818  0.320461  0.138643     0.762537
     AUT     LOSS    6.75  0.148148  0.147649 -0.000499    -0.003367
SWE  BLR     DRAW    8.00  0.125000  0.209213  0.084213     0.673707
     BLR     LOSS   11.00  0.090909  0.149934  0.059025     0.649274

In the table above, the ‘OUTCOME’ column shows my bet, from the perspective of the ‘home’ team, ‘U_ODDS’ are the odds given by Unibet, ‘U_P’ are those odds converted to a probability, ‘P’ is the probability that my Bayesian model gives to the outcome, ‘P_DELTA’ is the difference between Unibet’s and my program’s belief about the outcome probability, and ‘EV_PER_UNIT’ is the expected value per unit stake.
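The arithmetic behind the last column is just the textbook expected-value formula; a quick sketch, checked against the RUS-FRA draw row:

```python
def ev_per_unit(odds: float, p: float) -> float:
    """Expected profit per unit staked: win (odds - 1) with probability p,
    lose the 1-unit stake with probability (1 - p) -- i.e. p * odds - 1."""
    return p * odds - 1

# RUS-FRA draw row: U_ODDS = 10.00, my model's P = 0.184241
print(f"U_P = {1 / 10.00:.6f}")                     # Unibet's implied probability
print(f"EV  = {ev_per_unit(10.00, 0.184241):.6f}")  # ≈ 0.8424, as in the table
```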

For these initial games, the calculated profit margin for my bets is 22%.

Below is the Expected Value in graphical form.

(The reason FIN-KOR has no bars in the graph is that those two teams have never met before at championship level, and thus there is not enough data for meaningful inference.)


Posted in Bayes, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Numpy, Probability, PYMC, Python, Statistics

Scientific Gambling on Ice Hockey Worlds – Bets for games of May 5th



I’m using mathematical & statistical methods, more specifically, Bayesian Inference, Markov Chain Monte Carlo, simulation and Probabilistic Programming, attempting to predict the game outcomes of the upcoming Ice Hockey World Championships, starting May 4th.

Based on the findings of my computations, I’m betting, and thus putting some Skin-In-The-Game,  on selected high odds games.


To summarize my betting strategy: take a lot of calculated risk by betting only on high odds games, thereby expecting to lose most of the bets, but the few expected wins at high odds will hopefully compensate for the many losses. Hedge where Bayesian Inference indicates so.


Bets placed for May 5th’s games

Games, Bets & Unibet Odds

  • NOR – LAT, @ DRAW, odds : 4.00
  • SUI – AUT, @ DRAW, odds : 5.50
  • SUI – AUT, @ AUT WIN, odds : 6.75
  • CZE – SVK, @ DRAW, odds : 5.20
  • CZE – SVK, @ SVK WIN, odds : 5.80

Previously placed bets

May 4th’s games

  • RUS – FRA, @ DRAW, odds : 10.00
  • SWE – BLR, @ DRAW, odds : 8.00
  • SWE – BLR, @ BLR WIN, odds : 11.00
  • GER – DEN, @ DEN WIN, odds : 3.20
  • GER – DEN, @ DRAW, odds : 4.25


Posted in AI, Bayes, Big Data, Business, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Python, Statistics

Bayesian Prediction – wanna bet…? Putting your money where your mouth is…

[A disclaimer: I know virtually nothing about contemporary ice hockey – my interest faded when Börje Salming put his skates on the shelf a couple of decades ago – so I have not included any personal hockey insights, e.g. the teams’ current form, into the predictions. My predictions are purely based on:

  • 1) Stats & Mathematics
  • 2) current IIHF Ranking table
  • 3) about 1100 historical championship game results from 2000 and forward
  • 4) a few thousand lines of Python and PYMC code ]

So, after almost four weeks of intense development, my Bayesian Inference Engine for the soon-to-start IIHF 2018 World Championships is about ready for action.  I believe.

Thus, I will use it to predict the outcome of each game during the upcoming championships using its Bayesian Inference Model, and publish the predictions – obviously ahead of each game! – here. All this in the hopes of making some serious money by betting according to the predictions from my statistical model….

That objective leads to a well defined betting strategy: I’m going to focus solely on high-odds games, thus taking a large amount of hopefully well calculated risk, expecting thereby to make huge gains on those bets that actually go my way. This strategy means I’m not going to play any of the low odds games at all, no matter how “safe” they appear.

To summarize the strategy: take a lot of calculated risk by betting only on high odds games, thereby expecting to lose most of the bets, but the few expected wins at high odds will hopefully compensate for the many losses.

So, for those games where my program predicts a result that indicates an advantage over the odds of a specific professional betting shop, Unibet, I will make bets according to the predictions of my program, and publish the outcomes of those bets here. That’s simply “Skin-In-The-Game”, or “Putting Your Money Where Your Mouth Is”, that is, standing up for one’s predictions/beliefs with real skin in the game, as opposed to being just a normal pundit with nothing to lose from erroneous predictions…. 🙂

Anyways, the championships start May 4th, with 4 games [Odds @ Unibet W/D/L]

  • Russia – France  ODDS:[1.12/10/15]
  • USA – Canada  ODDS:[5.30/5.10/1.47]
  • Sweden – Belarus  ODDS:[1.18/8/11]
  • Germany – Denmark ODDS:[1.94/4.25/3.20]

So, let’s compare the odds above, given by Unibet, with the predictions from my statistical inference. In fact, my program computes two predictions, one based on the historical spreads, the other based on the scores of each individual historical game between the teams, hence the two graphs below. (Btw, don’t bother about the x-axis values, they do not correspond to anything real, at least not directly – they are scaled in various ways to make the prediction (hopefully) better…)

Anyways: looking at the graphs below, one of the outputs from my program, both RUS-FRA and SWE-BLR seem very much in line with the Unibet predictions, thus no point in playing those low odds. Same goes for USA-CAN, where my program gives even higher odds than Unibet for a USA win or a draw.

But there’s one (1) game that looks more interesting from a “trying to make money on betting” perspective: GER-DEN. The odds given by Unibet on that game being a draw are 4.25, while the average of my two models predicts 3.4 for a draw. Furthermore, DEN winning gives 3.20 according to Unibet, while my model predicts about 2.7.

Decision time: I’ll bet 1 unit on draw, and 1 unit on DEN winning !

(how much I’m actually betting in real money will remain my secret – after all, I don’t want the tax authorities after me…! 😉 )

So, my bet will cost me 2 “units”. According to my model, I have about a 30% chance of winning the draw bet, which in that case would give me 4.25 “units” back, i.e. a net win of 2.25 units on my total stake. And I have about a 35% chance of winning my bet on DEN winning the game, giving me 3.20 units back, i.e. a net win of 1.20 units.

Of course, I also have about a 35% chance (or risk) of losing both bets, if GER wins, but no guts, no glory…

Another way to put it is in terms of EV, expected value, i.e. the expected return on each “unit” of stake, in the long run of course:

  • GER – DEN DRAW: EV = 0.29
  • GER – DEN DEN WIN: EV = 0.12

While e.g. USA – CAN has an EV of -0.42, which again confirms my decision not to play on that game.

I’d like to end this post with Disclaimer II:

“Prediction is difficult, particularly about the future”

Results of my betting will be published as soon as the games are finished on May 4th.

[EDIT April 25th: After having written a utility for Expected Value analysis last night, I decided to add a few bets on the first-day games, so the full betting list now looks like:

  • RUS-FRA : bet on draw at odds 10
  • SWE-BLR : bet on draw at odds 8
  • SWE-BLR : bet BLR winning at odds 11
  • GER-DEN : bet DEN winning at odds 3.20
  • GER-DEN : bet on draw at odds 4.25 ]







Posted in Bayes, Business, Data Analytics, Data Driven Management, Gambling, HOCKEY-2018, Math, Numpy, Probability, PYMC, Python, Statistics

Bayesian updating with PYMC

I’ve been looking for a neat way to update a Bayesian prior from a posterior sample for a while, and just the other day I managed to find what I was looking for: a code example that shows how to turn a posterior sample into a distribution known to PYMC. The code for doing that, by jcrudy, can be found here. The key insight in that code is the “known to PYMC” part: it’s no problem to turn a posterior sample into a distribution, but in order to make it compatible with PYMC, it needs to be done as a PYMC object. And that’s what jcrudy’s code does.

Anyway, below is a small example, borrowed from Richard McElreath’s excellent book “Statistical Rethinking”.

Let’s pretend that we want to know the ratio of water to land on our earth, and that the way we will go about it is to toss an inflatable globe into the air a number of times, and keep track of whether we first touch land or water when we catch it.

Let’s say that after the first 9 tosses, our results look like below:

[1,0,1,0,1,0,1,1,1] where ‘1’ indicates water, and ‘0’ land.

The basic logic of our Bayesian Inference is to update the prior each time we get a new data point, i.e. either a ‘1’ or a ‘0’.

Let’s furthermore say that we have an initial idea that the correct ratio is somewhere around 50%, but we are far from certain, so we will set a fairly “loose” initial prior, a Beta distribution centered on 50%, as below. This prior allows the calculated ratio to fall almost anywhere in the range 0% to 100%, with the highest probability around 50%.


Now, what we want to do is to update our prior belief – the one initially being centered around 50% – based on each subsequent data point.

Below are the results of 18 such updates (the above vector of 9 points is concatenated with itself to provide the 18 data points):


This graph is read from left to right, and shows the updated prior, or current belief about the correct ratio, after each subsequent data point has been processed.

From the top-left image, we can see that when the first data point, which happens to be a ‘1’, has been processed, the Bayesian model has adjusted its belief about the correct ratio somewhat to the right, from 50% to 55%, and after having processed the ‘0’ of the second data point, reduced the current belief back to 50%. And so on and so forth as the remaining data points get processed.

Finally, after having processed all 18 data points, updating the prior for each point, the Bayesian model settles for a ratio of 64% (mean value). But the real benefit of Bayesian analysis is that we actually get a probability distribution as a result, not just a point estimate. Thus, the Bayesian approach preserves the underlying uncertainty of the model in a very neat, visually obvious way.

If you look closely at the 18 plots above, you will see that the distribution gets narrower as each data point is processed. This means the Bayesian model is gradually getting more and more certain about the sought-after ratio as it gets more data to base its estimate on.
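jcrudy’s trick is needed for arbitrary posteriors; for this particular example, though, the Beta prior is conjugate to the 0/1 data, so the sequential update can be sketched in a few lines without PYMC (the Beta(2, 2) prior parameters are my own guess at a “loose” prior centered on 50%):

```python
# '1' = water, '0' = land: the 9 tosses from the text, concatenated to 18 points
tosses = [1, 0, 1, 0, 1, 0, 1, 1, 1] * 2

a, b = 2.0, 2.0     # Beta(2, 2): loose prior centered on 50% (my assumption)
means = []
for t in tosses:
    a += t          # water observations add to the first shape parameter
    b += 1 - t      # land observations add to the second
    means.append(a / (a + b))    # posterior mean after this data point

print(f"belief after each toss: {[round(m, 2) for m in means]}")
print(f"final posterior mean: {means[-1]:.2f}")    # ≈ 0.64, as in the post
```

The posterior’s spread also shrinks as a + b grows, which is exactly the narrowing visible in the 18 plots.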

Code below.

Continue reading

Posted in Bayes, Data Analytics, Math, Numpy, Probability, PYMC, Python, Statistics

Poor Man’s Betting Shop – Using Bayesian Inference to set up your own Betting Shop


Further exploration of Bayesian Inference, applied to the upcoming 2018 Ice Hockey World Championships. This time, I’m trying to understand how the professional betting shops set their odds, and how they make a profit. It took some ‘research’ into the gaming/betting vocabulary; for instance, the notion of “odds” is not unique – there are several types, including fractional, e.g. 3:1, and decimal, e.g. 2.0. In this article, I’m going to stick with the decimal form of odds, so below, whenever you see a number like 2.0 or 1.5, those numbers are decimal gambling odds, and the way to read them is that for every dollar you bet at those odds, you will get that number back if you happen to win. So basically, your ‘return’, if you happen to win the bet, is your stake multiplied by the odds number. To make it bleeding obvious, let’s take an example:

You decide to bet 1$ on an outcome with 1.0 in odds. If you win, you’ll get your money back, nothing more. If instead the odds were set to 2.0, you would get 2$ back for your 1$ stake, i.e. a nice winning of 1$ (or 100%).

So, with this now hopefully bleedingly clear and obvious, let’s see what my Bayesian inference engine – which at this point has been fed with nearly 1100 historical game results, as well as the current IIHF Rankings table – thinks about the upcoming World Championship group game outcomes:

Below are two tables, one for group A and one for group B. In each table, each game is listed with two lines: the first line shows the ‘raw’ odds, based on the actual probabilities for the game; the second line includes the ‘markup’ that any professional bookie must add to ensure a healthy business. So, to compare these odds with the ones from any professional betting shop, you should use line #2 for each game.
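Comparing the two lines across the tables, line #2 looks like line #1 divided by a constant factor of about 1.10; the 10% figure is my own estimate from the numbers, not a published one. A sketch of that relationship:

```python
def fair_odds(p: float) -> float:
    """Decimal odds that exactly reflect an outcome probability p."""
    return 1 / p

def marked_up(fair: float, markup: float = 0.10) -> float:
    """Bookmaker's odds: the fair odds shaved by the markup factor."""
    return fair / (1 + markup)

# AUT-BLR home win in the tables below: line 1 = 545.45, line 2 = 495.87
print(round(marked_up(545.45), 2))   # close to the table's 495.87
```

Note that a flat divisor pushes near-certain outcomes below 1.0 – exactly the 0.91/0.92 values visible in the tables – which presumably no real bookmaker would ever offer.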

I compared the outcomes from my odds compiler, i.e. the Bayesian Inference Engine I’ve written in Python and PYMC, with the odds compiled by a betting broker on the net, and was pretty surprised to see that in many cases the odds from my program and the ones on that site are not that far from each other… That’s surprising, because my program – at this point – only uses historical results and the current rankings to determine the probabilities of the game outcomes, and I’m pretty sure that professional betting shops put much more data into their odds compilation engines, e.g. the current team roster, form, home vs away game, venue, referees, etc.

The most obvious differences I’ve noticed are two: first, my program appears to place more emphasis on the historical results than the betting shops do. Secondly, for teams with no previous world championship participation, like Korea, my program seems to put less emphasis on the rankings table than the professional shops do.

So, anyways: below are the results. Oh, btw, a few games have a rightmost column in square brackets. Those indicate the odds obtained from the betting broker mentioned above. The reason there aren’t more of them is that the betting folks apparently don’t publish all their odds at once, but piece by piece.

AUT-BLR  : HOME WIN: 545.45 DRAW: 130.43 AWAY WIN: 1.01
         : HOME    : 495.87 DRAW: 118.58 AWAY    : 0.92

AUT-CZE  : HOME WIN: 7500.00 DRAW: 2142.86 AWAY WIN: 1.00
         : HOME    : 6818.18 DRAW: 1948.05 AWAY    : 0.91

AUT-FRA  : HOME WIN: 5.02 DRAW: 5.95 AWAY WIN: 1.58
         : HOME    : 4.56 DRAW: 5.41 AWAY    : 1.44

AUT-RUS  : HOME WIN: 188.68 DRAW: 60.85 AWAY WIN: 1.02
         : HOME    : 171.53 DRAW: 55.32 AWAY    : 0.93

AUT-SUI  : HOME WIN: 9.25 DRAW: 10.31 AWAY WIN: 1.26
         : HOME    : 8.41 DRAW: 9.38 AWAY    : 1.14	[7.20 6.30 1.39]

AUT-SVK  : HOME WIN: 3000.00 DRAW: 389.61 AWAY WIN: 1.00
         : HOME    : 2727.27 DRAW: 354.19 AWAY    : 0.91

AUT-SWE  : HOME WIN: inf DRAW: 3750.00 AWAY WIN: 1.00
         : HOME    : inf DRAW: 3409.09 AWAY    : 0.91

BLR-CZE  : HOME WIN: 267.86 DRAW: 50.85 AWAY WIN: 1.02
         : HOME    : 243.51 DRAW: 46.22 AWAY    : 0.93

BLR-FRA  : HOME WIN: 1.80 DRAW: 5.90 AWAY WIN: 3.63
         : HOME    : 1.64 DRAW: 5.36 AWAY    : 3.30	[2.10 4.34 3.16]

BLR-RUS  : HOME WIN: 17.03 DRAW: 8.90 AWAY WIN: 1.21
         : HOME    : 15.48 DRAW: 8.09 AWAY    : 1.10

BLR-SUI  : HOME WIN: 9.57 DRAW: 5.16 AWAY WIN: 1.43
         : HOME    : 8.70 DRAW: 4.69 AWAY    : 1.30

BLR-SVK  : HOME WIN: 5.44 DRAW: 4.92 AWAY WIN: 1.63
         : HOME    : 4.95 DRAW: 4.48 AWAY    : 1.48

BLR-SWE  : HOME WIN: 24.15 DRAW: 12.18 AWAY WIN: 1.14
         : HOME    : 21.96 DRAW: 11.07 AWAY    : 1.04	[14.25,9.20,1.19]

CZE-FRA  : HOME WIN: 1.13 DRAW: 15.93 AWAY WIN: 19.96
         : HOME    : 1.02 DRAW: 14.48 AWAY    : 18.15

CZE-RUS  : HOME WIN: 3.54 DRAW: 3.46 AWAY WIN: 2.34
         : HOME    : 3.21 DRAW: 3.14 AWAY    : 2.12

CZE-SUI  : HOME WIN: 1.34 DRAW: 5.71 AWAY WIN: 12.85
         : HOME    : 1.22 DRAW: 5.19 AWAY    : 11.68

CZE-SVK  : HOME WIN: 1.27 DRAW: 6.89 AWAY WIN: 14.61
         : HOME    : 1.16 DRAW: 6.27 AWAY    : 13.28	[1.47 6.00 6.70]

CZE-SWE  : HOME WIN: 7.53 DRAW: 3.78 AWAY WIN: 1.66
         : HOME    : 6.85 DRAW: 3.44 AWAY    : 1.51

FRA-RUS  : HOME WIN: 63.56 DRAW: 28.01 AWAY WIN: 1.05
         : HOME    : 57.78 DRAW: 25.46 AWAY    : 0.96	[21.25 13.00 1.12]

FRA-SUI  : HOME WIN: 6.32 DRAW: 5.42 AWAY WIN: 1.52
         : HOME    : 5.75 DRAW: 4.93 AWAY    : 1.38

FRA-SVK  : HOME WIN: 7.36 DRAW: 7.70 AWAY WIN: 1.36
         : HOME    : 6.69 DRAW: 7.00 AWAY    : 1.24

FRA-SWE  : HOME WIN: 123.97 DRAW: 36.36 AWAY WIN: 1.04
         : HOME    : 112.70 DRAW: 33.06 AWAY    : 0.94

RUS-SUI  : HOME WIN: 1.07 DRAW: 18.82 AWAY WIN: 76.73
         : HOME    : 0.97 DRAW: 17.11 AWAY    : 69.75

RUS-SVK  : HOME WIN: 1.42 DRAW: 5.21 AWAY WIN: 9.55
         : HOME    : 1.29 DRAW: 4.74 AWAY    : 8.68

RUS-SWE  : HOME WIN: 1.59 DRAW: 3.89 AWAY WIN: 8.75
         : HOME    : 1.45 DRAW: 3.53 AWAY    : 7.95

SUI-SVK  : HOME WIN: 5.31 DRAW: 5.83 AWAY WIN: 1.56
         : HOME    : 4.83 DRAW: 5.30 AWAY    : 1.42

SUI-SWE  : HOME WIN: 37.59 DRAW: 9.00 AWAY WIN: 1.16
         : HOME    : 34.18 DRAW: 8.19 AWAY    : 1.05

SVK-SWE  : HOME WIN: 6.89 DRAW: 4.91 AWAY WIN: 1.54
         : HOME    : 6.26 DRAW: 4.46 AWAY    : 1.40

CAN-DEN  : HOME WIN: 1.14 DRAW: 14.73 AWAY WIN: 18.69
         : HOME    : 1.03 DRAW: 13.39 AWAY    : 16.99

CAN-FIN  : HOME WIN: 1.97 DRAW: 3.50 AWAY WIN: 4.83
         : HOME    : 1.79 DRAW: 3.18 AWAY    : 4.40

CAN-GER  : HOME WIN: 1.03 DRAW: 40.65 AWAY WIN: 157.89
         : HOME    : 0.94 DRAW: 36.95 AWAY    : 143.54

CAN-KOR  : HOME WIN: 1.21 DRAW: 7.08 AWAY WIN: 28.25
         : HOME    : 1.10 DRAW: 6.43 AWAY    : 25.68

CAN-LAT  : HOME WIN: 1.00 DRAW: 750.00 AWAY WIN: 6000.00
         : HOME    : 0.91 DRAW: 681.82 AWAY    : 5454.55

CAN-NOR  : HOME WIN: 1.01 DRAW: 150.00 AWAY WIN: 681.82
         : HOME    : 0.92 DRAW: 136.36 AWAY    : 619.83

CAN-USA  : HOME WIN: 1.45 DRAW: 5.37 AWAY WIN: 8.08
         : HOME    : 1.32 DRAW: 4.88 AWAY    : 7.35	[1.55 5.50 5.80]

DEN-FIN  : HOME WIN: 184.05 DRAW: 32.64 AWAY WIN: 1.04
         : HOME    : 167.32 DRAW: 29.68 AWAY    : 0.94

DEN-GER  : HOME WIN: 2.98 DRAW: 4.63 AWAY WIN: 2.23
         : HOME    : 2.71 DRAW: 4.21 AWAY    : 2.03	[3.50 4.40 2.05]

DEN-KOR  : HOME WIN: 1.82 DRAW: 2.81 AWAY WIN: 10.41
         : HOME    : 1.66 DRAW: 2.56 AWAY    : 9.46

DEN-LAT  : HOME WIN: 3.05 DRAW: 4.83 AWAY WIN: 2.15
         : HOME    : 2.78 DRAW: 4.39 AWAY    : 1.95

DEN-NOR  : HOME WIN: 6.55 DRAW: 5.03 AWAY WIN: 1.54
         : HOME    : 5.95 DRAW: 4.57 AWAY    : 1.40

DEN-USA  : HOME WIN: 15.22 DRAW: 9.29 AWAY WIN: 1.21
         : HOME    : 13.84 DRAW: 8.45 AWAY    : 1.10	[9.00 7.70 1.35]

FIN-GER  : HOME WIN: 1.17 DRAW: 10.22 AWAY WIN: 20.98
         : HOME    : 1.06 DRAW: 9.29 AWAY    : 19.07

FIN-KOR  : HOME WIN: 1.28 DRAW: 5.87 AWAY WIN: 21.69
         : HOME    : 1.16 DRAW: 5.34 AWAY    : 19.72	[1.04 21.00 42.50]

FIN-LAT  : HOME WIN: 1.38 DRAW: 6.78 AWAY WIN: 7.82
         : HOME    : 1.25 DRAW: 6.16 AWAY    : 7.11

FIN-NOR  : HOME WIN: 1.08 DRAW: 18.58 AWAY WIN: 48.15
         : HOME    : 0.98 DRAW: 16.89 AWAY    : 43.78

FIN-USA  : HOME WIN: 3.28 DRAW: 2.75 AWAY WIN: 3.02
         : HOME    : 2.98 DRAW: 2.50 AWAY    : 2.75

GER-KOR  : HOME WIN: 1.34 DRAW: 4.94 AWAY WIN: 19.60
         : HOME    : 1.22 DRAW: 4.49 AWAY    : 17.81

GER-LAT  : HOME WIN: 1.52 DRAW: 5.32 AWAY WIN: 6.52
         : HOME    : 1.38 DRAW: 4.84 AWAY    : 5.93

GER-NOR  : HOME WIN: 6.90 DRAW: 8.60 AWAY WIN: 1.35
         : HOME    : 6.27 DRAW: 7.81 AWAY    : 1.23

GER-USA  : HOME WIN: 4.73 DRAW: 4.58 AWAY WIN: 1.75
         : HOME    : 4.30 DRAW: 4.17 AWAY    : 1.59

KOR-LAT  : HOME WIN: 12.10 DRAW: 3.15 AWAY WIN: 1.67
         : HOME    : 11.00 DRAW: 2.86 AWAY    : 1.52

KOR-NOR  : HOME WIN: 15.13 DRAW: 4.45 AWAY WIN: 1.41
         : HOME    : 13.75 DRAW: 4.04 AWAY    : 1.28

KOR-USA  : HOME WIN: 21.57 DRAW: 5.47 AWAY WIN: 1.30
         : HOME    : 19.61 DRAW: 4.97 AWAY    : 1.18

LAT-NOR  : HOME WIN: 2.10 DRAW: 4.50 AWAY WIN: 3.31
         : HOME    : 1.91 DRAW: 4.09 AWAY    : 3.01	[2.81 4.40 2.25]

LAT-USA  : HOME WIN: 15.29 DRAW: 5.96 AWAY WIN: 1.30
         : HOME    : 13.90 DRAW: 5.42 AWAY    : 1.19

NOR-USA  : HOME WIN: 30.58 DRAW: 17.50 AWAY WIN: 1.10
         : HOME    : 27.80 DRAW: 15.91 AWAY    : 1.00

Below are the prediction graphs for the expected goal spread for the two groups:


Posted in Bayes, Complex Systems, Data Analytics, development, Gambling, Math, Numpy, Probability, PYMC, Python, Statistics

Bayesian Inference 2018 Ice Hockey World Cup outcomes

I’ve tuned my Bayesian model a bit. Previously, it used the cumulative sum of the historical results’ point spread as its data input; now it uses each individual game’s spread. Perhaps an example can make this clearer:

Consider two teams, A and B, who have met 5 times in the past, with following results:


The above series gives a cumulative spread of +2, which normalized by the number of games becomes 2/5 = 0.4 for team A, indicating that team A is more likely to win. However, that result is highly influenced by the 10-0 result of the first game, while not putting enough weight on the fact that team A actually lost the last four games.

To cope with that, I changed the model to consider each historical outcome individually instead of the cumulative spread, which in the above example would result in team B being regarded as the likely winner.
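With a hypothetical series matching that description – one 10-0 blowout followed by four narrow losses, summing to a +2 spread – the two summaries can be contrasted:

```python
# Hypothetical A-vs-B series matching the description: a 10-0 blowout,
# then four narrow losses; the goal spread sums to +2 for team A
results = [(10, 0), (0, 2), (1, 3), (0, 2), (1, 3)]   # (goals_A, goals_B)

spreads = [a - b for a, b in results]
cumulative = sum(spreads) / len(spreads)   # the old model's input: 2/5 = +0.4
wins_A = sum(1 for s in spreads if s > 0)  # the new model's view, game by game
wins_B = sum(1 for s in spreads if s < 0)

print(f"normalized cumulative spread: {cumulative:+.1f}   (favors A)")
print(f"head-to-head record: A {wins_A} - B {wins_B}      (favors B)")
```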

Below are the new predicted outcomes and corresponding odds.





Posted in Bayes, Data Analytics, Data Driven Management, Numpy, Probability, PYMC, Python, Simulation, Statistics