More Mindbending Probability

In a previous post, I discussed the seemingly unintuitive logic of the famous Monty Hall Problem. However, with some careful thinking, even without resorting to Monte Carlo Simulation, I’m able to make sense of that apparent paradox.

The paradox presented here, however, is *really* mindbending. Personally, I’m unable to wrap my head around the result, even after having verified it both by following the rules of probability theory and by running a Monte Carlo simulation. The result is still hard to accept! Even professional statisticians have problems with this one; the interested reader can find even more confusing info regarding this problem here.

If you happen to have a great intuition for how to understand this paradox, please share it in the comment section below!

The problem is taken from Brian Clegg’s brilliant book “Dice World” (much recommended!) and goes by “Born on a Tuesday”, and despite the arguments in the wiki-link above, I’m going to stick to the interpretation of “at least one boy” given in Clegg’s book:

Assume someone tells you: “I have two children. One is a boy born on a Tuesday. What’s the probability that I have two boys?”

Your immediate intuition might tell you that the Tuesday part of the problem is just a red herring, and that the sought-after probability is 1/2. After all, it seems “obvious” that once you know one child is a boy, the only uncertainty is the gender of the second child, so surely that must be 1/2…?

Nope. It might be of some value to look at the outcome space for the problem. With 2 children you could have:

Boy Boy
Boy Girl
Girl Boy
Girl Girl

That is, the outcome space consists of 4 equally likely outcomes, of which 3 are consistent with the fact that the parent has at least one boy. Of those 3 consistent outcomes, only one has two boys. So the probability asked for is 1/3.
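The same count can be done mechanically. A minimal sketch, assuming the four gender combinations are equally likely (the names `outcomes`, `consistent` and `two_boys` are just illustrative):

```python
from fractions import Fraction
from itertools import product

# All (first child, second child) gender pairs: BB, BG, GB, GG
outcomes = list(product("BG", repeat=2))

# Keep only the outcomes consistent with "at least one boy"
consistent = [pair for pair in outcomes if "B" in pair]

# Of those, the outcomes where both children are boys
two_boys = [pair for pair in consistent if pair == ("B", "B")]

p_two_boys_given_one_boy = Fraction(len(two_boys), len(consistent))
print(p_two_boys_given_one_boy)  # 1/3
```

Using `Fraction` keeps the result exact instead of showing a rounded float.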

Let’s verify this with a Monte Carlo simulation:

import numpy as np
import pandas as pd

# generate a pair of children: random gender (1 = boy), random day of week (1..7) #

def gen_children():
    gender = np.random.randint(0,2,2)
    day = np.random.randint(1,8,2)
    return gender,day

siblings = []

iterations = 1000000

for i in range(iterations):
    siblings.append(gen_children())

# column 0 holds the gender pair, column 1 holds the day pair #
df = pd.DataFrame(siblings)

# split each pair into one column per child #
df['gender_1'] = df[0].apply(lambda x : x[0])
df['day_1'] = df[1].apply(lambda x : x[0])
df['gender_2'] = df[0].apply(lambda x : x[1])
df['day_2'] = df[1].apply(lambda x : x[1])

The above code gives us a dataframe of 1M pairs of kids, with random gender and weekday of birth as follows:

(Figure: the dataframe of 1 million pairs of kids)
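As an aside, the same kind of dataframe can be built without a Python-level loop. This is only a sketch of an alternative, not the code used for the results in this post; the function name `simulate_pairs` and the `seed` parameter are my own, and it assumes the same encoding as above (gender 1 = boy, days numbered 1 to 7):

```python
import numpy as np
import pandas as pd

def simulate_pairs(n_pairs, seed=None):
    """Return a dataframe with one row per pair of children."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        # integers(low, high) draws from low..high-1, so 0/1 and 1..7
        "gender_1": rng.integers(0, 2, n_pairs),
        "day_1": rng.integers(1, 8, n_pairs),
        "gender_2": rng.integers(0, 2, n_pairs),
        "day_2": rng.integers(1, 8, n_pairs),
    })

pairs = simulate_pairs(100_000, seed=0)
print(pairs.head())
```

Drawing each column as one array avoids a million separate calls to the random generator, which is usually much faster.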

Next, let’s find the pairs with at least one boy and the pairs where both are boys, and from them compute the probability of two boys given that we know at least one of the children is a boy:

at_least_one_boy = df.loc[ ( df['gender_1'] == 1 ) | ( df['gender_2'] == 1 ) ]

both_boys = at_least_one_boy.loc[ ( at_least_one_boy['gender_1'] == 1 ) &\
                                 ( at_least_one_boy['gender_2'] == 1 ) ]

print ('P(both boys | one boy) : ',len (both_boys) / len(at_least_one_boy))

P(both boys | one boy) :  0.3330825031253711

Indeed, the probability (when the problem statement is interpreted the way Clegg uses it) is 1/3.

However, this answer does not account for the Tuesday part of the problem, the part that I discarded above as a red herring. And this is where the problem becomes, at least for me, extremely unintuitive: it turns out that including the Tuesday part in the problem, believe it or not (and I still have a hard time believing it!), actually changes the probability!

So let’s compute that probability using Monte Carlo Simulation as well:

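# Tuesday, assuming days are numbered 1 (Mon) to 7 (Sun); any fixed day gives the same result #
target_day = 2

at_least_one_boy_born_tuesday = df.loc[ ( ( df['gender_1'] == 1 ) & ( df['day_1'] == target_day ) ) | \
 ( (df['gender_2'] == 1 ) & ( df['day_2'] == target_day ) ) ]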

two_boys_given_at_least_one_born_tue = \
at_least_one_boy_born_tuesday.loc[ ( at_least_one_boy_born_tuesday['gender_1'] == 1) &\
  (at_least_one_boy_born_tuesday['gender_2'] == 1 )]

print ('P(two boys given at least one born Tuesday) : ',
       len (two_boys_given_at_least_one_born_tue) / len(at_least_one_boy_born_tuesday))
P(two boys given at least one born Tuesday) :  0.482585086152667

Wow….! Exactly as Clegg shows in his book, the probability now changes to slightly below 0.5….!

To figure out what happens analytically, let’s enumerate the first few outcomes of the outcome space:

Boy (Mon) Girl (Mon) (1)

Boy (Tue) Girl (Mon) (2)

…

Boy (Sun) Girl (Mon) (7)

Girl (Mon) Girl (Mon) (8)

Girl (Tue) Girl (Mon) (9)

…

Girl (Sun) Girl (Mon) (14)

So there are 14 (2 x 7) combinations of a boy or a girl born on each day of the week, each paired with a girl born on Monday.

In total there are 2 * 7 * 2 * 7 (196) possible combinations in the outcome space: gender[child 1] * day[child 1] * gender[child 2] * day[child 2].

Out of these 196 possible outcomes, we now need to figure out how many of them feature a boy born on a Tuesday.

I’m too lazy to draw the entire matrix on paper, so instead let’s use Python to compute all 196 possibilities:

#### analytic calculation ####
import itertools as it

# gender A, day A, gender B, day B # 
l = [[0,1],[1,2,3,4,5,6,7],[0,1],[1,2,3,4,5,6,7]]

# cartesian product # 
outcome_space = list(it.product(*l))

outcome_space = pd.DataFrame(outcome_space,columns=['gender_A','day_A','gender_B','day_B'])

(Figure: the outcome space dataframe)

# at least one child is boy born Tuesday #

boy_born_tue = outcome_space.loc[ ( ( outcome_space['gender_A'] == 1 ) & \
                                  ( outcome_space['day_A'] == target_day ) ) | \
                                  ( ( outcome_space['gender_B'] == 1 ) & \
                                  ( outcome_space['day_B'] == target_day ) ) ]

two_boys_at_least_one_born_tue = boy_born_tue.loc[ ( boy_born_tue['gender_A'] == 1 ) &\
        ( boy_born_tue['gender_B'] == 1 ) ]

print ('P(two boys at least one born Tue) : ',len (two_boys_at_least_one_born_tue) / len(boy_born_tue))
P(two boys at least one born Tue) :  0.48148148148148145

Wow…! Unfortunately, I have to say that I really do not understand (analytically) why the day of the week changes things, but it sure does. Anyway, I’m in good company, since there seems to have been lots of heated debate among experts in math and probability, and as far as I can tell, the jury is still out…
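The count itself, at least, can be reproduced without listing all 196 rows. A sketch, assuming all gender/day combinations are equally likely (the function name `p_two_boys` and its `n_days` parameter are mine, not Clegg’s): the Tuesday boy can be child A (leaving 2 x 7 = 14 possibilities for child B) or child B (another 14), and the single case where both children are Tuesday boys has then been counted twice, which leaves 27 outcomes. The same reasoning restricted to two-boy outcomes gives 7 + 7 - 1 = 13.

```python
from fractions import Fraction

def p_two_boys(n_days=7):
    """P(two boys | at least one boy born on one specific day out of n_days)."""
    # Outcomes with at least one "target" boy:
    # child A is the target boy (child B free: 2 genders * n_days),
    # plus child B is the target boy (child A free),
    # minus the one outcome counted twice (both are target boys).
    consistent = 2 * n_days + 2 * n_days - 1
    # Same count, but the other child must be a boy (n_days choices each way).
    both_boys = n_days + n_days - 1
    return Fraction(both_boys, consistent)

print(p_two_boys(7))  # 13/27
print(p_two_boys(1))  # 1/3
```

With `n_days=1` the extra detail carries no information and the answer falls back to 1/3. The more specific the detail, the closer the answer gets to 1/2.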

The moral of the story is (again!) that “common sense” and “intuition” aren’t of much help when dealing with probability. It’s extremely easy to shoot yourself in the foot, apparently even for PhDs.

About swdevperestroika

High tech industry veteran, avid hacker reluctantly transformed to mgmt consultant.

2 Responses to More Mindbending Probability

  1. At the risk of becoming more of a ‘stalker’ than a ‘fan’: I really liked the case, or dilemma, or riddle that you presented. I too was totally baffled by the outcome of this (seemingly impossible) ‘thought experiment’. In trying to find a great explanation and visualisation of this problem, I hoped to find an explanation video from one of my favourite YouTube channels on matters like these (3Blue1Brown, Veritasium and Vsauce), but couldn’t find one. So I reached a dead end on ‘understanding’ and ‘feeling’ the problem. I did, however, stumble upon a mathematically pretty simple and elegant explanation that I couldn’t keep from you: from ‘MindYourDecisions’.

    Looking forward to your response and possible other videos/explanations/approaches!

