She went on later to talk about the Tuesday's child problem, which is even more interesting and confounding.
I have a friend who has two children. He has a son born on a Tuesday. What is the probability his other child is a girl?
Compared with this problem:
I have a friend who has two children. One, I know, is a boy. What is the probability the other child is a girl?
And this one:
I have a friend who has two children. The older is a boy. What is the probability the other child is a girl?
Why do these three probabilities differ? In particular, since the last two may be more obviously different, why is the first one equal to neither of the others?
I spent the best part of a week in grad school pondering this until I came up with a deeper intuition. But the results are still surprising, I think.
Maybe it's easier to think of coin tosses. Imagine we are sampling from a huge population of pairs of coin tosses.
The distribution in order of toss:
HH 1/4
HT 1/4
TH 1/4
TT 1/4
Problem two: Two coins were tossed, and at least one is heads. What is the probability the other is tails?
Look at the first three rows of the distribution (the ones containing a head). The rows with exactly one tail have total weight 2/4, while the single row with both heads has weight 1/4. So the probability of tails is (2/4)/(3/4) = 2/3.
Problem three: Two coins were tossed, and the first was heads. Now we look at the first two rows of the distribution. HT and HH have equal weight, so it's (1/4)/(1/4+1/4) = 1/2.
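For anyone who wants to check problems two and three mechanically, here is a minimal sketch that conditions on the four-row distribution above. The helper name `conditional` is my own, not anything from the thread, and the coins are assumed fair and independent.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two fair coin tosses: HH, HT, TH, TT, each with weight 1/4.
outcomes = {"".join(p): Fraction(1, 4) for p in product("HT", repeat=2)}

def conditional(event, given):
    """P(event | given), summing weights over the rows that survive the filter."""
    kept = {o: w for o, w in outcomes.items() if given(o)}
    return sum(w for o, w in kept.items() if event(o)) / sum(kept.values())

# Problem two: at least one coin is heads; is the other tails?
p_two = conditional(lambda o: "T" in o, lambda o: "H" in o)

# Problem three: the first coin is heads; is the second tails?
p_three = conditional(lambda o: o[1] == "T", lambda o: o[0] == "H")

print(p_two, p_three)  # 2/3 1/2
```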
First problem: Two coins were tossed, and one was tossed on a Tuesday and is heads. What is the probability that the other is tails?
Let's assume all coins are also tossed on random days of the week. We are now interested in the sub-population where at least one coin was tossed on a Tuesday and was heads.
If we tabulate all combinations in the population that we are interested in, the x axis is the first coin's weekday and the y axis is the second coin's weekday:
Edit: to put it in words, the slight asymmetry comes from the case where the other child was also born on a Tuesday. In the population where both children were born on a Tuesday, there are more boy-girl + girl-boy samples than boy-boy samples.
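A rough sketch that prints such a weekday-by-weekday grid for the coin version, assuming fair coins and uniformly random toss days. The layout and the names (`matches`, `cells`) are my own choices for illustration. Each cell lists the face pairs (first coin, second coin) that survive the "at least one heads tossed on Tuesday" filter.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def matches(face1, day1, face2, day2):
    """True when at least one coin is heads and was tossed on Tuesday."""
    return (face1 == "H" and day1 == "Tue") or (face2 == "H" and day2 == "Tue")

# cells[(day1, day2)] lists the face pairs that survive the filter.
cells = {
    (d1, d2): [f1 + f2 for f1, f2 in product("HT", repeat=2) if matches(f1, d1, f2, d2)]
    for d1, d2 in product(DAYS, repeat=2)
}

# Columns are the first coin's day, rows are the second coin's day.
print("     " + " ".join(f"{d:<8}" for d in DAYS))
for d2 in DAYS:
    row = [",".join(cells[(d1, d2)]) or "-" for d1 in DAYS]
    print(f"{d2:<5}" + " ".join(f"{label:<8}" for label in row))

survivors = [pair for pairs in cells.values() for pair in pairs]
with_tails = [pair for pair in survivors if "T" in pair]
p_other_tails = Fraction(len(with_tails), len(survivors))
print(len(survivors), len(with_tails), p_other_tails)  # 27 14 14/27
```

Only the Tuesday row and Tuesday column are non-empty, and the Tue/Tue corner holds three pairs (HH, HT, TH) rather than four, which is where the asymmetry shows up.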
Yup. Also, have a look at my response to javajosh, below. There I go one step further in describing my 'intuition' for what is happening, which generalises beyond these three cases.
My intuition is totally failing me here, because I can't get over the 1/2. If I have a friend who has one child, and he's about to have another, what's the probability that it will be a girl? Clearly 1/2. What's the difference between asking any of these questions and the one I just asked? I can't find any difference.
Right, that's the last case. The eldest is a boy (or a girl, it doesn't matter), so the younger is equally likely to be either.
The middle case is 2/3 likely to be a girl. Think of the possible sexes of two children, given in birth order: MM, MF, FM, FF. I tell you one is a boy, so we have MM, MF, FM; in two of the three cases, the other is a girl.
In the first case the answer is 14/27 for a girl, so almost 1/2, but not quite. To figure this out, consider all 196 combinations of two children with birth-days-of-the-week, cross out all those that don't match the phrase "he has a son born on a Tuesday", and you're left with 27 possibilities. In 14 of them the other child is a girl (she could be born first or second, and on any day of the week, so 2x7=14). In 13 the other is a boy (6 when the other is an older boy born on a non-Tuesday, 6 when it is a younger boy born on a non-Tuesday, and 1 when both are born on a Tuesday).
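The crossing-out can be done by brute force. This is only a sketch under the usual toy assumptions (sexes and weekdays independent and equally likely); the day numbering and the names `is_tuesday_son`, `matching`, and so on are mine.

```python
from itertools import product
from fractions import Fraction

DAYS = range(7)   # weekdays numbered 0..6
TUESDAY = 1       # which number stands for Tuesday is arbitrary

children = list(product("MF", DAYS))           # 14 kinds of child
families = list(product(children, repeat=2))   # 196 ordered (older, younger) pairs

def is_tuesday_son(child):
    return child == ("M", TUESDAY)

# Cross out every family without a son born on a Tuesday.
matching = [f for f in families if is_tuesday_son(f[0]) or is_tuesday_son(f[1])]

other_is_girl = [f for f in matching if "F" in (f[0][0], f[1][0])]
two_boys = [f for f in matching if f[0][0] == "M" and f[1][0] == "M"]
both_tuesday = [f for f in two_boys if is_tuesday_son(f[0]) and is_tuesday_son(f[1])]

print(len(families), len(matching))                          # 196 27
print(len(other_is_girl), len(two_boys), len(both_tuesday))  # 14 13 1
print(Fraction(len(other_is_girl), len(matching)))           # 14/27
```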
So that's not the intuition, as much as the answer.
The intuition is that as you add more specific knowledge about which child you mean, that detail contributes less to the probability of sex and becomes more of an irrelevant label. The specific detail can be anything. Do it with birthday-of-the-year (more specific than day of the week) and it is even closer to 1/2. Do it with something unique, like their phone number (or their birth order, or 'the taller one is a boy'), and it is exactly 1/2. Be less specific, so that they both toss a coin and the statement is 'one of the children tossed heads and is a boy', and it is 4/7 for a girl (3/7 for a boy).
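To see the trend, here is a sketch that generalises the counting to a trait with k equally likely values (k=2 for a coin toss, k=7 for a weekday, k=365 for a day of the year, ignoring leap years). The function name and the choice of trait value 0 as "the one mentioned" are mine. Counting by hand gives 2k/(4k-1) for the other child being a girl, so the coin case comes out as 4/7 for a girl and 3/7 for a boy.

```python
from fractions import Fraction
from itertools import product

def p_other_is_girl(k):
    """P(other child is a girl | at least one child is a boy with trait value 0),
    where each child independently gets one of k equally likely trait values."""
    children = list(product("MF", range(k)))
    families = [f for f in product(children, repeat=2) if ("M", 0) in f]
    girls = [f for f in families if any(sex == "F" for sex, _ in f)]
    return Fraction(len(girls), len(families))

cases = [(1, "no extra detail"), (2, "coin toss"), (7, "weekday"), (365, "day of year")]
for k, label in cases:
    p = p_other_is_girl(k)
    print(f"{label:>16}: {p} = {float(p):.4f}")
```

As k grows the answer slides from 2/3 toward 1/2 without ever reaching it, which is the "more specific means closer to 1/2" idea in numbers.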
Maybe I begin to see the light. First, there is an important difference between considering a random event on its own and several random events together: in essence, when your friend says he has a boy, he is telling you a little something about two ordered events. That's why you count the MF and FM cases separately (which is the counter-intuitive part).
The actual Tuesday problem requires a little more thought. Somehow that extra information makes you drop a case! I can't help but think that the ordering is the critical factor, and that it's not a coincidence that the "extra information" has to do with time. In other words, the dropped case happens because we drop the distinctive ordering when the two boys are born on a Tuesday.
You picked up on a really important issue that is often assumed. Thank you for not letting me just slip it under the radar without question.
To answer your question: we treat the MF and FM differently, just for ease of calculation.
What we're doing is taking a set of possible situations (different 2-child families), each with an attached probability (I'll come back to this), throwing away those that don't match some known information (i.e. those without a boy), taking those that do, grouping them into sets (the set of families with the other being a girl, and the set where the other is a boy), and calculating the probability the actual scenario was in each of those sets.
Now the 'attached probability' probably sounded odd in there, because that's not what we've done, right? We've just been counting situations. True, but that's because we carefully started with situations that are all equally likely. So rather than bugger about with those attached probabilities, we could just use raw counts, knowing that it would work.
So, if we start with four possible sexes of two children: MM, MF, FM, FF, those are all equally likely. I'm using the birth order as a way of breaking the two MF cases apart, so I've got two cases that are equally likely, letting me do the counting trick. I can just count the 2 out of 3 remaining situations that match the information, so 2/3. Easy.
But I don't have to do it that way.
Let's say we follow your intuition and start with three possible sex combinations: MM, MF, and FF. Now, those aren't equally likely, right? There are twice as many families with MF children as with MM children. So I need to use the attached probabilities.
MM (1/4 of 2-child families), MF (1/2), FF (1/4)
We can remove the FF as inconsistent with the information (as before), then the probability the parent has an MF family (i.e. their other child is a girl) is
(1/2)/(1/2 + 1/4) = 2/3
as before.
In fact, for any reasonable calculation based on data, we're almost certainly not going to be able to use the counting trick. In the case where I was using actual birth rate statistics for sex (rather than assuming as many girls as boys are born), I'd have to do it this way. The counting trick is really only useful for toy problems and teaching basic probability: you were right to call me out on it.
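A small sketch of the weighted version, where the three unordered cases carry their own probabilities instead of being counted. The 51% boy rate in the second call is a made-up illustrative number, not a real statistic, and the sexes of the two children are assumed independent.

```python
from fractions import Fraction

def p_other_is_girl(p_boy):
    """P(other child is a girl | at least one boy), using attached probabilities."""
    p_girl = 1 - p_boy
    # Unordered families: MF covers both birth orders, hence the factor of 2.
    weights = {"MM": p_boy * p_boy, "MF": 2 * p_boy * p_girl, "FF": p_girl * p_girl}
    del weights["FF"]  # inconsistent with "at least one boy"
    return weights["MF"] / sum(weights.values())

print(p_other_is_girl(Fraction(1, 2)))     # 2/3, matching the counting trick
print(p_other_is_girl(Fraction(51, 100)))  # 98/149 with the hypothetical 51% boys
```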
---
The Tuesday's child problem doesn't drop a case. There are certainly seven cases where, in an MM family, the eldest is a Tue child, and seven where the youngest is, but one of those cases is shared: the case where both are Tue children. That leaves 13 distinct MM cases, not 14.
In terms of the intuition, you can think of it this way: identifying a child by birth-day-of-the-week is almost identifying which child is which, so we're almost at 1/2. But there is one case (both Tue boys) where this information does nothing to distinguish them, so that tiny bit of ambiguity remains. The probability that the other child is a boy can't quite reach the full 1/2; there's still a trace of the 1/3 result that came when the ambiguity was total.
"""Suppose that you already knew that Mr. Smith had two children, and then you meet him on the street with a boy he introduces as his son. In that case, the probability the other child is a son would be 1/2, just as intuition suggests. On the other hand, suppose that you are looking for a male beagle puppy. You want a puppy that has been raised with a sibling for good socialization but you are afraid it will be hard to select just a single puppy from a large litter. So you find a breeder who has exactly two pups and call to confirm that at least one is male. Then the probability that the other is male is 1/3.
In the scenario of Mr. Smith, you’re randomly selecting a child from his two children and then noticing his sex. In the puppy scenario, you’re randomly selecting a two-puppy family with at least one male."""
The first example isn't very good. I see what they're trying to do there, to collapse the ambiguity about which child is the boy. But if you work from the set of possible scenarios just purely from those that match the description, without assuming any methods or probability distributions on the way Mr Smith selected the child to go walking with, you get 1/3.
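The two readings can be put side by side. This is a sketch with my own labels: "filter" keeps every two-child family with at least one boy (the puppy reading), while "random child seen" assumes Mr Smith is equally likely to be walking with either child, which is an extra modelling assumption and not something the description states.

```python
from fractions import Fraction
from itertools import product

families = list(product("MF", repeat=2))  # (older, younger), each with weight 1/4

# Reading 1 (filter): keep every family that has at least one boy.
kept = [f for f in families if "M" in f]
p_filter = Fraction(sum(f == ("M", "M") for f in kept), len(kept))

# Reading 2 (random child seen): one child, picked uniformly, turns out to be a boy.
num = den = Fraction(0)
for f in families:
    for seen in (0, 1):
        if f[seen] == "M":
            weight = Fraction(1, 4) * Fraction(1, 2)
            den += weight
            if f[1 - seen] == "M":
                num += weight
p_seen = num / den

print(p_filter, p_seen)  # 1/3 1/2 (probability the other child is also a boy)
```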
I advise students to always work this through from the combinations of possible scenarios.
The deeper truth of the vos Savant story is that verbal descriptions are ambiguous, and the even deeper truth is that probability calculations are highly volatile on information.
Eventually disagreements come down (among people who can competently do the calculations) to arguments about what such and such a phrase means, or what underlying probability distribution it implies.
Hence in the Monty Hall Problem, there is no advantage to switching if Monty selects the unchosen door to open at random, and it happens to be a goat. That situation is consistent with the description, but perhaps not the 'feel' of the game show setup.
In science, it is wise to be aware of this. If you have probability calculations that are so volatile based on interpretation, then you probably want to avoid trusting the results, no matter how careful your hermeneutic.
"Hence in the Monty Hall Problem, there is no advantage to switching if Monty selects the unchosen door to open at random, and it happens to be a goat. That situation is consistent with the description, but perhaps not the 'feel' of the game show setup."
This isn't quite right. Obviously if there's a chance Monty might open the door with the car, it screws up the gameshow, but it doesn't actually change the odds in the situation that he opened a door with a goat.
The knowledge in Monty's head doesn't change the odds.
To simulate this properly, you'll need to change the problem statement somewhat... you'll either have to throw out cases where he opened the door with the car, or state the problem such that we're only looking at cases where he randomly opened a door with the goat. But you'll still get 2/3.
"But you'll still get 2/3." No, you don't; you get 1/2. If you consider only those cases where Monty reveals a goat (and discard those where he reveals the car), then 50% of the time you've already picked the car.
That said, I'm not sure I completely agree with this solution just throwing out cases where he opens the winning door, without replacement. I feel like at this point we're calculating a different problem where we haven't precisely defined what it is we're measuring, so you could tweak things to get either answer.
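Rather than simulate, the cases can be enumerated exactly. This sketch makes its measurement explicit: the probability that switching wins, given that the opened door showed a goat, with car-revealing cases discarded. The `host_knows` flag is my own way of switching between a host who always opens a goat door and one who opens an unchosen door at random; the car's position and the player's pick are assumed uniform and independent.

```python
from fractions import Fraction
from itertools import product

DOORS = range(3)

def p_win_by_switching(host_knows):
    """Exact P(switching wins | host opened an unchosen door showing a goat)."""
    win = total = Fraction(0)
    for car, pick in product(DOORS, repeat=2):  # each (car, pick) pair has weight 1/9
        unchosen = [d for d in DOORS if d != pick]
        # A knowing host only opens goat doors; a random host opens either unchosen door.
        options = [d for d in unchosen if d != car] if host_knows else unchosen
        for opened in options:
            weight = Fraction(1, 9) / len(options)
            if opened == car:
                continue  # car revealed: discard this case
            total += weight
            remaining = next(d for d in unchosen if d != opened)
            if remaining == car:
                win += weight
    return win / total

print(p_win_by_switching(host_knows=True))   # 2/3
print(p_win_by_switching(host_knows=False))  # 1/2
```

Under this particular definition the random host gives 1/2 and the knowing host gives 2/3; a different choice of what to condition on would be a different problem.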