RE: MEDSTATS: Re: Probability puzzle

BXC (Bendix Carstensen)

Apr 29, 2005, 4:06:01 AM

to MedS...@googlegroups.com

Tracy,

Be careful about what probability you are talking about.

Your original message was:

> Basically I have 2 populations consisting of 100 subjects
> in each. I know that every subject in population 1 is directly
> related to 1 subject in population 2, for example subject 1 in
> population 1 is related to subject 1 in population 2 etc. How do I
> calculate the probability of matching everyone in population 1 to
> someone who is NOT related in population 2?

This probability is approximately 1/e = 0.3679.
I think Ted computed the complementary probability, that at least one
subject had a correct match.

The brute force way of working this out is by firing up R and saying:
> mm <- numeric( 100000 )
> # mm[i] is 1 when a random permutation of 1:100 leaves nobody in place
> for( i in 1:100000 ) mm[i] <- sum( sample( 1:100, 100 ) == 1:100 ) == 0
> mean( mm )
[1] 0.36761

Bendix

> -----Original Message-----
> From: Tracy Clegg [mailto:tracy...@ucd.ie]
> Sent: Thursday, April 28, 2005 10:12 AM
> To: MedS...@googlegroups.com
> Subject: MEDSTATS: Re: Probability puzzle
>
>
>
> Hi Ted,
>
> Sorry for not replying before now - I had gone home by then.
> The subjects in population 1 are only related to exactly one
> subject in population 2 - so from your response and Euler's
> paper I take it the answer is 0.63 - which is a lot higher
> than I thought it would be, but as you say I'm also easily
> surprised with the "birthday problem". Once again many
> thanks for all your time and help.
>
> Tracy
>
>
>
> -----Original Message-----
> From: Ted Harding [mailto:Ted.H...@nessie.mcc.ac.uk]
> Sent: 27 April 2005 18:05
> To: MedS...@googlegroups.com
> Subject: MEDSTATS: Re: Probability puzzle
>
>
>
> On 27-Apr-05 Tracy Clegg wrote:
> > Thanks, Ted, Bendix, Leonardo and Emma for all your help.
> >
> > I think I finally have an answer based on Ted's first example, and
> > Bendix's response. It was certainly not as simple as I thought it
> > might be and I'm glad I consulted the experts! The paper Bendix
> > referred to is a fascinating read - thank you all for your time and
> > help.
> >
> > Tracy
>
> Thanks, Tracy. You still have not confirmed the point I
> raised in my response: Is it the case in your 'real life
> medical analysis problem' that each person in Population 1 is
> related to 1 and only 1 person in Population 2? (See below).
>
> This is critical to the validity of the answer, and if it is
> not the case then we can offer more help. I feel slightly
> concerned ...
>
> > On the assumption that each subject in Pop1 is related to *exactly*
> > one subject in Pop2 *and*no*more*than*one* (e.g. S1 in P1 is
> > related to S1 in P2 and no-one else, and vice versa; and so on for
> > all), then this is the classic "letters in envelopes" problem
> > (Secretary 1 types the letters, Secretary 2 types the addresses on
> > the envelopes, and Secretary 3 puts letters in envelopes without
> > checking addresses). It is more generally called "The Matching
> > Problem". [...]
> > However, if it is not the case that the relatedness is strictly
> > one-to-one (i.e. at least 1 person in Pop1 is related to 2 or
> > more in Pop2, or vice versa) then the answer is much less
> > straightforward and indeed is not defined unless one knows
> > who is related to whom throughout the entire set of subjects.
> > It is not clear from your statement whether it is strictly
> > one-to-one or whether one-to-more-than-one is allowed.
> > If the latter is the case then please get back to us with more
> > details!
>
> Best wishes,
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.H...@nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 27-Apr-05 Time: 16:13:34
> ------------------------------ XFMail ------------------------------
>
>

Ted Harding

Apr 29, 2005, 5:04:58 AM

to MedS...@googlegroups.com

On 29-Apr-05 BXC (Bendix Carstensen) wrote:
>
> Tracy,
>
> Be careful about what probability you are talking about.
>
> Your original message was:
>
>> Basically I have 2 populations consisting of 100 subjects
>> in each. I know that every subject in population 1 is directly
>> related to 1 subject in population 2, for example subject 1 in
>> population 1 is related to subject 1 in population 2 etc. How do I
>> calculate the probability of matching everyone in population 1 to
>> someone who is NOT related in population 2?
>
> This probability is 1/e=0.3679.
> I think Ted computed the complementary probability that at least one
> had a correct match.
>
> The brute force way of working this out is by firing up R and saying:
>> mm <- numeric( 100000 )
>> for( i in 1:100000 ) mm[i] <- sum( sample( 1:100, 100 ) == 1:100 ) == 0
>> mean( mm )
> [1] 0.36761
>
> Bendix

Bendix is correct! My error arose at the stage of:

> Thus the result is
>
> 1 - 1/2! + 1/3! - 1/4! + ... + ((-1)^(n-1))/n! [***]
>
> which is the truncation to (n+1) terms of a formula for
> the value of 1/e. Its error is smaller than the last term
> retained.
>
> What we have is the probability P(E1 or E2 or E3 or ... or En)
> of at least one match. So the probability of no matches is
> 1 - this = 1 - 1/e = (e-1)/e.

whereas in fact (I was getting a bit hasty here -- too much typing
of mathematics in "plain text" -- see "Attachments" thread!) the
series for 1/e is

1 - 1 + 1/2! - 1/3! + 1/4! - ...

so in fact the result at [***] above is

"the truncation to (n+1) terms of a formula for the value of" (1 - 1/e),

so

"the probability of no matches is 1 - this = " 1 - (1 - 1/e) = 1/e,

as Bendix found.

Apologies!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.H...@nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861

Date: 29-Apr-05 Time: 10:04:10
------------------------------ XFMail ------------------------------
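
The corrected argument can be written out compactly. The following is a sketch of the standard inclusion-exclusion derivation, assuming strictly one-to-one relatedness and a uniformly random matching. E_i denotes the event that subject i is matched with his or her own relative (the E1, ..., En above); the index k and the subset notation are introduced here for illustration.

```latex
% Any k specified subjects are all correctly matched with probability (n-k)!/n!,
% and there are \binom{n}{k} ways to choose them, so each level contributes 1/k!.
\begin{align*}
P(E_1 \cup \dots \cup E_n)
  &= \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} \frac{(n-k)!}{n!}
   = \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k!}
   = 1 - \frac{1}{2!} + \frac{1}{3!} - \dots + \frac{(-1)^{n-1}}{n!} \\
  &= 1 - \sum_{k=0}^{n} \frac{(-1)^{k}}{k!}
   \approx 1 - \frac{1}{e} \approx 0.6321, \\[4pt]
P(\text{no matches})
  &= 1 - P(E_1 \cup \dots \cup E_n)
   = \sum_{k=0}^{n} \frac{(-1)^{k}}{k!}
   \approx \frac{1}{e} \approx 0.3679 .
\end{align*}
```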

Tracy Clegg

Apr 29, 2005, 5:15:35 AM

to MedS...@googlegroups.com

Bendix,

Thanks very much for that. I am looking to calculate the probability of NOT
finding a match. The one thing that has confused me is that when I tested
the series that you gave, i.e. 1 - 1/2! + 1/3! - 1/4! + 1/5! - 1/6!,
I got 0.632. In your mail you said this approximated 1/e, which is 0.368.
This confused me a little (which is easily done!) as to which was the
complementary one - until Ted's mail - or so I thought.

So just to confirm: the probability of finding NO matches should be 1/e,
which is 0.37?

Thanks for your help

Tracy
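
The two numbers in this message can be checked directly. The sketch below evaluates the partial sums in R; the function names p_at_least_one and p_no_match are hypothetical helpers introduced here, not part of the original exchange. It assumes the strictly one-to-one setting discussed in the thread.

```r
# Partial sum 1 - 1/2! + 1/3! - ... + (-1)^(n-1)/n!,
# i.e. the probability of at least one correct match among n subjects.
p_at_least_one <- function(n) {
  k <- 1:n
  sum((-1)^(k - 1) / factorial(k))
}

# Complement: the probability that nobody is matched to their own relative.
p_no_match <- function(n) 1 - p_at_least_one(n)

p_at_least_one(6)   # the six terms quoted above, roughly 0.632
p_no_match(6)       # roughly 0.368
p_no_match(100)     # the 100-subject case
exp(-1)             # 1/e, for comparison
```

The six-term sum is the probability of at least one match, so 0.632 and 0.368 are complements of each other, and only the second is close to 1/e.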

BXC (Bendix Carstensen)

Apr 29, 2005, 5:37:26 AM

to MedS...@googlegroups.com

> -----Original Message-----
> From: Tracy Clegg [mailto:tracy...@ucd.ie]
> Sent: Friday, April 29, 2005 11:16 AM
> To: MedS...@googlegroups.com
> Subject: MEDSTATS: Re: Probability puzzle

[snip]

> So just to confirm the probability of finding NO matches
> should be 1/e which is 0.37?

Yes, as the computing example in my mail also suggests.
100,000 simulations can't be wrong, but if you are not familiar with R
it's just mumbo-jumbo, of course.

Bendix
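
For readers less familiar with R, the simulation idea can be spelled out step by step. This is an illustrative rewrite under the same one-to-one assumption, not the code from the original mail; the names n_sims, n, perm and no_match and the seed value are assumptions introduced here.

```r
set.seed(1)        # arbitrary seed, only so that reruns give the same number
n_sims <- 100000   # number of random matchings to try
n      <- 100      # subjects in each population

# sample(n) returns 1..n in random order: subject i in population 1
# is paired with subject perm[i] in population 2.
# perm == 1:n is TRUE wherever someone is paired with their own relative,
# so !any(...) is TRUE when no such pairing occurs.
no_match <- replicate(n_sims, {
  perm <- sample(n)
  !any(perm == 1:n)
})

mean(no_match)     # proportion of simulated matchings with no related pair
```

The proportion printed at the end should land close to 1/e, with a simulation standard error of about 0.0015 for 100,000 runs.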

[snip]