OT: a little probability problem...

Please check the FAQ (https://www.xyplorer.com/faq.php) before posting a question...
Post Reply
admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

OT: a little probability problem...

Post by admin »

Hey, if some of you happens to be a probability math guru, here's a little task I'm not sure how to handle.

There is large pool of different things, each is unique (= no doublettes). Now you random-choose one of the things and throw it back into the pool. You do this exactly 1000 times and it turns out that you never draw a thing that you have drawn before already. Now the question: How many things at least are likely (chance > 0.5) to be in the pool?

TIA!
Don

wogone
Posts: 3
Joined: 10 Jul 2007 22:36

Post by wogone »

I think there need to be 1443 things in your pool! Carry on reading if you care enough for the explanation... :)


Say you have 3 things, and the first time you pick the red one. Every time you pick another thing, there's a 2 in 3 chance you don't get the red one again. So if you do it a thousand times, the chance of never picking the red one again is (2/3) ^ 1000..

You want the probability to be above 0.5, and there are a thousand picks, so xxx ^ 1000 = 0.5 - or 0.5 ^ (1/1000) = xxx = 0.9993

Turning that probability into a number, we have 1 / (1 - 0.9993) = 1443 things. Tadaa!

Now I've curried some favour, I'll PM you some thoughts on what I'd like adding to your software! :)

jacky
XYwiki Master
Posts: 3106
Joined: 23 Aug 2005 22:25
Location: France
Contact:

Post by jacky »

I'm not sure your reasonning is correct.. all probabilities are ususally quite more complex than it may seem at first.
Besides, using your example, it's not picking up the red thing again but never picking up somehting previously picked. And that must therefore include proba of all previously take.

I tried quickly, but it's been a long time since I did such things and I really forgot a lot about it...

My last state is: (x being the number we're looking for (things in pool))

Code: Select all

(x-1)(x-2)(x-3)...(x-998)(x-999) = 0.5x
I don't remember how to move on from there. But if you can, then x will be, or should be, the number of objects in the pool so that doing as described at first, you have 1 chance on 2 (0.5) to still not take a thing you previously picked before on the 1000th times.
I think.

Maybe.

Who knows?
Proud XYplorer Fanatic

admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

Post by admin »

wogone wrote:I think there need to be 1443 things in your pool!
Hm, intuitively I'd say the number must be much higher than that! Imagine you have the numbers from 1 to 1443 (= 1443 different things). I think it is very improbable that you can random-pick numbers from of those for 1000 times without repetition.

admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

Post by admin »

jacky wrote:My last state is: (x being the number we're looking for (things in pool))

Code: Select all

(x-1)(x-2)(x-3)...(x-998)(x-999) = 0.5x
I think you are on the right track here. The problem is closely related to the so-called "Birthday paradox" ( http://en.wikipedia.org/wiki/Birthday_paradox ). I think.

agnul
Posts: 51
Joined: 20 Jul 2006 09:21
Location: Milan, Italy
Contact:

Post by agnul »

Ah, the bad memories. Anyway,

:arrow: the first draw doesn't count, you dont't care what will come out,
:arrow: the second draw you want something different from the first, so you have a (n - 1)/n chance of getting that,
:arrow: for the third draw you want something different from the first two, so you have a (n - 2)/n chance
...
:arrow: for the 1000th draw you have (x - 999)/n

At last you want the sum of all those terms to be greater than 0.5, so

Code: Select all

(n - 1 + n - 2 + n - 3 + ... + n - 999)    1
--------------------------------------- >= - 
                n                          2

                      n
999n - sum(1, 999) >= - 
                      2

       n    1000 * 999
999n - - >= ----------
       2        2

1998n - n >= (1000 * 999)

      1000 * 999
n >=  ----------
         1997
You want at least 501 items.

(Of course I've made a mistake somewhere).
I love deadlines. I especially love the swooshing sounds they make as they fly by. (Douglas Adams)

admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

Post by admin »

agnul wrote:You want at least 501 items.

(Of course I've made a mistake somewhere).
There must be a mistake because with 501 itemsthe chance to draw 1000 without repetition is obviously zero! :wink:

agnul
Posts: 51
Joined: 20 Jul 2006 09:21
Location: Milan, Italy
Contact:

Post by agnul »

There must be a mistake because with 501 itemsthe chance to draw 1000 without repetition is obviously zero!
Another bad memory for me, now I hate you! :mrgreen:

Note to self: you don't sum probabilities, you multiply them. :oops:
I love deadlines. I especially love the swooshing sounds they make as they fly by. (Douglas Adams)

admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

Post by admin »

jacky wrote:My last state is: (x being the number we're looking for (things in pool))

Code: Select all

(x-1)(x-2)(x-3)...(x-998)(x-999) = 0.5x
Yep, I'm sure now that this is the right way to approach it. I coded a little algo that approaches the result by a trial-and-error method and come to the following result: 720,960 (probability 0.50000028056652).

EDIT: forgot to mention that the formula must look like this:

Code: Select all

(x-1)(x-2)(x-3)...(x-998)(x-999) = 0.5 x^999 
Thanks, case closed.

admin
Site Admin
Posts: 65185
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%
Contact:

Post by admin »

admin wrote:Thanks, case closed.
Over at the alt.sci.math.probability NG finally somebody posted a formula:

Code: Select all

The answer for s > 1 is between s.(s-1)/2/ln(2) and s.s/2/ln(2)
With s = 1000 -> between 720,626 and 721,347. My result (achieved by another way) of 720,960 was correct.

Post Reply