Task

Create a list of n numbers that are randomly chosen among the whole numbers from 1 to n. Guess how many *different* numbers are on your list. Check your guess. Repeat this action several times for different values of n. (Use as big an n as your calculator can handle without overflowing its memory. You should be able to handle lists up to 500 items.) Can you give a rule for making a “good guess”? Can such a rule be theoretically justified?

Plan of action and its execution

1. How to generate a list L1 of (for example) 200 numbers between 1 and 200?

randInt(1,200,200)→L1 ENTER

2. How to count the number of different items in L1?

SortA(L1):sum(0≠ΔList(L1))+1 ENTER

Remark

To understand this formula, create a short list L1, so you can see how many different numbers are on it. Then run your program in “piecemeal” fashion, looking at what happens during each step.

L1 | ENTER |

SortA(L1) | ENTER |

ΔList(L1) | ENTER |

0 ≠ΔList(L1) | ENTER |

sum(0≠ΔList(L1))+1 | ENTER |

3. Gather some data (an example)

N the length of the list of numbers chosen randomly between 1 and N;

D the number of different items on the list;

P = D/N presented as a percentage.

N: D: P:

100 62 62%

100 63 63%

200 131 66%

200 124 62%

… … …

Are you ready to make a prediction?

4. The theory behind these results

(a) The probability of missing a specific number, when you choose randomly one out of the numbers between 1 and n, is (n – 1)/n = 1 – 1/n.

(b) The probability of missing a specific number when you repeat such a choice n times is

(1 – 1/n)^{n}.

(c) For reasonably big n, (1 – 1/n)^{n} ≈ e^{-1} ≈ .3678794412

(d) Thus the *expected value* of the numbers missing from the list is approximately e^{-1}*n,

and the *expected value* of the numbers occurring on the list is

n – e^{-1}*n = (1 – e^{-1})*n ≈ .63*n.

So the ratio of D/N is often close to 63%, and all predictions slightly above 60% are reasonable.

****************************************************************************

Here is an example from the list in the screenshot above:

131/200 = .655

Lesson Index