Hacker News .hnnew | past | comments | ask | show | jobs | submitlogin

The birthday paradox simplified : if you generate n bits of random data, you can at most generate n/2 bits of random numbers before clashes start to occur. That's square root of number's range.

So if you need 1000 random numbers, generate from 1 to 1 million.



> So if you need 1000 random numbers, generate from 1 to 1 million.

If you don't check for clashes, the 50% chance of failure is too much. Probably even 0.1% is too much, so you'd need more elaborate approach.

If you do check for clashes, you can generate from 1 to 2000 with little overhead.


You can also look at the expected number of collisions instead, which is approximately the number of random numbers squared, divided by the size of the space of random numbers.

Then you can choose how many collisions to accept on average. (If the answer is zero, then it makes more sense to look at the probability of one or more collisions.)


I always assumed that intuitively... I think the number is 20 people for the birthday paradox. 20 x 20 = 400, and there are ~365 days in a year. Is that how that works?


The actual number is 23: https://en.wikipedia.org/wiki/Birthday_problem

The square root approximation works well for large numbers, but leaves out some factors that are relevant for small numbers.


I was always surprised the math maths for birthdays. Human birthdays are not random, and cluster around various dates and seasonal patterns.


Here is a statistical analysis of birthdays https://www.zippia.com/advice/most-least-common-birthdays/


Doesn't the clustering make collisions strictly more likely?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: