Randomness and distribution

Started by Blizzard, June 14, 2017, 02:06:33 pm


Blizzard

June 14, 2017, 02:06:33 pm Last Edit: June 14, 2017, 02:07:42 pm by Blizzard
Those of you who have worked with C or C++ probably know that on Win32 systems the macro RAND_MAX is only 32,767 (0x7FFF). This causes problems if you want random values from a range larger than that: rand() can only return 32,768 distinct values, so if you stretch them over a bigger range, there are values that you will never get.

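E.g. you want values up to 1,000,000. 1,000,000 / 32,767 ~= 30.52, so once the raw value is stretched over that range, neighboring results end up about 30.52 apart. That means that you can get a 0 or a 31 (or a 30 if you truncate instead of rounding), but never anything between these two numbers.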
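To make the gap concrete, here is a minimal sketch of the naive stretching. The helper name scale_naive and the constant WIN32_RAND_MAX are made up for this illustration, and it truncates instead of rounding, so the second reachable value comes out as 30 rather than 31.

```c
#include <stdint.h>

/* Assumed value of RAND_MAX on Win32, as stated in the post. */
#define WIN32_RAND_MAX 32767

/* Hypothetical helper: stretch a raw rand() result r (0..32767)
   over the range 0..max_value with truncating integer division.
   The 64-bit cast keeps r * max_value from overflowing. */
int32_t scale_naive(int32_t r, int32_t max_value)
{
    return (int32_t)((int64_t)r * max_value / WIN32_RAND_MAX);
}
```

With max_value = 1,000,000 the raw values 0, 1 and 2 map to 0, 30 and 61, so nothing from 1 to 29 or from 31 to 60 can ever come out.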

Now, you could take two numbers and sum them up or multiply them, but that breaks the uniform distribution of numbers and some values will come up more often than others. This is easiest to illustrate with a few small values. Let's take a range from 0 to 2 inclusive. All possible combinations:

0+0, 0+1, 0+2
1+0, 1+1, 1+2
2+0, 2+1, 2+2

If you take a close look at all possible combinations, you will notice that the results are not equally probable. 0 and 4 each appear only once, 1 and 3 each appear twice, and 2 appears 3 times as a result.
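The same count can be done by brute force. This is only a sketch, count_sums is a name made up for the example, and it simply walks through every ordered pair and tallies the sums.

```c
/* Hypothetical helper: count how many ordered pairs (a, b), with a and b
   each in 0..max_part, produce each possible sum.
   counts must have room for 2 * max_part + 1 entries. */
void count_sums(int max_part, int *counts)
{
    for (int s = 0; s <= 2 * max_part; ++s)
    {
        counts[s] = 0;
    }
    for (int a = 0; a <= max_part; ++a)
    {
        for (int b = 0; b <= max_part; ++b)
        {
            ++counts[a + b];
        }
    }
}
```

For max_part = 2 the tallies for the sums 0 to 4 are 1, 2, 3, 2, 1, which matches the table of combinations above.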

The only way to really handle this issue is to combine the values in a way that preserves the uniform distribution. If we simplify the entire thing and say it's only possible to generate 1 bit randomly, we can get the values 0 or 1. Now if we do this procedure N times, we get a number that consists of N bits, and as long as the bits are independent, all numbers generated in such a way are equally likely to appear, so the distribution is preserved. Since rand() already gives us 15 random bits (a maximum of 32,767), we can just generate this value twice and combine the two results by shifting one of them left by 15 bits to get a 30-bit value.
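A minimal sketch of that idea follows. The names combine15 and rand30 are made up for this example and this is not the code from our library. It also assumes that two consecutive rand() calls give 15 independent, uniformly distributed bits each, which depends on the runtime's generator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: put one 15-bit value into bits 15..29 and another
   into bits 0..14. The masks drop any extra bits on platforms where
   RAND_MAX is larger than 0x7FFF. */
int32_t combine15(int32_t high, int32_t low)
{
    return ((high & 0x7FFF) << 15) | (low & 0x7FFF);
}

/* Hypothetical helper: a value in 0..0x3FFFFFFF built from two rand() calls.
   The calls are separate statements so their order is well defined. */
int32_t rand30(void)
{
    int32_t high = rand();
    int32_t low = rand();
    return combine15(high, low);
}
```

Every 30-bit pattern corresponds to exactly one pair of 15-bit inputs, which is why no value is favored the way it is when summing.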

I implemented this in our foundation library. You can take a look here at lines 46 and 47 (or just find "#define HRAND()" if the code has changed since this post was made): https://github.com/AprilAndFriends/hltypes/blob/master/src/hltypesUtil.cpp
I cast to int64_t because the 30-bit value then gets multiplied and divided with another 32-bit int, and the intermediate result would not fit into 32 bits otherwise. Incidentally, I think this is the same reason why Microsoft limited RAND_MAX to 32,767 in the first place.
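To show why the 64-bit cast matters, here is a rough sketch of scaling a 30-bit value into a smaller range. scale30 is a name made up for this example and it is not the HRAND macro from the linked file, just the same kind of arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: map r30 (0..0x3FFFFFFF) onto 0..range-1.
   r30 * range can need up to about 61 bits, so the multiplication
   is done in int64_t before dividing by 2^30. */
int32_t scale30(int32_t r30, int32_t range)
{
    return (int32_t)((int64_t)r30 * range / ((int64_t)1 << 30));
}
```

For example, scale30(0x3FFFFFFF, 1000000) gives 999,999, while the same multiplication done in 32-bit ints would overflow. If range does not divide 2^30 evenly, some results are still very slightly more likely than others, but the gaps described at the top are gone.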