values represented in a double variable in GML

I need a second opinion about something I read....

I started getting into the idea of how many random numbers can be generated between 0 and 1 in a double variable in GML, so I googled it and found this question on stackoverflow.com...

stackoverflow.com double numbers

...one of the people who responded to the OP states (scroll down in the linked Stack Overflow question):

...a random number generator generating numbers between 0.0 and 1.0 cannot in general produce all these numbers; typically it'll only produce numbers of the form n/2^53 with n an integer (see e.g. the Java documentation for nextDouble). So there are usually only around 2^53 (+/-1, depending on which endpoints are included) possible values for the random() output. This means that most doubles in [0.0, 1.0] will never be generated.

So in GML you can assign a double variable a number that cannot be randomly generated, but you can't prove that it is one of those numbers that cannot be generated between 0 and 1, since you will never see it produced by the random number generator.

So the idea I'm understanding is: if you're randomly generating numbers between 0 and 1, you will always get the same set of values no matter how many times you run the program (including with the use of the randomize function) in GML...

Right???



Is this behavior true in GML even if you use the randomize function?
 

FoxyOfJungle

Kazan Games
From what I tested, no. (Sorry if I got your question wrong.)

It will not always be an exact sequence like 0101010101... or 00110011000 followed by 00110011000 again.
And it will never be the same sequence; you can save the results to a file each time you open the game and compare them.
You can do a loop and save everything to a file to compare. See this:

GML:
var _filename = "binarynumbers_";
var _extension = ".txt";

// Find the first unused file name: binarynumbers_0.txt, binarynumbers_1.txt, ...
var _n = -1;
do
{
    _n += 1;
}
until (!file_exists(_filename + string(_n) + _extension));

var _f = file_text_open_write(_filename + string(_n) + _extension);
repeat (10)
{
    var _random_bin = choose(0, 1);
    file_text_write_string(_f, string(_random_bin)); // choose() returns a real, so convert it
    file_text_writeln(_f);
}
file_text_close(_f);

Result: (Opened the game 4 times)



You can also increase the value of repeat() and see how it behaves.
 
There will always be precision problems when dealing with finite representations of real values (e.g. floating-point numbers in computers), but depending on what you want, you can circumvent them rather easily. You can always generate random integers that represent your number times a given factor: for example, if you take the range 0 to 1 as a percentage, you can instead generate integers from 0 to 100. If you want more precision, generate up to 1000 or 10000. Of course this will not work in every case, as it depends on exactly how you want to use these numbers.
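A quick sketch of this integer approach in Python (used here purely for illustration, since Python floats are also IEEE 754 doubles; the function name `random_percent` is made up):

```python
import random

# Instead of drawing a float directly, draw an integer 0..steps and
# divide; every possible output is an exact multiple of 1/steps.
def random_percent(steps=100):
    """Return a random multiple of 1/steps in [0, 1]."""
    return random.randint(0, steps) / steps

value = random_percent(10000)  # granularity of 0.0001
assert 0.0 <= value <= 1.0
```

Raising `steps` trades a coarser grid for a finer one, which is exactly the "generate until 1000 or 10000" idea above.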
 

GMWolf

aka fel666
It all depends on the random number generator.

If we assume that it first generates an integer between [0, INT_MAX), and then divides 1.0 by it, then you will only get a subset of all possible doubles (in increments of 1/INT_MAX).

I'm somewhat confident that 1/INT_MAX is smaller than a double's precision (Assuming a 64 bit int). So you can pretty much assume any possible double can come out of the random function.

The general rule not to compare floating-point values with == still applies, though. Don't check whether two doubles are equal; instead, check whether they fall within a range of each other.


Aside:
When dealing with known ranges (0 to 1, for example), it's often better to normalize that range and represent it as an integer (0 to INT_MAX). This gives you more precision. Although I don't know whether this translates to any real gains in GML, where you don't have much control over types.
 

jo-thijs

Member
I started getting into the idea of how many random numbers can be generated between 0 and 1 in a double variable in GML, so I googled it and found this question on stackoverflow.com...
Assuming that:
  • we're talking about double precision floating point values (in general, not tied to GML)
  • we want to generate values greater than or equal to 0, but strictly less than 1
  • we don't want to generate negative 0
  • we want every distinct generatable value to have equal odds of being generated
  • we want for any numbers A, B and D such that A >= 0, B >= 0, A+D < 1 and B+D < 1, that the odds of generating a value greater than or equal to A but less than A+D equal the odds of generating a value greater than or equal to B but less than B+D (in short, we want the generated numbers to approximate a uniform distribution of real numbers in the range [0,1[)
  • we want the set of generatable values to be as large as possible while following the above rules
The number of generatable values is exactly 2^53, where each value has the form "N / 2^53" with N >= 0 and N < 2^53.
This is as the stackoverflow response also indicates.
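This is easy to check empirically. CPython's `random.random()` happens to work exactly this way (its Mersenne Twister implementation produces 53-bit multiples of 2**-53), and Python floats are IEEE 754 doubles, so a quick sketch:

```python
import random

# random.random() returns n * 2**-53 for some integer 0 <= n < 2**53
# (53 bits: the significand width of a double).
x = random.random()
n = x * 2**53          # exact: n < 2**53 fits in the significand
assert n == int(n)     # no fractional part, so x is a multiple of 2**-53
assert 0.0 <= x < 1.0
print(f"{x} = {int(n)} / 2**53")
```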

So in GML you can assign a double variable a number that cannot be randomly generated, but you can't prove that it is one of those numbers that cannot be generated between 0 and 1, since you will never see it produced by the random number generator.
If you know the algorithm (or even part of the algorithm) that produces the pseudo-random numbers, then you can prove it.
If you assume the randomly generated numbers are as described by the Stack Overflow reply, then you can also very easily determine whether a value is generatable by the pseudo-random generator:
If the value is NaN or negative zero, it isn't generatable.
If the value is 1 or greater, it isn't generatable.
If the value is less than 0, it isn't generatable.
If the value multiplied by 2^53 has a fractional part, it isn't generatable.
Otherwise, the value is generatable.
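The rules above can be turned into a small checker. A sketch in Python (floats there are IEEE 754 doubles; the function `generatable` is hypothetical, not part of any library):

```python
import math

def generatable(x):
    """Check whether the double x could be produced by an
    n/2**53-style generator, per the rules above."""
    if math.isnan(x):
        return False
    if math.copysign(1.0, x) < 0:   # catches negatives and negative zero
        return False
    if x >= 1.0:
        return False
    scaled = x * 2**53
    return scaled == int(scaled)    # no fractional part

assert generatable(0.5)
assert not generatable(-0.0)
assert not generatable(1.0)
assert not generatable(2**-60)      # finer than the 2**-53 grid near 0
```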

So the idea I'm understanding is: if you're randomly generating numbers between 0 and 1, you will always get the same set of values no matter how many times you run the program (including with the use of the randomize function) in GML...
It's not clear to me what you mean with this.
Do you mean that there is a finite set of generatable numbers that contains all the values that can be generated by a specific pseudo random number generator algorithm?
Well, then of course.
That's already a given by the fact that a computer only has a finite amount of memory.
However, we're speaking about 2^53 different values.
That's 9,007,199,254,740,992 different possible combinations.
If you can generate a pseudo random number every nanosecond (which would be pretty fast) and you keep generating numbers one immediately after the other and you keep producing distinct values,
then it would still take you more than 104 days to generate all those values.
I wouldn't exactly call that generating the same values every time again.

If we assume that it first generates an integer between [0, INT_MAX), and then divides 1.0 by it, then you will only get a subset of all possible doubles (in increments of 1/INT_MAX).
I think you meant "and then divide it by INT_MAX", because you wouldn't get increments of 1/INT_MAX otherwise.

I'm somewhat confident that 1/INT_MAX is smaller than a double's precision (Assuming a 64 bit int).
This statement is poorly defined.
If you mean that 1/INT_MAX is smaller than the difference between 0.5 and the smallest larger double precision floating point value, then yes, it is.
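This can be verified directly, e.g. in Python with `math.ulp`, which gives the spacing from a double to the next larger one:

```python
import math

INT64_MAX = 2**63 - 1

# Adjacent doubles just above 0.5 are spaced 2**-53 apart.
gap_at_half = math.ulp(0.5)
assert gap_at_half == 2**-53

# 1/INT_MAX for a 64-bit int is roughly 1.08e-19, far below that gap.
assert 1 / INT64_MAX < gap_at_half
```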

So you can pretty much assume any possible double can come out of the random function.
If you mean every possible double precision floating point value between 0 and 1, including 0 and 1, then no.
If you get close enough to 0, you get more absolute precision, allowing you to represent values that do not get generated by the function.

The general rule to not compare == floating points still apply though. Don't check if two doubles are equal. Instead, check if they fall within a range.
In GameMaker, comparing using == checks within a range by default (unless GM:S 2 has changed that much since I last used it).
Whether exact comparison is preferable depends on what you want to do, but in game dev, fuzzy comparisons tend to be preferable when working with floating-point precision.
I'm not sure why it is relevant in this discussion, however.

Aside:
When dealing with known ranges (0 to 1, for example), it's often better to normalize that range and represent it as an integer (0 to INT_MAX). This gives you more precision. Although I don't know whether this translates to any real gains in GML, where you don't have much control over types.
Transforming generated numbers can only make you lose precision.
Having larger numbers doesn't give you more precision when working with floating point precision.
If using a system like the Stack Overflow reply indicates, then normalizing the value to be in the range (0 - INT_MAX) wouldn't change anything in the precision or bit representation of the value, except that the exponent is a bit larger.

EDIT:
I made a mistake in my assumptions list.
A and B are generatable numbers.
Otherwise, the assumptions would not be satisfiable.
 

GMWolf

aka fel666
This statement is poorly defined.
If you mean that 1/INT_MAX is smaller than the difference between 0.5 and the smallest larger double precision floating point value, then yes, it is.
Yeah, I'm thinking of effective precision: how precise doubles are across the whole range 0-1.
Yes, smaller numbers have more precision. But if you have a variable anywhere between 0 and 1, you can only rely on it having so much precision.
If you mean every possible double precision floating point value between 0 and 1, including 0 and 1, then no.
If you get close enough to 0, you get more absolute precision, allowing you to represent values that do not get generated by the function.
Yeah, again I'm thinking about the actual precision you can count on.
The way random numbers are generated is uniformly distributed anyway.
If you generated the full range of doubles, you wouldn't have a uniform distribution anymore. (Unless you had a shaping function to make sure smaller numbers are less likely than large numbers? You'd need higher precision for that, I think.)
Transforming generated numbers can only make you lose precision.
Having larger numbers doesn't give you more precision when working with floating point precision.
If using a system like the Stack Overflow reply indicates, then normalizing the value to be in the range (0 - INT_MAX) wouldn't change anything in the precision or bit representation of the value, except that the exponent is a bit larger.
I mean generate an int and keep it as an int the whole way through.
If you are going to cast it to a float, I guess that is indeed not super useful.
But if you can keep it as an int, then that's where I'm coming from.



In GameMaker, comparing using == checks within a range by default (unless GM:S 2 has changed that much since I last used it).
Whether exact comparison is preferable depends on what you want to do, but in game dev, fuzzy comparisons tend to be preferable when working with floating-point precision.
I'm not sure why it is relevant in this discussion, however.
Yeah, if you are checking with an epsilon, then this whole thing is a little meaningless.
It's just that you usually only care about the exact values of floats when you compare them with other numbers.



@Lord KJWilliams
The way you generate random numbers in GM is uniformly distributed.
So when generating numbers between 0 and 2, numbers between 0 and 1 are equally as likely as numbers between 1 and 2.
And it's fairly safe to assume that numbers generated between 0 and 1 will have the same spacing as numbers generated between 1 and 2.
The caveat is that you will get more aliasing as your numbers get larger, but you will generate those aliased numbers more often (so it evens out).
In terms of statistics, you shouldn't need to worry about the behaviour of randomly generated numbers.
But as usual, treat them as floating-point numbers: you can never make assumptions about their exact value or representation.
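The "more aliasing as your numbers get larger" point can be seen directly from the spacing of doubles, e.g. in Python (whose floats are IEEE 754 doubles):

```python
import math

# Doubles in [1, 2) are spaced 2**-52 apart; in [0.5, 1) the spacing
# halves to 2**-53, and it keeps shrinking toward 0.
assert math.ulp(1.5) == 2**-52
assert math.ulp(0.75) == 2**-53
assert math.ulp(1.5) == 2 * math.ulp(0.75)
```

So a fixed-step generator output lands exactly on every representable double near 1 but skips ever more of them near 0, which is the aliasing trade-off described above.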
 
It's not clear to me what you mean with this.
Do you mean that there is a finite set of generatable numbers that contains all the values that can be generated by a specific pseudo random number generator algorithm?
Well, then of course.
That's already a given by the fact that a computer only has a finite amount of memory.
However, we're speaking about 2^53 different values.
That's 9,007,199,254,740,992 different possible combinations.
If you can generate a pseudo random number every nanosecond (which would be pretty fast) and you keep generating numbers one immediately after the other and you keep producing distinct values,
then it would still take you more than 104 days to generate all those values.
I wouldn't exactly call that generating the same values every time again.
That would involve having the computer record every number created to a list, and checking the list every time it creates a new number before adding it, from the first number onward. But let's say we're not creating a list of all of those distinct values. If we're not using the randomize function (which I assume uses the computer's clock for the seed), then the computer will assign the same value to the seed every time. The same numbers between 0.0 and 1 would come up every time I run that program, because of the seed.
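That last point is how PRNG seeding works in general, and can be sketched in Python (using its random module as a stand-in for GML's random/randomize):

```python
import random

# With a fixed seed, a PRNG yields the identical sequence on every run;
# seeding from the clock (roughly what randomize() does) varies it.
random.seed(12345)
first_run = [random.random() for _ in range(5)]

random.seed(12345)
second_run = [random.random() for _ in range(5)]

assert first_run == second_run  # same seed, same sequence
```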
 