I need a second opinion about evaluating floating point values

In 2012 before I came to GMS, I was programming in C and I had problem with evaluating floating point values. So I researched the problem by looking at other peoples postings in another website called stackoverflow.com and then posted a question to a newsgroup called alt.comp.lang.c ( see https://groups.google.com/forum/?hl=en#!topic/alt.comp.lang.c/iXVHvnhM714 ). Please beware of my grammar problems when I wrote this ( I am aware of them ).

To quote what someone told me ( see the link to see what my original post is - note its uses C terminology) , this is a truncated copy :

Barry Schwarz

9/24/12
Most simple decimal fractions cannot be expressed exactly in the
floating point representation used by most computers. In the
mathematical world, 23.0010 - 23 will always equal 0.0010 exactly. In
the world of computers, as exemplified by my system, the difference is
0..0010000000000012. Needless to say, this won't compare equal to
0.0010.

Since comparing for exact equality is not productive, there are a
couple of options:

If you are only interested in a fixed number of decimal places
(such as four as implied by your example), you could multiply all the
values by the appropriate power of 10 (10,000 for four), and use only
the integer portion of the products. So instead of comparing y to
0.0010 you would compare (int)(y*10000) to 10.

The more general solution is to check, not for equality, but for
"closeness". Instead of y==0.0010, try abs(y-0.0010)<1e-9. That
epsilon value (1e-9 in this case) actually needs to be scaled based on
the magnitude of the comparands. If you were comparing numbers in the
range of c (the speed of light in cm/sec), you might use an epsilon of
1e3 instead. The scaling is also affected by the number of
significant digits in your values which in turn is also impacted by
your C compiler and run-time library.

....
....
Scientific notation, and every other form of visible representation
such as decimal, binary, hexadecimal, octal, etc, apply only to how
people see and enter values. It has nothing to do with how the value
is stored or manipulated in the computer. All floating point values
are rounded to (technically approximated by) the closest value that
can be stored exactly. 1e-3 and 0.0010 are the exact same value as
far as your system is concerned. And neither is stored in the
computer as exactly that value. (I assume your 10e-3 was just a "slip
of the tongue".) So yes, y==1e-3 would evaluate to false exactly for
the same reason that y==0.0010 does.
actually the value I meant was 1(10e-3) which means 10**-3 ( note : " ** " means raise to the power of ) - but whatever....

Would I have the same problem in GML for evaluating exact values that are floating point values, as I did in C programming, as the guy stated in the response ? Would I have to resort to the same tactics in used C programming, but instead in GML programming - for evaluation of floating point values ? Is he correct in his response?

Thanks in Advance
 

Simon Gust

Member
Yes, you're going to run into problems, since GM's default type are doubles.
I can only advise you to just code in a way that doesn't require you to compare to direct values.
I can't think of situations either.

What we have in our disposal ist macros and enums. They can be cast to ints I believe.
Code:
enum item
{
  none = 0,
  sword = 1,
  axe = 2,
  // ...
}
Any number greater than 32 bits, will be cast into an int64.
You can also temporarily cast a number by using int64() as a function.

Do you have an example of a situation where you run into problems?
 

jo-thijs

Member
In 2012 before I came to GMS, I was programming in C and I had problem with evaluating floating point values. So I researched the problem by looking at other peoples postings in another website called stackoverflow.com and then posted a question to a newsgroup called alt.comp.lang.c ( see https://groups.google.com/forum/?hl=en#!topic/alt.comp.lang.c/iXVHvnhM714 ). Please beware of my grammar problems when I wrote this ( I am aware of them ).

To quote what someone told me ( see the link to see what my original post is - note its uses C terminology) , this is a truncated copy :



actually the value I meant was 1(10e-3) which means 10**-3 ( note : " ** " means raise to the power of ) - but whatever....

Would I have the same problem in GML for evaluating exact values that are floating point values, as I did in C programming, as the guy stated in the response ? Would I have to resort to the same tactics in used C programming, but instead in GML programming - for evaluation of floating point values ? Is he correct in his response?

Thanks in Advance
1(10e-3) does not mean 10**-3, it means 10 * 10**-3, which is 10**-2.
The notation AeB indicates the number A * 10 ** B.
Writing AB is often used as a shorthand notation for A * B, so you actually wrote 1 * (10 * 10 ** -3).

Now, as Simon Gust already said, GameMaker uses the same floating point numbers as C,
so the same phenomena occur in GameMaker as well.
However, GameMaker uses a different comparison method (for ==, <=, >=, < and >) from C.
GameMaker uses fuzzy comparison (which is the more general "closeness" solution mentioned in your quote).
You can set the value for the epsilon using math_set_epsilon and you can get this value using math_get_epsilon.

And yes, the response you got is correct.
 

YellowAfterlife

ᴏɴʟɪɴᴇ ᴍᴜʟᴛɪᴘʟᴀʏᴇʀ
Forum Staff
Moderator
Would I have the same problem in GML for evaluating exact values that are floating point values, as I did in C programming, as the guy stated in the response ?
GML has epsilon support built in, which has it that if the two values are sufficiently close (you can change the threshold), they are considered equal. This way, while the values themselves will still have floating point errors (and you can verify this via string_format), the logic will remain largely unaffected, as all functions can disregard errors within a reasonable margin.
 
GML has epsilon support built in, which has it that if the two values are sufficiently close (you can change the threshold), they are considered equal. This way, while the values themselves will still have floating point errors (and you can verify this via string_format), the logic will remain largely unaffected, as all functions can disregard errors within a reasonable margin.
Well that sounds like the solution that I did for my evaluation problem, was to take the floating point value and turn it into a string via sprintf(), and then compare the two, by using a string literal in strcmp() for C programming. For instance, is if you have a floating point value called cash ( which equals 2.13 ) you would write strcmp(cash,"2.13"); . It works - but its like getting away with murder to the community of programmers on the C or C++ programming newsgroups that I used to post my questions on. They will tell you how it SHOULD be done, like that one guy did in my post that I shared here. Ive never met a group of people before, who were so stern and anal - to go in depth in a feedback, about my programming methods needing to adher to the ISO / IEC / ANSI standards of C or C++ , in that community of programmers. But its was the best help I that I found that guided me in my learning process.
 

jo-thijs

Member
Well that sounds like the solution that I did for my evaluation problem, was to take the floating point value and turn it into a string via sprintf(), and then compare the two, by using a string literal in strcmp() for C programming. For instance, is if you have a floating point value called cash ( which equals 2.13 ) you would write strcmp(cash,"2.13"); .
I think you misinterpreted what YellowAterlife said.

What they said was:

1) GameMaker already applies the "closeness" trick for you.
If you do this in GameMaker:
Code:
var X = 23;
var Y = 23.0010;

if (X == Y) {
    ...
}
it would corespond to this in C:
Code:
double X = 23.;
double Y = 23.0010;

if ( fabs(X - Y) <= eps ) {
    ...
}
with "fabs" defined in the math.h library
and "eps" is the "closeness", which is 0.00001 by default in GameMaker.

2) Values in GameMaker have rounding errors as well.
You can observe this fact by using string_format as follows:
Code:
var X = 23;
var Y = 23.0010;
show_message("X: " + string_format(Y - X, 20, 16) + "  Y: " + string_format(0.0010, 20, 16));
You'll see 2 different numbers, even though with infinite precision, the numbers should have been the same.

YellowAfterlife was not saying to use string_format to compare values with each other.

It works - but its like getting away with murder to the community of programmers on the C or C++ programming newsgroups that I used to post my questions on. They will tell you how it SHOULD be done, like that one guy did in my post that I shared here. Ive never met a group of people before, who were so stern and anal - to go in depth in a feedback, about my programming methods needing to adher to the ISO / IEC / ANSI standards of C or C++ , in that community of programmers. But its was the best help I that I found that guided me in my learning process.
I just read through the entire discussion you linked to.
I actually didn't notice any stern or anal replies.
I also haven't seen any replies saying your programming method needs to adher to a specific standard.
You may have been talking about other discussions you didn't link to as well, in which case you might be right.

You sound like you're still a bit confused as to how floating point numbers work
and why your code using sprintf and strcmp doesn't generally work.
If so, feel free to ask for an explenation.
 
I think you misinterpreted what YellowAterlife said.

I just read through the entire discussion you linked to.
I actually didn't notice any stern or anal replies.
I also haven't seen any replies saying your programming method needs to adher to a specific standard.
You may have been talking about other discussions you didn't link to as well, in which case you might be right.

You sound like you're still a bit confused as to how floating point numbers work
and why your code using sprintf and strcmp doesn't generally work.
If so, feel free to ask for an explenation.
Yes I am confused as to how floating point numbers work, it was never made clear to me that floating point values are not stored exactly as we read them or think of them in math. That was the reality check that stopped me in my tracks.

This also might not be related...

Becuase I program in C ( as to avoid C++, which there is a reason - related to this topic ), Ive notice that GML does not use something called streams that are used in C or C++. What does GML use for the standard output , input, and error streams ?

And one of the reasons that I do not program in C++ is because you have to do more work with printing floating point values in a format you want, than in C when your using the cout stream. In C, the convenience is using printf with the % specifiers, which lets you determine how many digits are displayed and the format of the digits with decimal that you want. But in C - the same stream is called stdout.

Secondly, in regards to C++ I didn't understand how to build a dynamically linked node list. Its something I will have to relearn , again. I was going to try building a dynamically linked node it in C, but malloc() and free() are totally different in their syntax from new and delete in C++. Its possible to do it, but its easier said than done using C. Can you do dynamically linked node lists in GML ( unless something better exists ) ?

So because GMS exists for linux ( but not for red hat's fedora core versions, which is what I was using ), can you use a game executable ( made by GMS ) with redirection " << , < , > , >> " and pipes" | " via bash shell scripting?
 

jo-thijs

Member
Yes I am confused as to how floating point numbers work, it was never made clear to me that floating point values are not stored exactly as we read them or think of them in math. That was the reality check that stopped me in my tracks.
A computer has a memory to work with.
This memory is just a huge fixed-length sequence of bits.
A bit can be in 1 of two states: it can be 1 or it can be 0.
As a result, at any point in time, the computer memory is just a huge sequence of 0s and 1s.

If a computer wants to remember an integer, it needs to reserve some bits in its memory to represent this number.
For small numbers, reserving 8 bits might suffice.
We now need to represent a number as a sequence of 8 bits.
One way to do this, is to use the binary notation of the number as representation, padded with 0s to the left.
For example, 105 would be represented as 0111001.
Using this representation, we can represent any integer in the range 0-255, but only these numbers.

Now, you might be interested in using numbers with decimals.
You'll again need to reserve some bits in memory to represent this number.
Let's say we reserve 32 bits for it, then how can we use these 32 bits to represent such a number?
One way to do so, is to use a "fixed-point" representation for the numbers.
You would again use a binary representation for your number.
For example, 20.3 is in binay notation: 10100.01001 1001 1001 1001 1001 ...
The idea of a fixed point notation is that we only remember a fixe amount of binay digits in front of and behind the dot.
For example, we can decide to use 16 of the 32 bits to represent the integer part and the other 16 bits to represent the decimals.
The problem is however, that we can't represent 20.3 in this way, as we would need more than 16 bits (we need infinitely many bits) to represent it.
We solve this by not using 20.3, but the closest number to 20.3 that we can represent using our "fixed-point" notation.
In other words, we round towards 20.3 in binary representation to something we can represent.
So, we use the representation (already padded) 0000 0000 0001 0100 . 0100 1100 1100 1101
which corresponds to the number 20.3000030517578125.
There is still 1 issue in this representation: the fact that we use a dot to separate the integer part from the decimals and we can only use 1s and 0s.
Now, because we have a fixed amount of bits to represent the integer part and a fixed amount of bits to represent the decimals,
the dot will always appear at the same location (after the 16 bits used for the integer part).
As a result, we can just leave the dot away to get the following representation: 0000 0000 0001 0100 0100 1100 1100 1101.
Note that this is the reason why it is called "fixed-point" representation, because the dot always appears at the same location.

Using a fixed-point representation for numbers often leads to problems however.
For example, if you're a physicist, you might be working with both huge and tiny numbers at the same time.
For example, when working with gravity, you got a tiny gravitational constant, but huge masses (of planets).
You could use a different fixed-point representation for every number in your program,
but that'd end up becoming hugely impractical and it would make development of highly performant hardware more difficult.
You could also use a single fixed-point representation for all numbers,
but then you'd need a huge amount of bits to represent a single number and you'd be hugely wasting your memory.
Another example of a problem with fixed-point representations is that you're working with absolute errors instead of relative errors.
This means that if you design an algorithm, it be highly inaccurate (you can say worthless) for small numbers and be way more precise for large numbers than you needed it to be.
Clearly, another representation is needed and this is where "floating-point" representations come in.
They're based on the scientific notation for numbers where you write a significand (also known as mantissa) with a fixed precision, followed by a multiplication with a base to the power of an exponent.
For example: 20.3000 * 10^6 to represent 20300000 with an error of at most 50.
If we know which base is used out of context (most of the times 10), then we also write this as 20.3000 e 6 (where e stands for "with exponent").
We now use this same principle to represent numbers by a sequence of bits.
We first decide to use a fixed base and we choose as base 2 instead of 10.
This is because a base of 2 is much easier for a computer to work with.
We then represent the significand (20.3000 in the above example) using a fixed-point representation.
This means we choose a fixed amount of bits to represent the integer part and a fixed amount of bits to represent the decimals and no bits to represent the dot.
We then also represent the exponent (which is an integer) in the remaining bits.
For example, trying to represent 20.3e6 with 8 bits for the integer part, 8 bits for the decimals
and 16 bits for the exponent could look like this: 0001 0100 . 0100 1101 e 0000 0000 0000 0110
and we just leave away the dot and e for the final representation to obtain: 0001 0100 0100 1101 0000 0000 0000 0110.
The reason this is called a "floating-point" representation is because the exponent can move where the dot in non-scientific notation actually occurs with respect to the significand.
As a result, the dot "floats", meaning it moves.

There are still some issues with this representation however.
For example, we currently don't support negative numbers or negative exponents.
There are multiple ways to fix this, but usually the following approach is taken.
We reserve 1 bit to represent the sign of the number.
This bit is 0 for positive numbers and 1 for negative numbers.
We don't represent the sign of the exponent, but we use something called a "biased representation".
In a biased representation with bias B, you don't use the binary representation of your number to encode it in bits,
but you use the binary representation of your number + B to encode it in bits.
For example, to encode -11 in 8 bits with a bias of 128, it would look like this: 0111 0101, because this is the binary representation of 117.

Now, there exist infinitely many floating-point representations,
but when people refer to floating-point representations, they almost always refer to the IEEE standard.
This standard defines 2 main "floating-point" formats: single pecision and double precision.
Single precision uses 32 bits in total, 1 to represent a sign, 23 to represent the significand (excluding sign) and 8 to represent the exponent.
Double precision uses 64 bits in total, 1 to represent a sign, 52 to represent the significand (excluding sign) and 11 to represent the exponent.
In both cases the significand uses 0 bits to represent the integer part of the significand and all the bits for the decimals.
In C, the type "float" almost always corresponds to single precision IEEE floating-point numbers
and the type "double" almost always corresponds to double precision IEEE floating-point numbers.
The IEEE standard actually does a little bit more in their floating-point representation than I have explained above.
They make sure to optimize the precision, amount of representable numbers and efficiency for a computer to work with such numbers in their exact formats.
They also include 3 special values in their format that floating-point numbers can represent: +Inf, -Inf and NaN.
The value +Inf indicates the occurence of a positive overflow, meaning you're trying to represent a positive number which is too large to represent with the given amount of bits.
The value -Inf is analogous for a negative overflow (this can also occur for opeations like -1 / 0).
The value NaN means "Not a Number" and occurs if you perform an operation where your computer doesn't know what to do with, like 0 / 0.
For more information on how the IEEE standard works exactly, have a look at:
https://en.wikipedia.org/wiki/IEEE_754

So, knowing all of this, the reason they're not stored exactly as we read them is because they're stored in binary notation, to optimize memory usage.
The reason they don't work exactly like real numbers in math is beause of memory limitations.
We've only got a finite amount of bits to represent them (and to optimize hardware, we only use a fixed amount of bits),
which doesn't suffice to represent all real numbers.

Becuase I program in C ( as to avoid C++, which there is a reason - related to this topic ), Ive notice that GML does not use something called streams that are used in C or C++. What does GML use for the standard output , input, and error streams ?
The philosophy of GameMaker programs is more oriented towards having a game run in a self contained window,
reading inputs (loading settings and savegames) and writing outputs (saving settings and savegames) to local files
that are either hard coded in the program or are provided through an interface inside the program itself as opposed to being passed to it through a terminal.

You can still use parameter_string and parameter_count to get parameter input from the terminal.
You can use extensions that deal with input and output streams.
The function show_debug_message might be writing strings to the error or output stream (I'm not sure).
However, this isn't really something GameMaker deals with.
GameMaker deals with using the screen and audio for output and keyboard presses and mouse movement as input.
I feel like file handling comes closest to what you're seeking for:
GM:S 1.4: https://docs.yoyogames.com/source/dadiospice/002_reference/file handling/index.html
GM:S 2: https://docs2.yoyogames.com/source/_build/3_scripting/4_gml_reference/file handling/index.html

And one of the reasons that I do not program in C++ is because you have to do more work with printing floating point values in a format you want, than in C when your using the cout stream. In C, the convenience is using printf with the % specifiers, which lets you determine how many digits are displayed and the format of the digits with decimal that you want. But in C - the same stream is called stdout.
printf from the cstdio should work in C++ as well.
I just tested this:
Code:
#include <cstdio>
#include <iostream>
#include <string>

using namespace std;

main() {
    string s = "Did you know?";
    
    printf("10 / 3 is approximately %.5f. %s\n", 10.0 / 3.0, s.c_str());
    cout << "I think you did.\n";
    
    return 0;
}
and it woks perfectly fine for me.

Secondly, in regards to C++ I didn't understand how to build a dynamically linked node list. Its something I will have to relearn , again. I was going to try building a dynamically linked node it in C, but malloc() and free() are totally different in their syntax from new and delete in C++. Its possible to do it, but its easier said than done using C. Can you do dynamically linked node lists in GML ( unless something better exists ) ?
The way you created linked lists in C will probably be almost exactly the same as in C++.
How do you create linked lists in C?

As for linked lists in GML, you could do that (using some special array or object construction), but it is rarely done.
In GML you would probably end up using a ds_list instead.

So because GMS exists for linux ( but not for red hat's fedora core versions, which is what I was using ), can you use a game executable ( made by GMS ) with redirection " << , < , > , >> " and pipes" | " via bash shell scripting?
Redirection and pipes ae dealt with by the terminal / OS itself.
Whether this works in GML is equivalent to your earlier question about whether input and output streams work in GML.
You can make it work in GML, but it's not quite something GML deals with.
 
M

MishMash

Guest
As far as practical effects go, generally doing comparisons is not too bad, given you will likely be working with doubles, the margin of imprecision is rather small and encapsulated within the "Epsilon". That is, in practise, if you have a floating point comparison:

A == B

and both are supposed to be 10.0, the actual comparison done can be mathematically represented as A >= B-Epsilon && A <= B+Epsilon, where Epsilon is a value representing the upper bound for floating point error. For practical purposes, floating point numbers are generally fine, however I have run into a few issues namely with using them in admittedly dodgy areas such as array access.

I had an optimisation which cheated a little and used a bit of math to determine the index to look up in an array. The specific example worked as follows:
array_access[ floor( val) ];

where val was calculated to be 8001.2, and I wanted it to access element 8001 of the array. Interestingly, this caused a massive bug in our game because the actual process happening was as follows:

floor(8001.2) = 8000.9999999999; // This is because floor returns a floating point representation of the number, not an integral one.
array_access[8000.999999] does integer conversion and once again floors, but returns array_access[8000]. This meant that the wrong value was being returned and a whole plethora of wonky results were coming back. It took me ages to spot this, and was starting to drive me crazy. What was funny is that this did behave differently in YYC and regular windows export, i imagine due to how the internal representation is transferred between function calls, though given that the code was a bit wonky, i wouldnt expect the outcome to be well defined.
In practise, there were two ways of fixing this: #1 using the int64() function to force conversion before accessing in place of floor. #2, just put the floating point value directly into the array, as GM does that conversion internally anyway.

So what I would say to this is be careful when using floating point values as part of an array access, whether it be dividing position by a set value to index a cell on a grid. Whilst it seems like bad programming practise, it's actually better to NOT use floor and just have the floating point result directly, if you want the correct floored value every time.
You can alleviate this problem by working with integer conversions if working on a power of two grid, for example its quite common to have something like:

grid[# floor(x/16), floor(y/16)];

In this case, i'd replace it with either grid[# x/16, y/16] directly, or grid[# x>>4, y>>4]; to get an integer converted version. They can look a little more messy, but they do not exhibit as many problems. That is not to say that this is great practise by any means, but sometimes in a game where you have a floating point representation for position (as a result of having a need for floating point speeds), it is the best solution to an un-ideal problem problem.
 

jo-thijs

Member
floor(8001.2) = 8000.9999999999; // This is because floor returns a floating point representation of the number, not an integral one.
Every integer that is small enough in absolute value (and 8001 is definitely small enough) can be represented exactly as a floating point number.
To my knowledge, floor should always return exact results.
I would have no idea how you go that result.
Maybe a bug in some version of GameMaker?
Maybe it depends on the machine it runs on?

I just tested it and I didn't get this behavior.
 
M

MishMash

Guest
Every integer that is small enough in absolute value (and 8001 is definitely small enough) can be represented exactly as a floating point number.
To my knowledge, floor should always return exact results.
I would have no idea how you go that result.
Maybe a bug in some version of GameMaker?
Maybe it depends on the machine it runs on?

I just tested it and I didn't get this behavior.
Yeah not sure how the bug happened to be honest, all I know is that it definitely was happening, as I was printing out the end values (using string format) before passing them into the array (after conversion to int). The actual values I quoted here may not be entirely accurate, it was a few months ago as well. I entirely agree with the definition that a float should be representable (as you should be able to just have an exponent of nothing).

I tried re-creating it just now and can't, would have to dig out the older version of a project to find it.
 
I apologize for the late reply....

A computer has a memory to work with.

So, knowing all of this, the reason they're not stored exactly as we read them is because they're stored in binary notation, to optimize memory usage.
The reason they don't work exactly like real numbers in math is beause of memory limitations.
We've only got a finite amount of bits to represent them (and to optimize hardware, we only use a fixed amount of bits),
which doesn't suffice to represent all real numbers.


The philosophy of GameMaker programs is more oriented towards having a game run in a self contained window,
reading inputs (loading settings and savegames) and writing outputs (saving settings and savegames) to local files
that are either hard coded in the program or are provided through an interface inside the program itself as opposed to being passed to it through a terminal.

You can still use parameter_string and parameter_count to get parameter input from the terminal.
You can use extensions that deal with input and output streams.
The function show_debug_message might be writing strings to the error or output stream (I'm not sure).
However, this isn't really something GameMaker deals with.
GameMaker deals with using the screen and audio for output and keyboard presses and mouse movement as input.
I feel like file handling comes closest to what you're seeking for:
GM:S 1.4: https://docs.yoyogames.com/source/dadiospice/002_reference/file handling/index.html
GM:S 2: https://docs2.yoyogames.com/source/_build/3_scripting/4_gml_reference/file handling/index.html


printf from the cstdio should work in C++ as well.
I just tested this:
Code:
#include <cstdio>
#include <iostream>
#include <string>

using namespace std;

main() {
    string s = "Did you know?";
  
    printf("10 / 3 is approximately %.5f. %s\n", 10.0 / 3.0, s.c_str());
    cout << "I think you did.\n";
  
    return 0;
}
and it woks perfectly fine for me.
This part makes sense but then you have to read what your value is , according to how the computer writes it to determine two other values
as a high and low value to prove that your value is equivalent in a conditional that's in the middle of those two values.

The way you created linked lists in C will probably be almost exactly the same as in C++.
How do you create linked lists in C?
Ive never used linked lists in C. It might be near impossible. My best guess is, you have to create two identical structs ( with the invention of typedef ), one is used for creating a pointer type, that is used in the second struct which is a copy, which might not work. Then you have to deal with malloc() and free(), in the form of a recursivel function. That recursive function has to have a static variable that counts how many times you use it so that when you go from the tail back to the head of the linked node list , all the memory locations are freed. You have to use another function to traverse the node list.

Normally I used static arrays of structs in C. Dynamically allocated memory requires you to search through all the nodes from the head , to the tail - to find what your looking for, because dynamically allocated memory is not contiguous like an array.

However , I like your explanation of floating point values in memory. I am going to save it for an offline reference - because then I can save myself the trouble. My instruction of floating point values starts where you talk about values being converted to a scientific notation and then IEEE.

There is another problem which is not covered , which my professor at the time made a big deal about is that CPUs such as Intel use something called little-endian ( Motorola 68000 based CPUs at the time, used big-endian ) which changes the way the value is written and read from memory in binary. Now I have no idea what AMD uses or what the current standard is now. It was not mentioned if this changes the way floating point values are written and read by the computer according to IEEE. But we never got to that discussion - because our professor found the topic to be outside the scope of the class, more of a assembler topic ( like stack and heap ).

Thank you again.
 

jo-thijs

Member
This part makes sense but then you have to read what your value is , according to how the computer writes it to determine two other values
as a high and low value to prove that your value is equivalent in a conditional that's in the middle of those two values.
Comparing 2 floating point values in C will be compiled to whatever the assembly language it is compiled to supports.
If the assembly language complies with the IEEE standard, then it performs the check A == B as follows:
Is either A or B one of the values +Inf, -Inf or NaN?
If so, return false.
Otherwise, are A and B both either 0 or -0?
If so, return true.
Otherwise, compare the bit sequence of A in memory with the bit sequence of B
and return whether those 2 are equal.

As you can see, the general case is checking that both numbers have the exact same representation.
The numbers 20.001 and 0.001 can both not be exactly represented by floating point representations.
Because we use a fixed size for the significand and 20.001 has more digits before 0.001,
we will represent 0.001 more precisely than 20.001.
If we subtract 20 from 20.001, we will not magically get this precision back, so 20.001 - 20 will be represented less precisely than 0.001.
This means that they both have a different bit sequence and hence will be considered different.

Your suggestion is to convert the binary representtion of the floating point numbers to a decimal one and round off the numbers to a cetain amount of digits, then compare this representation.
This has a couple of issues.

First of all, the C code you suggested in the past (in the link you provided) contains a bug:
Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
      double x = 0.0010;
      char *strd1;
  
      sprintf(strd1,"%.4f",x);

      if (strcmp(strd1,"0.0010") == 0)
      {
          printf("hello world!\n");
      }

      return 0;

}
The function "sprintf" expects you to have allocated some memory for the result before you call it.
Otherwise, it will write the result over other stuff of your program or even worse, over other stuff of other programs.
You never allocated any space for the result, you just made "strd1" point to a random location in memory (the memory of your program and other programs).
So, the result of "sprintf" is stored at some unsafe location where it can mess up your and other programs.
Nowadays, computer achitecture is advanced enough to notice most of the cases where you do something like this
and will throw an exception when this occurs.
I just ran that piece of code and that's exactly what happened, I didn't get any result, but an exception instead,
indicating the program was trying to write in the memory of other programs.
However, you mentioned this worked for you, which is a bit concerning for your computer honestly.
A "solution" to this bug would be to replace this:
Code:
      char *strd1;
with this:
Code:
      char strd1[7];
so that you've allocated 7 characters for "strd1",
which is the minimal amount you require in this case.
This would still be an unsafe solution however, as you'd still get the same bug
when the number "x" needs more than 7 characters to be represented.
You could solve this by checking if "x > 9.9999 || x < -9.9999" beforehand and only use "sprintf" if it returned false.
However, that would give you a solution that only works for comparing values to constants.
To generalize for non-constant values, you could calculate the amount of characters you'll need by taking the logarithm (base 10) of your numbers and rouning upwards.
Of course, this would not work for negative numbers, for 0, for +inf, -inf and Nan, for numbers that are too small, ...
You'd also need to change the static memory allocation to a dynamic memory allocation with malloc and friends.
This is all a lot of hassle to go through and even then, you'll still be left with the other disadvantages of this method I explain below.

Now, all of the above was just about a bug in your specific implementation, but that doesn't mean your approach is bad.
Below I mention flaws of your approach in general.

First of all, when comparing 0.0 with -0.0, your approach would say they're different, while you'd expect them to be equal.
Analogously, when comparing +Inf with +Inf, -Inf with -Inf or NaN with NaN, your approach would say they're equal, while you'd expect them to be different.

Secondly, your approach is highly inefficient both time and memory wise.
Suppose you're comparing -1e1000 with 1e-100 with a prcision of 105 decimals behind the dot.
You'd need 1217 characters for a comparison, which is more than 2 kilobytes.
This while a double precision floating point value only takes up 8 bytes of memory.
This is a pretty big memory difference, but it isn't the worst yet.
The worst part is the time consumption.
First of all, you put time into calculating the amount of memory you'll need.
Then you put time into allocating this memory.
These 2 will most of the time be pretty insignificant.
However, then you require time to calculate each digit.
This requires a floating point multiplication, rounding and subtraction, an integer addition and an integer decrement and a memory write operation for evey digit,
to not mention dealing with negative values, with zeros, with infinities, with NaNs and dealing with finding the position of the dot without causing overflow or underflow in the process.
This is already quite expensive, but then you also need to compare all digit individually, until a difference is observed.
In the worst case, when both numbers are equal (which is expected to occur a lot),
this requires a memory read operation, 2 integer comparisons and an integer increment for every character,
to not mention the conditional jumps in the process.
This is a very large time consumption for an operation you'll use as often as a number equality check.

Thirdly, the result of your approach (when it produces correct results) is equivalent to the results of fuzzy comparison.
To compare 20.0010 - 20 with 0.0010, you would do this in C:
Code:
#include <stdlib.h>

int main() {
    double a = 20.0010;
    int x = (int)a;
    double y = a - (double)x;
    
    double precision = 0.00005;
    double z = 0.0010;
    
    if (abs(y - z) <= precision) {
        // They're equal
    } else {
        // They're not equal
    }
This has the advantage of also behaving correctly for 0, -0, +Inf, -Inf and NaN.
It doesn't use any memory, except for a single floating point register to keep track of intermediate results (but these registers are separate from your working memoy, so they don't count towards memoy use)
and if you implement "precision" as a variable (like I did in my example), an additional 8 bytes of memory.
Compare this to the potential 2 kilobytes of memory usage of your approach.
Timewise, a fuzzy comparison is crazy fast.
It takes only about 2 to 3 times the time of a standard C equality check (ignoring the memory read for "precision" if you implement it as a variable).
Compare this to the potential thousands and thousands of times longer your approach takes.

Ive never used linked lists in C. It might be near impossible. My best guess is, you have to create two identical structs ( with the invention of typedef ), one is used for creating a pointer type, that is used in the second struct which is a copy, which might not work. Then you have to deal with malloc() and free(), in the form of a recursivel function. That recursive function has to have a static variable that counts how many times you use it so that when you go from the tail back to the head of the linked node list , all the memory locations are freed. You have to use another function to traverse the node list.
A list is a generic datatype.
You can have a list of integers or a list of floating point numbers or a list of characters or a list of lists and so on.
In general, you can have a list of values of a datatype A.
In C, you pretty much have to implement a new linked list for every different type A you're using.
A linked list implementation could look as follows:
Code:
#include<stdlib.h>

typedef struct linked_list_of_A_node* linked_list_of_A;

struct linked_list_of_A_node {
    struct linked_list_of_A_node* next;
    A value;
};

linked_list_of_A empty_linked_list_of_A() {
    return (linked_list_of_A_node*) 0;
}

int is_linked_list_of_A_empty(linked_list_of_A list) {
    return !list;
}

int append_linked_list_of_A(linked_list_of_A* list, A value) {
    linked_list_of_A_node* new_node = malloc(sizeof(linked_list_of_A_node));
    
    if (!new_node) {
        return 1; // Failure, out of memory!
    }
    
    new_node->next = (linked_list_of_A*) 0;
    new_node->value = value;
    
    if (is_linked_list_of_A_empty(*list)) {
        *list = new_node;
    } else {
        struct linked_list_of_A_node* current_node = *list;
        
        while (current_node->next) {
            current_node = current_node->next;
        }
        
        current_node->next = new_node;
    }
    
    return 0; // Success
}

int insert_in_linked_list_of_A(linked_list_of_A* list, unsigned int index, A value) {
    linked_list_of_A_node* new_node = malloc(sizeof(linked_list_of_A_node));
    
    if (!new_node) {
        return 1; // Failure, out of memory!
    }
    
    new_node->value = value;
    
    if (!index) {
        new_node->next = *list;
        *list = new_node;
    } else if (is_linked_list_of_A_empty(*list)) {
        free(new_node);
        return 2; // Failure, out of bounds!
    } else {
        struct linked_list_of_A_node* current_node = *list;
        
        while (--index) {
            if (current_node->next) {
                current_node = current_node->next;
            } else {
                free(new_node);
                return 2; // Failure, out of bounds!
            }
        }
        
        new_node->next = current_node->next;
        current_node->next = new_node;
    }
    
    return 0; // Success
}

int prepend_linked_list_of_A(linked_list_of_A* list, A value) {
    struct linked_list_of_A_node* new_node = malloc(sizeof(linked_list_of_A_node));
    
    if (!new_node) {
        return 1; // Failure, out of memory!
    }
    
    new_node->next = (linked_list_of_A*) *list;
    new_node->value = value;
    
    *list = new_node;
    
    return 0; // Success
}

struct linked_list_of_A_node* get_node_from_linked_list_of_A_at_index(linked_list_of_A list, unsigned int index) {
    if (is_linked_list_of_A_empty(list)) {
        return (struct linked_list_of_A_node*) 0;
    } else {
        struct linked_list_of_A_node* current_node = list;
        
        while (index--) {
            if (current_node->next) {
                current_node = current_node->next;
            } else {
                return (struct linked_list_of_A_node*) 0;
            }
        }
        
        return current_node;
    }
}

unsigned int get_linked_list_of_A_length(linked_list_of_A list) {
    unsignd int result = 0;
    
    while (list) {
        ++result;
        list = list->next;
    }
    
    return result;
}

...
One way to generalize linked lists to not be bound by a specific type A is to choose the type void* for A.
This will make it so every linked list node contains a pointer to any value of any type, rather than the value itself, so the type doesn't need to be known in advance.

In C this can be generalized more elegantly by using templates.

Normally I used static arrays of structs in C. Dynamically allocated memory requires you to search through all the nodes from the head , to the tail - to find what your looking for, because dynamically allocated memory is not contiguous like an array.
Correct.

There are usually not many benefits to using linked lists.
Linked lists tend to barely save any memory when compared to dynamic arrays (assuming the values in the list are pointers)
and they tend to be slower (less cahe hits, more memory access, no indexing, ...).

There is another problem which is not covered , which my professor at the time made a big deal about is that CPUs such as Intel use something called little-endian ( Motorola 68000 based CPUs at the time, used big-endian ) which changes the way the value is written and read from memory in binary. Now I have no idea what AMD uses or what the current standard is now. It was not mentioned if this changes the way floating point values are written and read by the computer according to IEEE. But we never got to that discussion - because our professor found the topic to be outside the scope of the class, more of a assembler topic ( like stack and heap ).
Big-endian is what we're used to, we write the digit of a number with a higher wheight/significance before others.
For example, in 20.3 the 2 has a wheight of 10, the 0 has a wheight of 1 and the 3 has a wheight of 0.1, so we first write 2, then 0 and finally 3.
It's called "big-endian", because we start from the big end of the number.
Little-endian reverses this, so we would write 20.3 as 3.02.

To my knowledge, the IEEE standard doesn't specify which endianness should be used.
It would change how the values are written to and read from memory,
but you as a programmer would not notice anything from this (unless you start messing around with the individual bytes or bits of memory used by a floating point number, which you normally shouldn't do).
 
Comparing 2 floating point values in C will be compiled to whatever the assembly language it is compiled
A list is a generic datatype.
You can have a list of integers or a list of floating point numbers or a list of characters or a list of lists and so on.
In general, you can have a list of values of a datatype A.
In C, you pretty much have to implement a new linked list for every different type A you're using.
A linked list implementation could look as follows:

Code:
#include<stdlib.h>

typedef struct linked_list_of_A_node* linked_list_of_A;

.....
.....

int append_linked_list_of_A(linked_list_of_A* list, A value) {
  
    linked_list_of_A_node* new_node = malloc(sizeof(linked_list_of_A_node));
 
    if (!new_node) {
        return 1; // Failure, out of memory!
     }
 
    new_node->next = (linked_list_of_A*) 0;
    new_node->value = value;
 
    if (is_linked_list_of_A_empty(*list)) {
        *list = new_node;
    } else {
        struct linked_list_of_A_node* current_node = *list;
     
        while (current_node->next) {
            current_node = current_node->next;
        }
     
        current_node->next = new_node;
    }
 
    return 0; // Success
}

...
In C you have to cast type malloc() assignments to pointers of their type ( link_list_of_A*) , and you have to test the pointer for NULL, not false ( or 0 ) as you have done, to test if malloc() failed. malloc() returns a void type pointer that's why you have to cast type. Secondly, you can not assign 0 to a pointer type as you did for new_node, you need to use NULL.

Does GML use a "NULL" value in its language for anything ?
 

jo-thijs

Member
In C you have to cast type malloc() assignments to pointers of their type ( link_list_of_A*) ,
Correct, that's a mistake of mine.

and you have to test the pointer for NULL, not false ( or 0 ) as you have done, to test if malloc() failed. malloc() returns a void type pointer that's why you have to cast type.
For optimized compatibility, I should indeed.
However, in all cases I've ever seen, NULL is defined as a pointer cast from 0, like: "(void*) 0".
This would make comparing to false (or 0) work as well.
I didn't use NULL as I didn't feel like including a NULL definition macro in my example.
I forgot that stdlib.h included this macro already.
So, you're right, best practice would have been to use NULL.

Secondly, you can not assign 0 to a pointer type as you did for new_node, you need to use NULL.
Same as above, I shouldn't, but I can.

Does GML use a "NULL" value in its language for anything ?
This is debatable.

GameMaker has the constant pointer_null, which is effectively a NULL pointer,
but you can't use it for anything, besides passing it to and retrieving it from DLLs in extensions for GameMaker.
Otherwise, this kind of NULL pointer has no meaning or use in GML.

When referencing instances, you can reference them through an instance ID, an object ID or a keyword.
One of the keywords is "noone", which refers to the colletion of no instances.
This is in a lot of ways similar to NULL pointers.

GameMaker also has the value "undefined", which is used to indicate failure of certain functions.
For example, reading a value from a datastructure map from a key that the map doesn't contain will throw undefined (as opposed to an "invalid key" error).
This is in some ways also similar to how NULL pointers are sometimes used in languages like C.

So I guess GameMaker does have something like NULL pointers,
but they feel different in GameMaker than actual NULL pointers in C and such.
As an example to illustrate why:
when a function requests or returns a reference to instances,
the odds are a with-structure is used to iterate over all the instances that are referenced.
Using "with noone { ... }" will just not execute the code inside the block,
which is pretty much always what you want to happen anyway.
As a result, you don't have to perform checks for noone or do anything special.
When converting a reference to instances to a boolean, "noone" will become "false" and instance IDs will become "true",
which also decreases the explicit use of "noone".
You end up using "noone" almost only for initialization of variables (and even then not often) or to return as result of a script.
 
Thank you for the explanation you have been giving me. I think this is the best explanation I've ever had for understanding floating point values, to help me know how to use them in C or GML. As a result, a whole bunch of things that I have forgotten have come back to haunt me in programming C that I never needed that I am now remembering. I am praying it wont be the same for me when I am programming in GML.
 
Top