GameMaker Arrays in event are also copied if changed?

samspade

Member
I have arrays that hold other arrays. When I pull out the sub-arrays and modify them inside the step event (or other events), I get the same copy behavior that happens when you pass an array by reference to a script.

Simple example to illustrate:

Code:
///step event (or other event)
array = [[0, 1], ["hello", "world"]];

var sub_a = array[0];
var sub_b = array[1];

sub_a[0] = 1;
sub_a[1] += 1;
sub_b[0] = "goodbye";
If you run this in the debugger, you can see that sub_a and sub_b refer to the same arrays as the internal ones when first assigned, but as soon as you modify a value, the array they point to changes. So you are not modifying the internal array.

However, if you do this:

Code:
///step event (or other event)
array = [[0, 1], ["hello", "world"]];

var sub_a = array[0];
var sub_b = array[1];

sub_a[@ 0] = 1;
sub_a[@ 1] += 1;
sub_b[@ 0] = "goodbye";
Then you do modify the internal arrays. This will also happen with instance and global scoping.

This is expected behavior in a script, but reading the manual I don't see anywhere that says it will also happen inside an event.

Is this expected behavior for events? And if so is it in the manual and I'm just missing it?
 

TsukaYuriko

Forum Staff
Moderator
Where does the manual mention that this applies to scripts? I suspect that section might be awkwardly worded, because this behavior applies globally (non-accessor writes trigger copy on write).
 

samspade

Member
TsukaYuriko said:
Where does the manual mention that this applies to scripts? I suspect that section might be awkwardly worded, because this behavior applies globally (non-accessor writes trigger copy on write).

These are the two passages from the manual I was going by:

Just like normal variables, you can pass arrays through to scripts to be used and then returned to the instance that called the script. To do this, you simply have to specify the array variable (no need for each of the individual entries, nor the [] brackets) and the array will be passed by reference into the script. However, should you change any of the array values, the array will be copied into a temporary array just for the script. Note the use of the word temporary here! You are not actually passing the array itself into the script (as you would a variable), but instead you are requesting that the script create a copy of this array, which you will change in the script. This means that you must always return the array from the script if you wish to change any array values.

NOTE: Due to the way that this works internally, passing arrays to scripts may affect performance, especially if the array is very large. So use this functionality with care!


Arrays also have their own accessors which works in a similar way as those listed above for data structures. however array accessors have an interesting property and that is to permit you to modify an array from a script without having to copy it. When you pass an array into a script, it is passed by reference, meaning that the array itself isn't being copied into the script but rather it is simply being referenced to get the data. Normally if you then need to change the array, it would be copied to the script and then you would need to pass back the copied array for the original array to be updated. This can have costly processing overheads, and you can use the accessor instead, as that will change the original array directly without the need for it to be copied. You can see how this works in the examples below.

Both of those statements imply this behavior only applies in scripts.

And maybe I'm missing something, but if non-accessor writes trigger copy on write, why doesn't this trigger a copy?

Code:
array = [[0, 1], ["hello", "world"], "test"];
array[2] = "tested";
Experimenting a little:

Code:
array_a = ["test"];
array_b = array_a;
array_a[0] = "tested";
array_b[0] = "really tested";
This creates a single array and assigns it to both array_a and array_b. Changing array_a[0] makes array_a point to a new array. However, modifying array_b[0] afterwards does not create a new array.

So GM must track how many references there are to a single array and if there is more than one, writing to that array (without the accessor) copies the array and creates a new one. If there is only one, it doesn't.
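
One way to probe that guess is to read the other variable after each kind of write. This is only a sketch: array_c and array_d are names made up for illustration, and the comments state what the reference-counting hypothesis predicts, not verified output.

```gml
/// Any event (sketch only)
array_a = ["test"];
array_b = array_a;                // two variables now refer to one array

array_a[0] = "tested";            // plain write while the array is shared
show_debug_message(array_b[0]);   // still "test" if array_a was copied on this write

array_c = ["test"];
array_d = array_c;                // shared again, using a fresh pair of variables

array_d[@ 0] = "tested";          // accessor write, which should not copy
show_debug_message(array_c[0]);   // "tested" if both variables still share one array
```

If the first message shows the old value and the second shows the new one, that matches what the debugger showed in the examples above.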
 

TsukaYuriko

Forum Staff
Moderator
I believe the intention here is not to say that this behavior only applies in scripts. It is to highlight that, when an array is passed to a script, the argument variable stores a mere reference to it, and the original can then be modified via the accessor if one wishes to change the original array rather than create a copy of it. That is in contrast to not passing the array at all and just using it directly: any script automatically runs under the scope of the instance that calls it, and therefore has that instance's variables, including the original array.
 

samspade

Member
TsukaYuriko said:
I believe the intention here is not to say that this behavior only applies in scripts. It is to highlight that, when an array is passed to a script, the argument variable stores a mere reference to it, and the original can then be modified via the accessor if one wishes to change the original array rather than create a copy of it. That is in contrast to not passing the array at all and just using it directly: any script automatically runs under the scope of the instance that calls it, and therefore has that instance's variables, including the original array.

Except that it can't be just a mere reference, or you wouldn't get the behavior described in my second example, where modifying the 'original' variable copies the array and modifying the 'second' variable does not, because by then it is the only remaining reference. I think GML is just tracking how many references there are to an array, and only copies on write if there is more than one (which is what happens when the reference is passed through an argument). I think I can show that with the following:

Code:
// Step event
array_a = ["test", "hello world"];
array_b = skip_copy("array_a", array_a, "tested");

// Script: skip_copy(variable_name_as_string, array, string)
variable_instance_set(id, argument0, 0); // overwrite the instance variable, dropping its reference to the array
argument1[0] = argument2;                // plain write, no accessor
return argument1;
If you run this in the debugger, you can see that this will NOT copy array_a to a new array. It directly modifies the original array, despite it being passed into a script, and returns that array. I think this only makes sense if references are being counted: variable_instance_set removes the 'second' reference, so the argument variable is now the only reference to the array, and the array can be modified without triggering the 'copy on write' behavior that theoretically exists inside scripts.

Given that arrays are garbage collected it seems likely that GML is tracking how many references to them exist and using this behavior for the copy on write - which seems like it should really be renamed to 'copy on write only if more than one reference to the array exists'.
 