
GMS 2.3+ json_encode or json_stringify ?

DaveInDev

Member
Hi,
I am in the process of saving complex data (nested maps and lists) to disk and while reading the manual, in the json_encode section, I see :
"IMPORTANT! This function - while still valid - has been superceeded by the function json_stringify , and we recommend that you only use this function for legacy support."

which seems strange to me because they do not work on the same type of data.
It seems that json_encode works on ds_maps with nested maps/lists, while json_stringify works only on nested structs and arrays.
So if I want to save nested ds_map + ds_list, my only solution remains json_encode, with the "mark" functions, no ?
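For reference, the ds-based route with the "mark" functions could look roughly like the sketch below. It is only a minimal illustration: the keys (name, scores, mission, hiscore) are made up for the example and are not taken from a real project.

```gml
// Minimal sketch: a nested ds_map / ds_list tree saved through json_encode.
// All key names are hypothetical.
function save_nested_ds_example()
{
    var root   = ds_map_create();
    var scores = ds_list_create();
    var entry  = ds_map_create();

    ds_map_add(entry, "mission", 3);
    ds_map_add(entry, "hiscore", 1200);

    ds_list_add(scores, entry);
    // Mark the last list item as a map id, so json_encode nests it
    // instead of writing it as a plain number.
    ds_list_mark_as_map(scores, ds_list_size(scores) - 1);

    ds_map_add(root, "name", "john");
    // ds_map_add_list adds the list and marks it as a nested list.
    ds_map_add_list(root, "scores", scores);

    var json = json_encode(root);

    // Per the manual, destroying the root also frees the marked children.
    ds_map_destroy(root);

    return json;
}
```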
 

TsukaYuriko

🌠
Forum Staff
Moderator
Correct.

The data structure tree you are currently using has been spiritually superseded by structs and the array rehash, as they tend to be easier to work with than their data structure tree alternatives. The two functions work on different types of data, and the type of data format you're using is no longer the recommended one.
 

FoxyOfJungle

Kazan Games
I would say that ds_lists and ds_maps are almost like arrays and structs: anything you can do with ds_maps and ds_lists, you can also do with structs and arrays. Study the way structs and arrays work and you will be happy. Then, use json_stringify() and json_parse().
I can guarantee that it is the best thing to migrate because after these new functions came up, I didn't want to go back, because it is worth it. Besides, you have more control and things get more organized.
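As a rough illustration of that workflow, here is a minimal sketch of saving and loading a struct/array tree. The file name and the field names (users, uid, name) are assumptions made for the example.

```gml
// Minimal sketch: structs and arrays saved with json_stringify and
// loaded back with json_parse. File and field names are hypothetical.
function save_and_load_example()
{
    var data = {
        users: [
            { uid: 0,  name: "anonymous" },
            { uid: 99, name: "john" }
        ]
    };

    // Save the whole tree as one JSON string.
    var file = file_text_open_write("save_example.json");
    file_text_write_string(file, json_stringify(data));
    file_text_close(file);

    // Load it back, reading line by line in case the file has several lines.
    var json = "";
    file = file_text_open_read("save_example.json");
    while (!file_text_eof(file))
    {
        json += file_text_read_string(file);
        file_text_readln(file);
    }
    file_text_close(file);

    // json_parse returns nested structs and arrays again.
    var loaded = json_parse(json);
    return loaded.users[1].name;
}
```

No "mark" calls and no destroy calls are needed here, which is most of the appeal.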
 

DaveInDev

Member
That's not the first time I read that maps/lists can be replaced by structs/arrays, and I see the advantage of json_stringify, but I'm still doubtful. I must be missing something.
Should every list become an array and every map a struct ?

An example : for the moment, I have
- a list of users, where each entry is a map { num_id_user, str_name, num_age, etc... }
- a list of missions, where each entry is a map { num_id_mission, str_name_mission, other mixed variables that described the mission }
- a list of hiscores_per_user that records every hiscore for every mission and every user. Each entry is a map, containing a list of maps
{ num_id_user : [ {num_id_mission3 : hiscore_user_mission3 }, {num_id_mission1 : hiscore_user_mission1 }, etc... ] }

should the map of users become an array of structs ? But then, I do not have access to handy accessors to quickly find the name of the user with id 99, do I ?
same with other more complex nested array/structs like the hiscores per user and mission ?

it also seems that arrays cannot be passed in a function by reference, as maps can be, involving a heavy copy. So should we avoid using them as function parameters ?
But what if I want to write a function that deals with the hiscores of one specific mission of one specific user ? Or a function that makes calculations on one generic mission ?

By the way, is it really efficient to manipulate big arrays like lists ? When you insert an item in the middle of the array, is it as efficient as inserting in a list ? (in C++ it's not because you copy all the right part of the array...)

All this is not so clear and equivalent for me.
 

TsukaYuriko

🌠
Forum Staff
Moderator
should the map of users become an array of structs ?
In general: Maps become structs. Lists become arrays.

But then, I do not have access to handy accessors to quickly find the name of the user with id 99, do I ?
same with other more complex nested array/structs like the hiscores per user and mission ?
It doesn't get much simpler than return users[user_id].name;... compare that to return ds_map_find_value(ds_list_find_value(users, user_id), "name"); or even return users[| user_id][? "name"];.

it also seems that arrays cannot be passed in a function by reference, as maps can be, involving a heavy copy. So should we avoid using them as function parameters ?
You got that backwards.
Maps are the ones that aren't passed by reference. You're not passing around maps, but IDs of maps, then accessing the map via its ID. Those IDs are passed by value.
Arrays, on the contrary, are passed by reference and copied on write unless otherwise specified. You can access the original array via the array accessor @.
Both can be passed to functions just fine.

By the way, is it really efficient to manipulate big arrays like lists ? When you insert an item in the middle of the array, is it as efficient as inserting in a list ? (in C++ it's not because you copy all the right part of the array...)
You can test this yourself by running a benchmark using the profiler, comparing both methods side by side.
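Besides the profiler, a crude timing comparison can be put together with get_timer(). The sketch below is only a starting point: the iteration count is arbitrary, and results will vary by platform and runtime version.

```gml
// Rough timing sketch: repeated inserts into the middle of an array
// versus a ds_list. The count is an arbitrary choice.
function benchmark_middle_insert()
{
    var count = 10000;

    var arr = [];
    var t0 = get_timer();
    for (var i = 0; i < count; i++)
    {
        // Insert at the middle of the array.
        array_insert(arr, array_length(arr) div 2, i);
    }
    var array_time = get_timer() - t0;

    var list = ds_list_create();
    t0 = get_timer();
    for (var j = 0; j < count; j++)
    {
        // Insert at the middle of the list.
        ds_list_insert(list, ds_list_size(list) div 2, j);
    }
    var list_time = get_timer() - t0;
    ds_list_destroy(list);

    // get_timer() reports microseconds.
    show_debug_message("array: " + string(array_time) + " us, list: " + string(list_time) + " us");
}
```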
 

DaveInDev

Member
It doesn't get much simpler than return users[user_id].name;... compare that to return ds_map_find_value(ds_list_find_value(users, user_id), "name"); or even return users[| user_id][? "name"];.
but my user ids are not sequential. They could be :
JSON:
[
    {
        "name": "anonymous",
        "id": 0
    },
    {
        "name": "john",
        "id": 99
    },
    {
        "name": "jim",
        "id": 155
    }
]
so if I loaded this JSON with json_parse and I want to retrieve the name of the user with ID 99, I have to parse the whole array, right ? With a generic function like :
GML:
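/// Returns the first struct in arr whose variable `key` equals val, or -1 if none matches.
function array_find_struct_by_key(arr, key, val)
{
    var n = array_length(arr);

    for (var i = 0; i < n; i++)
    {
        // Skip entries that do not have this key at all.
        if (variable_struct_exists(arr[i], key))
        {
            if (variable_struct_get(arr[i], key) == val)
                return arr[i];
        }
    }

    // Nothing found.
    return -1;
}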
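If scanning the array on every lookup ever becomes a concern, one possible alternative is to build a lookup struct once after loading, keyed by the id converted to a string. This is only a sketch: array_build_index is a hypothetical helper, not a built-in function.

```gml
// Hypothetical helper: builds a struct that maps each value of `key`
// (converted to a string) to the struct that holds it.
function array_build_index(arr, key)
{
    var index = {};
    var n = array_length(arr);

    for (var i = 0; i < n; i++)
    {
        var item = arr[i];
        if (is_struct(item) && variable_struct_exists(item, key))
        {
            // Struct keys are strings, so numeric ids are converted first.
            index[$ string(variable_struct_get(item, key))] = item;
        }
    }

    return index;
}

// Usage, assuming `users` is the array returned by json_parse:
// var by_id = array_build_index(users, "id");
// var john  = by_id[$ "99"];   // undefined if there is no such id
```

The index holds references to the same structs as the array, so it costs one extra struct rather than a copy of the data.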
Maps are the ones that aren't passed by reference. You're not passing around maps, but IDs of maps, then accessing the map via its ID.
Maybe my English is bad. For me, passing an ID, is like passing a pointer, so it is "by reference" : you do not duplicate the data : you work on the same data, no ?
Anyway, looking at the debugger array IDs, it seems that, as you said, the arrays keep the same ID when they are passed inside functions (like the function below).

Arrays, on the contrary, are passed by reference and copied on write unless otherwise specified. You can access the original array via the array accessor @.
Both can be passed to functions just fine.
"are passed by reference and copied on write" : I do not get this one... If you pass them by reference(pointers) then if you modify the reference, you modify the original data, no ? So what is the need of this @ accessor ? I am used to C++ and the byref / by val difference induced by the "&" symbol, and I suspect that GMS works differently and that's why I'm confused...

If you look at my function above, will it return a struct that is a reference to the original data ? If I modify the returned struct, will it modify the original struct in the original array ? That's what my test shows anyway... And if I modify arr in the function (pushing an item at the end of the array, or changing some key in one struct), it seems that the original array (outside the function) is modified too... So what is this "copied on write" ?

example of modified function that changes the original array :
GML:
function array_find_struct_by_key(arr,key,val)
{
    array_push(arr,{lang: "de",menu_title : "title"});

    var i=0, l = array_length(arr), tmp;
   
    while(i<l)
    {
        if(variable_struct_exists(arr[i],key))
        {
            tmp = variable_struct_get(arr[i],key);
            variable_struct_set(arr[i],key,tmp+"2");
            if( tmp == val )
                return(arr[i]);
        }
        i++;
    }
   
    return(-1);
}
In this other simple example, the array is modified... And I do not use @ ... I'm lost 😭
GML:
function test(arr)
{
    var i, l = array_length(arr);
   
    for(i=0;i<l;i++)
        arr[i] *= 2;
}

arr = [0,1,2,3];
show_debug_message(arr);   // [0,1,2,3]

test(arr);

show_debug_message(arr);  // [0,2,4,6];
 

DaveInDev

Member
In the last example of my previous post, I missed something : in GMS, you'd better not name the function parameters with the same name as in the function call when they are in the same scope... That's why it worked, despite what is specified in the manual...

So in fact, if I choose separate names, it works as intended: the array passed as a parameter (by ref) is somehow protected, and the @ is needed if you want to write into it, otherwise you get a runtime error. It's some kind of safeguard, I suppose.

GML:
function autodouble(arr)
{
    var i, l = array_length(arr);

    for(i=0;i<l;i++)
        arr[@i] *= 2; // if no @ then runtime error
}

function sum(arr)
{
    var i, l = array_length(arr);
    var tot = 0;

    for(i=0;i<l;i++)
        tot += arr[i];
       
    return(tot);
}

a = [0,1,2,3];
show_debug_message(a); // 0 1 2 3

autodouble(a);

show_debug_message(a); // 0 2 4 6

show_debug_message(sum(a)); // 12
 

DaveInDev

Member
BUT... if your array is made out of structs, you can modify the structs without using the @. That's why it was working in my function that was dealing with an array of structs....
It's a little bit clearer now, but it was not obvious to understand as this mixed byref/byval behaviour is not like any other programming language I know ;)
In fact it's mainly a byref behaviour, with a little protection to avoid directly changing the overall value of an index in the array, hence mimicking a byval behaviour.

Here is a simple example that changes the content of the items of the array, without using the @ :

GML:
function autodoublestruct(arr)
{
    var i, l = array_length(arr);

    for(i=0;i<l;i++)
        arr[i].x *= 2;    // no need @ !!!
}

function test3()
{
    a = [
        { x: 0 },
        { x: 1 },
        { x: 2 },
        { x: 3 },
    ];

    show_debug_message(a); //  [ { x : 0 },{ x : 1 },{ x : 2 },{ x : 3 } ]

    autodoublestruct(a);

    show_debug_message(a); //  [ { x : 0 },{ x : 2 },{ x : 4 },{ x : 6 } ]
}
 

samspade

Member
For JSON data, I would definitely use arrays and structs over maps and lists. I think the advantages pointed out above pretty much cover why. Also remember that both maps and structs have an accessor which greatly shortens how you access, change, and set data (although looping through maps is a bit easier than looping through structs).
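To make the accessor point concrete, here is a small sketch of the map accessor (?) next to the struct accessor ($), plus one way to loop over each. The keys and values are invented for the example.

```gml
// Sketch: map accessor versus struct accessor, and looping over both.
// Keys and values are hypothetical.
function accessor_example()
{
    var user_map = ds_map_create();
    user_map[? "name"] = "john";
    user_map[? "age"]  = 30;

    var user_struct = { name: "john", age: 30 };
    user_struct[$ "age"] = 31;

    // Looping over a map: find_first / find_next return undefined at the end.
    var key = ds_map_find_first(user_map);
    while (!is_undefined(key))
    {
        show_debug_message(string(key) + " = " + string(user_map[? key]));
        key = ds_map_find_next(user_map, key);
    }

    // Looping over a struct: fetch the variable names first.
    var names = variable_struct_get_names(user_struct);
    for (var i = 0; i < array_length(names); i++)
    {
        show_debug_message(names[i] + " = " + string(user_struct[$ names[i]]));
    }

    // Maps must be destroyed manually; the struct is cleaned up automatically.
    ds_map_destroy(user_map);
}
```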

That said, I'm not convinced structs are to the point where they really do replace maps in all cases. At least at present, for data that doesn't need to be saved out or loaded in, I still use maps for pure data storage. They're faster than structs, at present at least, and the only real downside is you have to destroy them, which is the standard tradeoff for efficiency. Also, I think for pure data storage, they have better built-in functions.

That said, I don't think maps and lists are going away anytime soon unless YoYo actually gets structs to be as stable, as fast, and with the same built-in functionality (which they might at some point - in fact, and this is pure speculation, I could see them transitioning away from all built-in data structures down the road and replacing them with new versions that don't need to be manually garbage collected).
 

TsukaYuriko

🌠
Forum Staff
Moderator
Seems like you've mostly got the uncertainties figured out, so I'll only address what you haven't addressed by yourself already.

but my user ids are not sequential.
That changes things a bit... but then you don't have access to "handy accessors to quickly find the name of the user with id 99" in the map version either, do you? At most, you can make the user ID the key, which still offers lower performance than an array. If speed is a concern, arrays are faster than maps, which are faster than structs as of writing.

If memory usage is not a concern and you're not planning to have a billion of these data sets, one approach would be to use an array and just leave indices empty if there's no data to fill them with. That would deliver the best performance, and it's also the least round-about way of storing data that's indexed by a number.
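A minimal sketch of that sparse-array idea, assuming the highest id is known in advance (max_id and the field names are hypothetical):

```gml
// Sketch: an array indexed directly by user id, with gaps left undefined.
function sparse_users_example()
{
    var max_id = 155;
    var users = array_create(max_id + 1, undefined);

    users[0]   = { name: "anonymous" };
    users[99]  = { name: "john" };
    users[155] = { name: "jim" };

    // Direct lookup by id; empty slots have to be checked for.
    var user = users[99];
    if (!is_undefined(user))
    {
        show_debug_message(user.name);
    }
}
```

The cost is one unused slot per missing id, which is why this only makes sense when the ids stay reasonably small.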

In general, there's a trade-off between performance, memory usage and ease of use when determining whether to use arrays, maps or structs. Which of these should be prioritized highly depends on the circumstances of the project and the feature these are to be used in.

Maybe my English is bad. For me, passing an ID, is like passing a pointer, so it is "by reference" : you do not duplicate the data : you work on the same data, no ?
Anyway, looking at the debugger array IDs, it seems that, as you said, the arrays keep the same ID when they are passed inside functions (like the function below).
"are passed by reference and copied on write" : I do not get this one... If you pass them by reference(pointers) then if you modify the reference, you modify the original data, no ? So what is the need of this @ accessor ? I am used to C++ and the byref / by val difference induced by the "&" symbol, and I suspect that GMS works differently and that's why I'm confused...
If it is "like" something, it "is" not something. :) But you are correct in that array references are not pointers (as confirmed by is_ptr).

Whether you can modify the original data using the passed value as a reference does not make something "pass by reference". Different languages have different definitions of what they call "pass by reference", but the general idea is that if something is passed by reference, modifying the argument - as in directly modifying the argument, not using it to index something, as is the case with map IDs - modifies the original. If something is passed by value, modifying the passed argument does not modify the original.

GML arrays are somewhat of a borderline case here, as you can either modify the original or not modify the original depending on whether you use the array accessor or not when writing to an array.
For example, if you pass an array to a function parameter arr:
arr[0] = "copy"; inside of that function creates a copy of the array you passed in, assigns the reference to this copy to arr and then sets the 0th cell of the copy (and only the copy - not the original) to "copy". The original still contains whatever it contained before it was passed to the function. Two arrays exist now - the original and the copy (which will cease to exist at the end of the function call).
arr[@ 0] = "original";, on the other hand, modifies the original array that you passed into the function. arr still holds a reference to the same array as the one you passed in. Only one array exists here, and there are two variables that hold a reference to it (the original outside of the function and arr in the function).
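The two cases above can be put side by side in a small test. This is only a sketch with hypothetical function names. What the first call does to the original array depends on the copy-on-write behaviour described above, which may differ between runtime versions and project settings, so no output is asserted here.

```gml
// Sketch of the two write styles. Function names are hypothetical.
function write_plain(arr)
{
    // Without @: under copy-on-write, this is expected to touch a copy only.
    arr[0] = "copy";
}

function write_through(arr)
{
    // With @: writes into the array that was passed in.
    arr[@ 0] = "original";
}

function array_write_demo()
{
    var data = ["untouched", 1, 2];

    write_plain(data);
    show_debug_message(data);   // inspect whether index 0 changed

    write_through(data);
    show_debug_message(data);   // index 0 should now hold "original"
}
```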

Most of the official documentation claims that arrays are "passed by reference" regardless of this, hence the terminology I use when describing this as "pass by reference and copy on write" even though it's more along the lines of "passed as a reference".
 

DaveInDev

Member
If memory usage is not a concern and you're not planning to have a billion of these data sets, one approach would be to use an array and just leave indices empty if there's no data to fill them with. That would deliver the best performance, and it's also the least round-about way of storing data that's indexed by a number.
yes that's finally what I did ! ;)

Thanks for all these clarifications. Very helpful.
 