ds_map_exists() question

Mike

nobody important
GMC Elder
Try yourself.... create an array.... do a loop, look up a load of the, Now use a ds_map, look up a load of them.
ds_maps are MUCH slower than simple arrays because it has to do LOADS of work to find each element, where arrays have the index at once.

Now if you start doing this, then things like grids and lists where you do sit in loops and process stuff will be adversely affected. I use grids a lot not just for maps etc but for data structures as they are slightly faster than arrays. if this changes, all this code will slow down hugely. My C64 emulator which grids it massively would be a third of the speed. every lookup hit by this - why? Because you can't clear a variable?

On the subject of this.... If your passing around list handles and don't check that it's free'd then your still asking for bugs, unless you're going to check it exists every time you use it of course, which is just a waste of CPU time.

And this isn't about being stuck in the past, it's about not wasting CPU time on stupid things that will cripple performance in a lot of cases. CPU performance with data structures is the ONLY reason I don't want to change them, it's ALL about performance and how much it would hurt.

.....a new graduated engineer can think out the box and improve things...
What a load of tosh. I've hired a lot of them, most of the ones we interview barely know how many bits are in an INT, or even what a "bit" is - I kid you not. There's also a reason I've been in this industry for 27 years, so try not be quite so insulting.


Actually...... it IS technically possible that we could (now) turn handles into 64bit numbers (rather than a 32bit INT), as the runner now has proper support for them (I think), and we could store the index in the lower 32bits, and a unique number in the upper 32bits. Then it would just be an AND to get the index back. I'm not sure INT64's are complete throughout the VM yet, but if they are, this might work. let me file this as a suggestion, and in the new year we'll discuss it, because it would solve the problem without crippling everything - but it is totally dependant on the VM internals.
 
I

icuurd12b42

Guest
I'm going to agree with mike on the "a new graduated engineer can think out the box and improve things..." quote...

It is my experience as well that new graduates don't have enough experience to think outside the box. they think they do because they have not seen the walls of their own box yet... These new concepts your learned in college, it's old fogies like us who invented those.
 
Last edited by a moderator:
H

HammerOn

Guest
So I'm not sure how much of a problem this is, really. @Mike seems pretty assured that the engine's reuse ds handles isn't a problem, and maybe he's right. But in that case, it makes me wonder why they even have a ds_exists function at all.
It makes perfect sense.
We allocate memory, use it, delete and set the variable to a null value. This is common practice (actually a need) even in dynamic typed languages or when the "delete" piece is automatic handled by a garbage collector.
The only cases where you don't set it to a null value is when the allocated memory lasts the lifetime of the object that holds it or when you immediately set other handle to it.
Also, we can't query the actual value of the handles in most languages and there is a chance that a new allocation would be done on top of a freed one so it makes sense that a VM can reuse them.
If all your code follow this concept, ds_exists is in fact not necessary since you can do "ds == undefined" which is how you would do in other languages.

Now that we have an undefined value, the manual could add this to the example codes and helps make it a common practice in GML too.
Ds handles could be special types like arrays to avoid further misconceptions. But I feel like GML development is too much restricted by "backward compatibility" and C-like handles for this...
 
Last edited by a moderator:

GMWolf

aka fel666
I know @Mike is claiming there's a compelling speed advantage to re-using the first available id, but I'd really like to see that proven.
In fact, according to my tests, using a hashmap with quadratic probing gives very good performance with unique identifiers. Yes, not quite as fast as using a simple table, but the advantages of using unique IDs far outweigh it IMO.

It can even lead to better performance, if you use nested data structures. When usung a list of lists, if you delete a list, as it stands, you have to iterate through the whole array to set the entries to undefined if the equal the list being destroyed. That becomes O(N) performance as opposed to O(1) performance if using unique IDs with a hashmap.
 

GMWolf

aka fel666
What a load of tosh. I've hired a lot of them, most of the ones we interview barely know how many bits are in an INT, or even what a "bit" is
Well, yeah. Number of bits in an INT is system dependant....
But if they can't tell you that, you probably haven't been looking at graduates from respectable universities ^^
 
T

Timothy

Guest
Well, yeah. Number of bits in an INT is system dependant....
But if they can't tell you that, you probably haven't been looking at graduates from respectable universities ^^
Lol, you can learn more online than you ever can at a school... Any school. Some say you can't trust online sources (which is true for some) but truthfully you can't trust instructors either. All voice opinion over fact.

As for the topic. I recently had a bug because of this and it stumped for a while but lesson was learned. Though it would be nice to have unique identifiers, I don't think it's completely necessary if it isn't able to be implemented in GMS current state. (I would much rather see simple c struct like data containers, I know, off topic)
 

Llama_Code

Member
Well, yeah. Number of bits in an INT is system dependant....
But if they can't tell you that, you probably haven't been looking at graduates from respectable universities ^^
Actually, I have seem graduates, even from respectful universities, that are taught less than I was taught 20 years ago, simply because they take for granted what they have available. Things like bit shifting are lost on these people because they are taught to just do the arithmetic since computers have enough power now so it doesent matter.

In most cases, that probably true, but I think they should still know this stuff because they are likely to encounter it (which is how we found our 4 month out of school new hire didn't understand it) and in some cases it can still be needed depending on where you code is being used.

You can bet they think they know it all, but most are taught in very controlled circumstances and the code they are working with is the same code everyone else worked with, and there is very little problem solving. They don't hit the same walls a real work developer does, and they don't have to work within the same limitations of old school developers. Imagine making your platformer and trying to cram it in to 384 kB. Most of them would look at you like your absolutely insane.

My Nephew recenty graduated from a very accredited school, and was bored most of the way through because he started coding at 8, and had self taught himself most things and made several programs before he was 15. They were teaching him very little he had not already learned himself, but still landed a job and hit some trouble because he was faced with a problem outside the realm of what even he had dealt with.

The only real way to get seasoned is to do it in real world situations, and solve the problems that slap you in your face you never knew could exist until you hit them, because they won't teach you that in school.

I will give the YoYo guys credit for one thing, no way could I take on a code base like GameMaker, that I am sure was all patched up and hacked to hell from Mark's late night coding sessions, and make what they did out of it.
 

GMWolf

aka fel666
Actually, I have seem graduates, even from respectful universities, that are taught less than I was taught 20 years ago, simply because they take for granted what they have available. Things like bit shifting are lost on these people because they are taught to just do the arithmetic since computers have enough power now so it doesent matter.

In most cases, that probably true, but I think they should still know this stuff because they are likely to encounter it (which is how we found our 4 month out of school new hire didn't understand it) and in some cases it can still be needed depending on where you code is being used.

You can bet they think they know it all, but most are taught in very controlled circumstances and the code they are working with is the same code everyone else worked with, and there is very little problem solving. They don't hit the same walls a real work developer does, and they don't have to work within the same limitations of old school developers. Imagine making your platformer and trying to cram it in to 384 kB. Most of them would look at you like your absolutely insane.

My Nephew recenty graduated from a very accredited school, and was bored most of the way through because he started coding at 8, and had self taught himself most things and made several programs before he was 15. They were teaching him very little he had not already learned himself, but still landed a job and hit some trouble because he was faced with a problem outside the realm of what even he had dealt with.

The only real way to get seasoned is to do it in real world situations, and solve the problems that slap you in your face you never knew could exist until you hit them, because they won't teach you that in school.

I will give the YoYo guys credit for one thing, no way could I take on a code base like GameMaker, that I am sure was all patched up and hacked to hell from Mark's late night coding sessions, and make what they did out of it.
I dont know what universities you are talking about. I am currently at Newcastle University. Not top ranked, but still respectable IMO. And we learn all of that. Sure, some students dont qute get it, but its still in our curriculum...

[edit] also, about using linear/quadratic probing, it does devolve a little as time goes on unless you rehash. doing more tests
 
Last edited:

Mike

nobody important
GMC Elder
Well...we were just talking about windows, and "normal" word sizes, so you'd think they'd have a stab at it - 32bit machines (at the time) and all....

Actually.... we've interviewed loads that don't even know how negative integers are stored - this is across the board, not just from one place. They just don't teach the low levels any more, it's horrible. It's also due to the fact many go in looking for a 9 to 5 job so don't play at home with ideas/demos etc. You lose so much when you're not passionate about your work and don't try things at home.

Aside from a pointer, you'll never beat a straight index. Hashing code is at least 3 or 4 times longer (if you're lucky), not to mention the all the extra memory it touches. It is dependant on how much you use them, but certain things - like my emulator, would suffer massively because all the emulated memory is held in a ds_grid, so that would cripple it I suspect. For "casual" use, you probably wouldn't notice, as the slight differences in a single lookup would be swallowed by the "app", but heavy usage would suck.
 
R

rui.rosario

Guest
Well...we were just talking about windows, and "normal" word sizes, so you'd think they'd have a stab at it - 32bit machines (at the time) and all....

Actually.... we've interviewed loads that don't even know how negative integers are stored - this is across the board, not just from one place. They just don't teach the low levels any more, it's horrible. It's also due to the fact many go in looking for a 9 to 5 job so don't play at home with ideas/demos etc. You lose so much when you're not passionate about your work and don't try things at home.

Aside from a pointer, you'll never beat a straight index. Hashing code is at least 3 or 4 times longer (if you're lucky), not to mention the all the extra memory it touches. It is dependant on how much you use them, but certain things - like my emulator, would suffer massively because all the emulated memory is held in a ds_grid, so that would cripple it I suspect. For "casual" use, you probably wouldn't notice, as the slight differences in a single lookup would be swallowed by the "app", but heavy usage would suck.
My University is not taken in a very high regard and even there they taught all those items (two's complement, bit shifting, bits in general xD, etc).

Maybe it's just the sample your job offers' attracted that are just.... "flawed" (can't think of a better word)
 

GMWolf

aka fel666
Actually.... we've interviewed loads that don't even know how negative integers are stored - this is across the board, not just from one place. They just don't teach the low levels any more, it's horrible. It's also due to the fact many go in looking for a 9 to 5 job so don't play at home with ideas/demos etc. You lose so much when you're not passionate about your work and don't try things at home.
Again, what universities? This was in our first years of study...
Though i must admit, I cant quite remember exactly what systems uses what, with 2's complements and all...

Aside from a pointer, you'll never beat a straight index. Hashing code is at least 3 or 4 times longer (if you're lucky), not to mention the all the extra memory it touches. It is dependant on how much you use them, but certain things - like my emulator, would suffer massively because all the emulated memory is held in a ds_grid, so that would cripple it I suspect. For "casual" use, you probably wouldn't notice, as the slight differences in a single lookup would be swallowed by the "app", but heavy usage would suck.
Yeah, My tests indicate a 2x loss in performance. Though im sure this can be improved greatly: The test was written in java, using tons of objects on the hashing side of things. Im sure if i where to rewrite it in C to use promitive types, the difference would drop down to about 1.5x
Its true that its not great when simply accessing a data structure, but there are scenarios where you loose a ton of performance, such as nested data structures. having to itterate through data structures to set values to undefined seems like a massive loss in time.

Also, I would like to point out that if GM used data structure types, then would we not get a massive gain in performance?
 

Alice

Darts addict
Forum Staff
Moderator
Wait, wait, let me get this straight.

The problem here is that right now data structures are stored in re-usable number indices, which makes for various oddities and such, right?

...isn't it something that will be fixed altogether when a proper type system finally gets implemented, as Russell outlined here? ^^'

I dunno, I just feel like there's a somewhat large discussion going on, over something that will eventually become outdated anyway...
 

Mike

nobody important
GMC Elder
Again, what universities? This was in our first years of study...
Though i must admit, I cant quite remember exactly what systems uses what, with 2's complements and all...

Yeah, My tests indicate a 2x loss in performance. Though im sure this can be improved greatly: The test was written in java, using tons of objects on the hashing side of things. Im sure if i where to rewrite it in C to use promitive types, the difference would drop down to about 1.5x
Its true that its not great when simply accessing a data structure, but there are scenarios where you loose a ton of performance, such as nested data structures. having to itterate through data structures to set values to undefined seems like a massive loss in time.

Also, I would like to point out that if GM used data structure types, then would we not get a massive gain in performance?
tut-tut..... binary and how data is stored should be something that's always on instant recall :D

It might do depending on how it was all put together, but again.... this would be a long way off - if ever. It's still very much a "wish" at the moment. We've been wanting it here for almost as long as we've been working here.... that and proper structures :)

This might go away over time, but that time could be a looooong time. If we can do a short term fix that keeps performance, and gives mostly what folk are after, then i have no problems with it.
 

GMWolf

aka fel666
that and proper structures
Honestly I don't understand how this isnt priority number one right now!
Structs would help GM so much! I can't believe how much time I loose using arrays and enums to emulate them!
 

Mike

nobody important
GMC Elder
yes..... but adding it is a massive task, and there's always been even more important things on the plate.
 

csanyk

Member
Wait, wait, let me get this straight.

The problem here is that right now data structures are stored in re-usable number indices, which makes for various oddities and such, right?

...isn't it something that will be fixed altogether when a proper type system finally gets implemented, as Russell outlined here? ^^'

I dunno, I just feel like there's a somewhat large discussion going on, over something that will eventually become outdated anyway...
Well that's interesting! And promising... ish... but @rwkay said it's in the indefinite future, so who knows when or if it'll really happen?

I wouldn't want to waste time on a solution to a problem for which an official fix will be forthcoming. But it'd be good to have some idea of when that might be.
 

csanyk

Member
Aside from a pointer, you'll never beat a straight index.
I'll buy this... I have no counter-argument against it, and no reason to believe it isn't true.

That said, WHY *recycle* the index, using the first available index that is not currently assigned? That's what seems crazy to me.

It would be so much better if you just auto-incremented the index counter every time a ds_*_create() returned an index. You could still have lightning fast access to the index, and the problem code that we've been talking with would not be a problem any longer:

Code:
a = ds_stack_create(); //returns index 0; index_counter++
b = a; copies 0 to b
ds_stack_destroy(a); releases index 0 
c = ds_stack_create(); returns index 1

ds_exists(a, ds_type_stack); //false; 0 is still a destroyed index; 0 was not reused.
ds_exists(b, ds_type_stack); //false; as above.
ds_exists(c, ds_type_stack); //true
True, you might have to recycle the indexes *eventually* after the game runs a really long time, but if you use int64 for the index counter, that's going to happen sometime after the sun turns into a red giant...
 

GMWolf

aka fel666
I'll buy this... I have no counter-argument against it, and no reason to believe it isn't true.

That said, WHY *recycle* the index, using the first available index that is not currently assigned? That's what seems crazy to me.

It would be so much better if you just auto-incremented the index counter every time a ds_*_create() returned an index. You could still have lightning fast access to the index, and the problem code that we've been talking with would not be a problem any longer:

Code:
a = ds_stack_create(); //returns index 0; index_counter++
b = a; copies 0 to b
ds_stack_destroy(a); releases index 0
c = ds_stack_create(); returns index 1

ds_exists(a, ds_type_stack); //false; 0 is still a destroyed index; 0 was not reused.
ds_exists(b, ds_type_stack); //false; as above.
ds_exists(c, ds_type_stack); //true
True, you might have to recycle the indexes *eventually* after the game runs a really long time, but if you use int64 for the index counter, that's going to happen sometime after the sun turns into a red giant...
That would make for a huge lookup table!
Its far better to use an array, paired with a queue.
The array would map the index to the Ds, whilst the queue would hold the index of indices to recycle. This leads to O(1) time.
Doing it your way would require searching for a free space, resulting in o(n) time....
 

Alice

Darts addict
Forum Staff
Moderator
@csanyk: I think the reason why indices should be re-used is so that the collection of structures doesn't grow too large. Imagine creating and destroying ~100 lists every step, or something. After 10 seconds on 50 FPS, you have a list with 50k entries, out of which only 100 refer to the valid structures at a time. What's worse, the more structures you create, the larger memory you use to keep that overly long list of non-reusable indices.

----------

Of course, that could be avoided by accessing structures via a hash map instead of list, so you wouldn't have list with 100 valid indices and 49900 effectively useless indices, but instead a map with 100 entries. However, I think what Mike is saying that such a hash map performs significantly poorer than a list, and in some performance-critical applications using structures to great extent it would cause a big issue.

At the same time, I see fel's point about hash map being useful in many cases, and if the map was common for all data structures, it means we would be able to tell structures apart from each other (still no ability to tell integers apart from structures, though). Perhaps that could be implemented as some sort of option (i.e. developer could choose whether a game should use high-performing structures list or strongly-typed structures map; kind of like people could choose whether to short-circuit or not) but it might be too much effort to bother.

A little quality of life upgrade I could suggest: make ds_*_destroy(...) functions return undefined. That way instead of something like:
Code:
a = ds_list_create();
...
ds_list_destroy(a);
a = undefined;
it could be instead:
Code:
a = ds_list_create();
...
a = ds_list_destroy(a);
It doesn't seem like a massive improvement (and it doesn't fix some other issues coming with clashing indices, like arbitrary JSON structures), but it does help encouraging the good practices at least a little (and aside from legacy reasons, I don't see why it should return 0 as it does now, anyway).
 

csanyk

Member
That would make for a huge lookup table!
Its far better to use an array, paired with a queue.
The array would map the index to the Ds, whilst the queue would hold the index of indices to recycle. This leads to O(1) time.
Doing it your way would require searching for a free space, resulting in o(n) time....
No, that's basically what I had in mind. I just didn't speak to the precise implemenation details.

I'm not trying to propose a fully engineered idea; I'm just stating what I feel are ideal requirements. I have no idea how the indexes for data structures are stored/tracked internally within the runtime. I just don't see how it slows the engine down to have to look up data structure zero vs. data structure one thousand, regardless of how many data structures currently exist. Accessing a particular data structure by index should always take the same amount of time, given a proper implementation (whatever that is.)

Maybe I don't fully understand some key points, but as far as I can see, it shouldn't take the CPU any longer to address byte 0 in RAM than byte 2^64-1 -- if you provide it the address, it retrieves it. By similar logic, I don't see why it should slow down the GM engine to access a data structure who's got a GUID rather than a data structure who always has the smallest available ID at the time it was created.
 

GMWolf

aka fel666
No, that's basically what I had in mind. I just didn't speak to the precise implemenation details.

I'm not trying to propose a fully engineered idea; I'm just stating what I feel are ideal requirements. I have no idea how the indexes for data structures are stored/tracked internally within the runtime. I just don't see how it slows the engine down to have to look up data structure zero vs. data structure one thousand, regardless of how many data structures currently exist. Accessing a particular data structure by index should always take the same amount of time, given a proper implementation (whatever that is.)

Maybe I don't fully understand some key points, but as far as I can see, it shouldn't take the CPU any longer to address byte 0 in RAM than byte 2^64-1 -- if you provide it the address, it retrieves it. By similar logic, I don't see why it should slow down the GM engine to access a data structure who's got a GUID rather than a data structure who always has the smallest available ID at the time it was created.
The problem here is with memory usage...
If each Id is a 32 bit integer, and you allow ids to range fro 0 to INT_MAX, then you are using all available memory just to have a table of indices! That's simply not a viable solution.

If you use a simple table, then you must reuse indices.
If you want to have unique ID's, then you cannot effectively use a simple table.

Unless you use a far more complex data structure (perhaps a heap or something would do), then those two statements hold true.

Actually, would a heap not do? Isnt that how OO languages handle their object identifiers?
 
I

icuurd12b42

Guest
It's always a question of balance between memory size and access speed (defined by the method)

you could have a 32k array each element holding a list and every time you create a list the index climbs. but what happens when you reach the last number, increase the array size to 64 k? how long until you run out? fall back to 0 and then proceed on looping to find the entry that's free? that's basically what's going on now I guess. but in reverse. is there a free spot in the allocated array? no? resize the array and put the new ds in there

a hash table for lookup.. (ds_map) whatever is the faster method that takes less memory
 

csanyk

Member
The problem here is with memory usage...
If each Id is a 32 bit integer, and you allow ids to range fro 0 to INT_MAX, then you are using all available memory just to have a table of indices! That's simply not a viable solution.

If you use a simple table, then you must reuse indices.
If you want to have unique ID's, then you cannot effectively use a simple table.

Unless you use a far more complex data structure (perhaps a heap or something would do), then those two statements hold true.

Actually, would a heap not do? Isnt that how OO languages handle their object identifiers?
I don't get it. I agree with you that what you just said is basically what I have in mind, and now you're saying there's a problem with it.

I don't get the skepticism. GUID is a thing. There have been implementations. It is a solved problem.

I don't see much point in continuing to discuss, for me personally, until I actually get to a point where I'm having problems, or until I can actually address the issue with a fix.

I gather that others are already at this point. Me, I'm glad to have gained some understanding and awareness of the issue.
 

GMWolf

aka fel666
I don't get the skepticism. GUID is a thing. There have been implementations. It is a solved problem.
As far as I'm aware, GUID or UUID is generally done with maps or heaps.

@Mike says that Maps are too slow, and that using a simple lookup table is far faster.
This is true.
However, using a simple lookup table means the table must be as large as the largest ID. Therefore, to avoid eating up all available memory, YYG decided to reuse IDs.
I agree this is a sensible solution, when using a simple lookup table. Though I do feel a map with unique ids would be nicer.

Essentially, I was disagreeing with you that both the simple lookup table and unique IDs could be used. If unique IDs are used, a simple lookup table simply uses too much memory.

Much like you however, I would prefer GM to adopt a unique ID system at the cost of performance. I was not disagreeing on that.
 

JaimitoEs

Member
And this isn't about being stuck in the past, it's about not wasting CPU time on stupid things that will cripple performance in a lot of cases. CPU performance with data structures is the ONLY reason I don't want to change them, it's ALL about performance and how much it would hurt.

What a load of tosh. I've hired a lot of them, most of the ones we interview barely know how many bits are in an INT, or even what a "bit" is - I kid you not. There's also a reason I've been in this industry for 27 years, so try not be quite so insulting.
Right, i donĀ“t want to be unsulting, sorry about that, but let me disagree with new engineer, yes, maybe they doesĀ“t know what is a bit, but can create good things in the principal languages like java, c, c#.. Why learn the begining of computing when you can focus your skills on the present and learn in the near future the principles of computing?? Are you coded GMS2 "bit to bit"?? My best friend is newly graduated , he knows what is a bit, is passionate of coding, and now, heĀ“s working with AirBus like software engineer with no experience , one of the best company creating aircrafts. Surely the company knows "he is not prepared", but this is a team, a team to teach and give lessons when new staff does not know, and maybe one day this engineer can teach a lesson of his masters. Open your mind, maybe a new staff are not productive instantly, like a new staff with his new job, the team must to give the push toward new talents.

And, what is the problem if your Knowledge are top than a new staff, you are the director to control your staff, is because this you are a CEO or another better rank than newest incorporations, control your team to do what you want and distribute the work.

Well i am nobody to tell suggestions for a company. So the better for me is run and not look back..

Again, sorry about if you feel insulted. I Know you are a big of the industrie with an a big weight over....
 
Last edited:

csanyk

Member
As far as I'm aware, GUID or UUID is generally done with maps or heaps.

@Mike says that Maps are too slow, and that using a simple lookup table is far faster.
This is true.
However, using a simple lookup table means the table must be as large as the largest ID. Therefore, to avoid eating up all available memory, YYG decided to reuse IDs.
I agree this is a sensible solution, when using a simple lookup table. Though I do feel a map with unique ids would be nicer.

Essentially, I was disagreeing with you that both the simple lookup table and unique IDs could be used. If unique IDs are used, a simple lookup table simply uses too much memory.

Much like you however, I would prefer GM to adopt a unique ID system at the cost of performance. I was not disagreeing on that.
I appreciate the clarification :) I understand better now.
 

csanyk

Member
@csanyk: I think the reason why indices should be re-used is so that the collection of structures doesn't grow too large. Imagine creating and destroying ~100 lists every step, or something. After 10 seconds on 50 FPS, you have a list with 50k entries, out of which only 100 refer to the valid structures at a time. What's worse, the more structures you create, the larger memory you use to keep that overly long list of non-reusable indices.

----------

Of course, that could be avoided by accessing structures via a hash map instead of list, so you wouldn't have list with 100 valid indices and 49900 effectively useless indices, but instead a map with 100 entries. However, I think what Mike is saying that such a hash map performs significantly poorer than a list, and in some performance-critical applications using structures to great extent it would cause a big issue.

At the same time, I see fel's point about hash map being useful in many cases, and if the map was common for all data structures, it means we would be able to tell structures apart from each other (still no ability to tell integers apart from structures, though). Perhaps that could be implemented as some sort of option (i.e. developer could choose whether a game should use high-performing structures list or strongly-typed structures map; kind of like people could choose whether to short-circuit or not) but it might be too much effort to bother.

A little quality of life upgrade I could suggest: make ds_*_destroy(...) functions return undefined. That way instead of something like:
Code:
a = ds_list_create();
...
ds_list_destroy(a);
a = undefined;
it could be instead:
Code:
a = ds_list_create();
...
a = ds_list_destroy(a);
It doesn't seem like a massive improvement (and it doesn't fix some other issues coming with clashing indices, like arbitrary JSON structures), but it does help encouraging the good practices at least a little (and aside from legacy reasons, I don't see why it should return 0 as it does now, anyway).
I'm OK with id's being recycled... eventually. I recognize that you can't use all the ram just to hold the address space of every possible id. But could they at least not re-use them *immediately*? Is there a middle ground solution, whereby we use an id, still in a lookup table, and it can't be re-apportioned for a time after being destroyed?
 

GMWolf

aka fel666
I'm OK with id's being recycled... eventually. I recognize that you can't use all the ram just to hold the address space of every possible id. But could they at least not re-use them *immediately*? Is there a middle ground solution, whereby we use an id, still in a lookup table, and it can't be re-apportioned for a time after being destroyed?
All that would do is make it harder to notice a bug!
If your code currently has bugs because of this, then changing it to have ids recycled after will only delay the bugs, not fix them.
With the current system, you have to do as mike said and set all your references to undefined. At which point it doesnt matter if ids are reused.

Even if ids are recycled later, you should still set your references to undefined.

This is, unless you can intrinsically know if a DS exists or not.
 

csanyk

Member
All that would do is make it harder to notice a bug!
Well, granted. But it would also make it less likely that a bug would cause a problem that would be noticed... so hopefully, the player could enjoy the game and never notice there was a potential bug. It cuts both ways. Although, I would agree with you, anything that makes bugs harder to find is something that should be considered very carefully.

I'm just saying that if it's inevitable that the unique id's for data structures will be exhausted at some point, rather than running out of data structures, I'd accept recycling the id's.
 

Mike

nobody important
GMC Elder
Reuse will just depend on how big the table is, chances are it's (currently) a simple loop to find one, so the 1st free one (I'd guess) is what's returned. But being a simple array, the array size will be the maximum number you've had allocated at once (give or take). So if you allocate 4, then free them, the array size will only be (around) about that number. If you have thousands, then the array size will be that. We tend not to shrink the array as this is wasted time - usually if you've allocated this amount before, you will do so again.

Finding a chunk of memory from a handle can be done in several ways. Either with a pointer, an index, or some kind of "handle/GUID". The index and the pointer will both be subject to the reuse as the if structures are ever pooled for faster allocation, then you may well get the same pointer back. Indices we know about. For true handles/GUIDs you have to do look up a table to find out the real index/pointer. This is slower as you are doing more work.

As I said earlier.... it is possible we might be able to use a 64bit INT as the handle, so the lower part is the index, and the upper part the "unique" identifier, but this will depend on the VM internals which I will with folk here discuss next year. This would solve the issue as we could throw away the unique part and just keep the index. This shouldn't really be any slower than what we have, but will need testing and a chunk of code rewritten so might take time regardless.
 

GMWolf

aka fel666
chances are it's (currently) a simple loop to find one
And thats supposed to be efficient? please, at least use a queue to keep track of ids.
Code:
///finding next id
if (queue.size > 0) {
    return queue.dequeue();
} else {
   return nextID++;
}
Code:
///Delete an id
queue.enqueue(id_to_delete);
Thats O(1) rather that O(n).


As I said earlier.... it is possible we might be able to use a 64bit INT as the handle, so the lower part is the index, and the upper part the "unique" identifier, but this will depend on the VM internals which I will with folk here discuss next year.
Thats would be great! :)
 

Mike

nobody important
GMC Elder
And thats supposed to be efficient? please, at least use a queue to keep track of ids.
Ideally it should be a stack allocater, but really you shouldn't be allocating a lot of them for this to make a massive difference. When you allocate anything, you have to assume its going deep down into memory allocation and doing all kinds of nasty things. Trust me.....a small loop to get an index will be the least of your issues.

If you've allocated tens of thousands and are freeing/allocating the 20,000th one over and over, you'll probably notice, but for normal small numbers it'll not be much.
On the odd occasion you DO run into this, a simple pool on you end would solve it. yes, it'd be nice to have internally - I don't disagree, but it's not a big enough issue "at the moment" to really warrant the time/testing to change it.
 

GMWolf

aka fel666
Unique IDs is something I feel very strongly about.
But thinking about it some more, and how i deal with memory in other languages, this isnt actually such a big deal.
Yes I got some bugs because I assumed ids where unique, but that was mostly my fault.


:)
Also I'm starting to understand what some people mean when they say OO makes for bad programmers ^^ I'm totally stuck in that way of thinking now.
 
S

Startoz

Guest
@Mike , I know it's 9 months after the discussion, but I would like to point out that a unique identifier would have the added benefit that the engine could differentiate between different kinds of data structures:
  • using ds_map_destroy() instead of ds_list_destroy() by mistake can currently destroy a "random" ds_map, because the indexes of ds_maps and ds_lists are separate (e.g. both the first ds_map and the first ds_list that you create have index 0). with a unique identifier, the function will throw an error instead of deleting a "random" structure.
  • adding a list to a map or viceversa could automatically add them as nested data structures, so functions like ds_map_add_list() would not be needed anymore
 

GMWolf

aka fel666
@Mike , I know it's 9 months after the discussion, but I would like to point out that a unique identifier would have the added benefit that the engine could differentiate between different kinds of data structures:
  • using ds_map_destroy() instead of ds_list_destroy() by mistake can currently destroy a "random" ds_map, because the indexes of ds_maps and ds_lists are separate (e.g. both the first ds_map and the first ds_list that you create have index 0). with a unique identifier, the function will throw an error instead of deleting a "random" structure.
  • adding a list to a map or viceversa could automatically add them as nested data structures, so functions like ds_map_add_list() would not be needed anymore
It's already been confirmed that YYG are looking to add actual data types for data structures (much like how arrays currently work).
Check out the roadmap.
 

Yal

šŸ§ *penguin noises*
GMC Elder
Lol, when I saw this pop up again I assumed the discussion had been going on all this time instead of the topic getting necrobumped :p

Great news about actual types, though... I've been looking forward to that~
 
Top