A
Ariak
Guest
Update:
Hey Folks,
as you all know with the introduction of layer alot has changed. New Concepts pose new challenges. You can read up on layers in this guide @MaddeMichael .
A very major change is how depth ordering works with GMS2. The old depth=-y trick still works, but its far from ideal. The new layers also have depth, meaning your objects could dissappear behind them. Also every depth level creates a new sublayer. This layer is self-managed and will be deleted once there are no objects at this layer. @Mike told us that GMS 1.4 works with priority lists to sort objects and manage depth, and that this was causing immense slowdown the graphics pipeline.
As a consequence the new layers speed everything up in the graphics pipeline. On the flipside, this means that isometric games and RPG / Zelda-likes face a bit of a conundrum: we have to create a depth ordering system ourselves.
I'm going to analyze different approaches. 2 have already been suggested. I want to showcase a third method.
All follow the same principle. Grab all the id's of the instances we want to depth order, once we've got them all sort and draw them in the desired order with a loop. Look at the Dungeon Demo provided with GMS2 to see how this is achieved. I will be focusing on the more technical details here, namely which method to use under which circumstances.
Nested List within Grids
Looking at the demo I was stumped to see that a grid was used which height was that of every single pixel in the room along the y axis for depth ordering. Each entry was then assigned a list. This means a 4096px high room has 1 grid + 4096 list entries. This seemed extremely wasteful in terms of memory. However, we completely forgoe the need to sort our data structures! Simply grab the y-coordinate of our instance, reference that grid entry grid[#0,y] and add our instances id to that list. As all entries will by default have the same y value this is extremely efficient. Later we simply loop our grid, check if a list is empty, and if not we grab all those instances and draw them. Afterwards we clear the list. Neat.
The downside to this approach is that it does not allow for negative values. However it is extremely fast when using many objects at once. If we only loop grid entries that actually fall within the view area we can optimize this even further.
Priority Queues
Suggested by @MaddeMichael in the aforementioned tutorial on layers, we could cut down on tons of memory by simply having a single priority list. Priotity lists work akin to regular list, but they contain a second entry. Think the inverse of a 2D array. 2 points (x,y) to reference one value - vs priorities: 1D, but contains 2 values. A value and a priority. Priority Queues handle sorting for us. The idea is thus simple. Take all our objects, add them to the priority - the value being the ID and the priority being the depth. Loop + read all values from a single sorted data structure. Brilliantly simple, were it not for the fact that the lookup speeds for priorities are ... atrocious.
On the plusside we circumvent an nested grid-list data structure and we can automatically skip over all "empty" lists / y-coordinates with no objects. Also we dont get an "grid index out of bounds error" when instances leave the room and have negative y coordinates!
On the other hand reading is so slow that this system quickly falls apart. Add more than ~75 instances for ordered drawing of your lovely pixel art game and the grid-list approach would have been faster. Whatsmore, reading from priority queues will pop the value: read + delete. This prevents us from creating cool effects like the silhouette - also shown in the YoYo Dungeon Demo for GMS2.
Combining the best of both worlds: Binary Lists.
The idea is exactly the same as a priority queue. However instead we will be using binary operators to store 2 values in one, which we can then add to a single ds_list for sorting. If you're not familiar with binary operators read up on them here.
The value added to the list is as follows:
We take our id and then we OR in our y, which is our depth with the old depth=-y trick. Essentially the first 32bits contain our instance id, whilst the next 32 bits contain our depth. Once all instances are added we can sort them with ds_list_sort by depth, as this will be the greater value. This approach also sorts by instance_id number should multiple instances exists with the same y value. Sorting in ascending order and looping through the list from 0 should results in the desired effect when drawing.
Reading an ID from the list is done as follows:
We AND out the 32 bits making up the depth, what remains is our object ID, except its now sorted! We can now proceed as we always do and loop+draw.
Why the best of both worlds? We can now store + sort everything in one data structure, whilst allowing for negative depth values. Further reading values will not clear them enabling silhouette effects etc.
Performance / Speed Comparison
Inherently Priotities and the Binary List follow exactly the same concept. As we add all instances to a single data structure we will have to sort them at some point. Be that whilst adding with priorities, or only once when our binary list is complete. This does not scale very well with instance count. More instances, more possibilities, more lookups required to sort.
The Nested-Grid approach on the other hand is slowed down initially - as no matter how many instances exist - it will have to loop all its grid entries (at least those within the view) and check whether its sublists contain entries. These however do not need to be sorted, as established earlier. This results in a higher "base load", but also incredible scalability.
I'm going to show three Scenarios for different resolutions. Pixel Art Games are fine with low resolutions, thus they also do not need as many list entries in the height grid. This means less lookups per step, regardless of instance count and thus a lower "base load".
In each scenario we will increase the instance count from 25 to 1000. As expected the results show a clear trend.
Measurements were taken for 15 seconds before reseting and increasing the instance count.
Scenario 1: 1080p with 2*64px margin
Scenario 2: 540p with 2*32px margin
Scenario 3: 270p with 2*16px margin
Code
Results
As we can see from the graphs the higher the resolution the better the binary list approach - lower resolution favor the nested grid-list approach.
@ 1080px height binary lists draw even wtih grids at around 375 depth ordered instances.
@ 540px height break even is roughly 250 depth sorted instances
@ 270px height the balance is struck just short of 175 depth ordered instances.
The standard priority list is out of the picture. The binary list simply scales better, and is also more efficient to begin with. However in the lower ranges, which are for more likely, the binary list is also consistenly outperforming the nested list-grid approach by a significant margin! Many object do not need depth ordering to boot, with the new layer system and the remainder will seldomly exceed 250! During a playthrough the instance count can vary wildly - but there are certain limits. If your are certain you need to accomodate an extreme amount of depth odered objects than the grid-list approach is the way to go! If you can be fairly certain this is not the case then definetly go with a binary list approach.
Final remarks
Once we have an instances id ready for drawing we have three options. In order of speed (fastest to slowest):
draw_sprite>>with(o){draw_self()}>>draw_sprite_ext. The latter two are extremely close and both allow for rotation, alpha, blending and scaling. Again this depends on what you are going for. There is no optimal solution.
On why we use an 64bit number for the binary list approach. An instances ID begins with 100000 and can thus not be stored in an 16bit integer. Furthermore adding and deleting instances at runtime does not reset this counter. Meaning it will continually grow. if we used 17 bit values we could store 2^17-100k =~ 31k instances before our method fails and the id will be returned corrupted. The longer we play the more likely this is to happen. Increasing this value will in turn only decrease the possibilites to store our y/depth value in the remaining bits. A balance could be struck at say 19bits for the id, and the remainder for y. I simply chose to add more padding and allow for most eventualities.
I hope you found this insightful and that it will help you with depth ordering in GMS2.
Happy Coding!
Edit: Fixed a major bug. Consequently reran the tests. Also included @rui.rosario idea to store the ID in a seperate list and pass the index of the list entry into the binaryList with the depth for sorting. This also enables 32bit whilst avoiding unfathomably unlikely ID errors, caused by instances id exceeding 32bits. See "Binary ID".
- I would like to point out that I wrote a techblog on z-tilting, a more advanced technique for 2.5d depth sorting. Check it out here: https://www.yoyogames.com/blog/458/z-tilting-shader-based-2-5d-depth-sorting
- @MirthCastle came up with an easy solution to replace depth=-y with layers. Long term GM users will feel right at home with his system and I recommend checking it out! https://forum.yoyogames.com/index.php?threads/new-2-5d-fake-3d-depth-sorting-grid-layers.50250/
Hey Folks,
as you all know with the introduction of layer alot has changed. New Concepts pose new challenges. You can read up on layers in this guide @MaddeMichael .
A very major change is how depth ordering works with GMS2. The old depth=-y trick still works, but its far from ideal. The new layers also have depth, meaning your objects could dissappear behind them. Also every depth level creates a new sublayer. This layer is self-managed and will be deleted once there are no objects at this layer. @Mike told us that GMS 1.4 works with priority lists to sort objects and manage depth, and that this was causing immense slowdown the graphics pipeline.
As a consequence the new layers speed everything up in the graphics pipeline. On the flipside, this means that isometric games and RPG / Zelda-likes face a bit of a conundrum: we have to create a depth ordering system ourselves.
I'm going to analyze different approaches. 2 have already been suggested. I want to showcase a third method.
All follow the same principle. Grab all the id's of the instances we want to depth order, once we've got them all sort and draw them in the desired order with a loop. Look at the Dungeon Demo provided with GMS2 to see how this is achieved. I will be focusing on the more technical details here, namely which method to use under which circumstances.
Nested List within Grids
Looking at the demo I was stumped to see that a grid was used which height was that of every single pixel in the room along the y axis for depth ordering. Each entry was then assigned a list. This means a 4096px high room has 1 grid + 4096 list entries. This seemed extremely wasteful in terms of memory. However, we completely forgoe the need to sort our data structures! Simply grab the y-coordinate of our instance, reference that grid entry grid[#0,y] and add our instances id to that list. As all entries will by default have the same y value this is extremely efficient. Later we simply loop our grid, check if a list is empty, and if not we grab all those instances and draw them. Afterwards we clear the list. Neat.
The downside to this approach is that it does not allow for negative values. However it is extremely fast when using many objects at once. If we only loop grid entries that actually fall within the view area we can optimize this even further.
Priority Queues
Suggested by @MaddeMichael in the aforementioned tutorial on layers, we could cut down on tons of memory by simply having a single priority list. Priotity lists work akin to regular list, but they contain a second entry. Think the inverse of a 2D array. 2 points (x,y) to reference one value - vs priorities: 1D, but contains 2 values. A value and a priority. Priority Queues handle sorting for us. The idea is thus simple. Take all our objects, add them to the priority - the value being the ID and the priority being the depth. Loop + read all values from a single sorted data structure. Brilliantly simple, were it not for the fact that the lookup speeds for priorities are ... atrocious.
On the plusside we circumvent an nested grid-list data structure and we can automatically skip over all "empty" lists / y-coordinates with no objects. Also we dont get an "grid index out of bounds error" when instances leave the room and have negative y coordinates!
On the other hand reading is so slow that this system quickly falls apart. Add more than ~75 instances for ordered drawing of your lovely pixel art game and the grid-list approach would have been faster. Whatsmore, reading from priority queues will pop the value: read + delete. This prevents us from creating cool effects like the silhouette - also shown in the YoYo Dungeon Demo for GMS2.
Combining the best of both worlds: Binary Lists.
The idea is exactly the same as a priority queue. However instead we will be using binary operators to store 2 values in one, which we can then add to a single ds_list for sorting. If you're not familiar with binary operators read up on them here.
The value added to the list is as follows:
Code:
binlist[| i]=(id|y<<32);
Reading an ID from the list is done as follows:
Code:
var oID=binlist[| i]&$ffffffff;
Why the best of both worlds? We can now store + sort everything in one data structure, whilst allowing for negative depth values. Further reading values will not clear them enabling silhouette effects etc.
Performance / Speed Comparison
Inherently Priotities and the Binary List follow exactly the same concept. As we add all instances to a single data structure we will have to sort them at some point. Be that whilst adding with priorities, or only once when our binary list is complete. This does not scale very well with instance count. More instances, more possibilities, more lookups required to sort.
The Nested-Grid approach on the other hand is slowed down initially - as no matter how many instances exist - it will have to loop all its grid entries (at least those within the view) and check whether its sublists contain entries. These however do not need to be sorted, as established earlier. This results in a higher "base load", but also incredible scalability.
I'm going to show three Scenarios for different resolutions. Pixel Art Games are fine with low resolutions, thus they also do not need as many list entries in the height grid. This means less lookups per step, regardless of instance count and thus a lower "base load".
In each scenario we will increase the instance count from 25 to 1000. As expected the results show a clear trend.
Measurements were taken for 15 seconds before reseting and increasing the instance count.
Scenario 1: 1080p with 2*64px margin
Scenario 2: 540p with 2*32px margin
Scenario 3: 270p with 2*16px margin
Code
Edit: I'm merely bitshifting by 20 for the binary list in this code snippet. As i've shown in an answer below whether we shift by 16,20 or 32% has minimal impact on performance. 32 was somehow even faster...
9th Sept 2018: The picture was deleted by the filehost and sadly I can't find the project anymore.
9th Sept 2018: The picture was deleted by the filehost and sadly I can't find the project anymore.
Results
As we can see from the graphs the higher the resolution the better the binary list approach - lower resolution favor the nested grid-list approach.
@ 1080px height binary lists draw even wtih grids at around 375 depth ordered instances.
@ 540px height break even is roughly 250 depth sorted instances
@ 270px height the balance is struck just short of 175 depth ordered instances.
The standard priority list is out of the picture. The binary list simply scales better, and is also more efficient to begin with. However in the lower ranges, which are for more likely, the binary list is also consistenly outperforming the nested list-grid approach by a significant margin! Many object do not need depth ordering to boot, with the new layer system and the remainder will seldomly exceed 250! During a playthrough the instance count can vary wildly - but there are certain limits. If your are certain you need to accomodate an extreme amount of depth odered objects than the grid-list approach is the way to go! If you can be fairly certain this is not the case then definetly go with a binary list approach.
Final remarks
Once we have an instances id ready for drawing we have three options. In order of speed (fastest to slowest):
draw_sprite>>with(o){draw_self()}>>draw_sprite_ext. The latter two are extremely close and both allow for rotation, alpha, blending and scaling. Again this depends on what you are going for. There is no optimal solution.
On why we use an 64bit number for the binary list approach. An instances ID begins with 100000 and can thus not be stored in an 16bit integer. Furthermore adding and deleting instances at runtime does not reset this counter. Meaning it will continually grow. if we used 17 bit values we could store 2^17-100k =~ 31k instances before our method fails and the id will be returned corrupted. The longer we play the more likely this is to happen. Increasing this value will in turn only decrease the possibilites to store our y/depth value in the remaining bits. A balance could be struck at say 19bits for the id, and the remainder for y. I simply chose to add more padding and allow for most eventualities.
I hope you found this insightful and that it will help you with depth ordering in GMS2.
Happy Coding!
Edit: Fixed a major bug. Consequently reran the tests. Also included @rui.rosario idea to store the ID in a seperate list and pass the index of the list entry into the binaryList with the depth for sorting. This also enables 32bit whilst avoiding unfathomably unlikely ID errors, caused by instances id exceeding 32bits. See "Binary ID".
Last edited by a moderator: