[SOLVED] Optimizing Room with Thousands of Objects

GrandFree

Guest
I've got a room with about 3000 objects that are marked as optimizable (par_optimize).

So I've divided my room into chunks by creating a "chunk" object for each chunk (obj_optimize_chunk). Each chunk size is the same size as a view.

Each chunk object populates its lists of data for optimizable objects (par_optimize) that are within the chunk, and then if a view comes into contact with the chunk, it's considered active.

If a chunk object is active, it loops through its list of optimizable instances. Any instance that is within a view and currently deactivated is activated; any instance that isn't in any view and is currently activated is deactivated. I'm also making sure that instances which overlap multiple chunks aren't processed again if they were already processed in the same frame.

It's **technically** working in the sense that it's deactivating/activating the correct number of instances. But it's not working in the sense that the FPS is dropping to extremely low levels. It's even worse when there are 4 views active and they're all near a large number of optimizable objects, or if they're spread over entirely separate chunks.

If I leave all optimizable objects deactivated, I'll get around 1000 - 2000 FPS.

If I use this method and I only have 1 view active and only 300 objects are active, my FPS drops to around 80.

If I use this method and I have 4 views active and only 300 objects are active, my FPS drops to around 20. It gets to around 10 if the views are all in completely different chunks spaced far apart.

So yeah I'm honestly not sure what's slowing it down seeing as there aren't that many par_optimize objects even active.

Here's where I populate the chunk list data (cornerX and cornerY refer to the top-left corner of par_optimize's sprite, so that any part of the sprite counts when checking whether it's inside a chunk or view):

obj_optimize_chunk Create Event:

Code:
// List of par_optimize objects in the chunk to avoid duplicates
optimizesID = ds_list_create();
optimizesDeactivated = ds_list_create();
optimizesRealX = ds_list_create();
optimizesRealY = ds_list_create();
optimizesWidth = ds_list_create();
optimizesHeight = ds_list_create();

checkAnyChunks = true; // Used to deactivate currently active par_optimize instances on the first frame in case they aren't within the active chunk or any views

var thisChunk = id;

// Go through all par_optimize instances
with(par_optimize) {
    var optimizeID = id;

    // If the sprite rectangle is inside of the chunk then we add its data to the relevant chunk lists
    if(rectangle_in_rectangle(cornerX, cornerY, cornerX+abs(sprite_width), cornerY+abs(sprite_height), thisChunk.x, thisChunk.y, thisChunk.x+thisChunk.width, thisChunk.y+thisChunk.height) > 0) {
        ds_list_add(thisChunk.optimizesID, optimizeID); // add its ID
        ds_list_add(thisChunk.optimizesDeactivated, false); // add whether or not it's deactivated
        ds_list_add(thisChunk.optimizesRealX, cornerX); // add the top left corner of its sprite x value
        ds_list_add(thisChunk.optimizesRealY, cornerY); // add the top left corner of its sprite y value
        ds_list_add(thisChunk.optimizesWidth, abs(sprite_width)); // add its sprite width
        ds_list_add(thisChunk.optimizesHeight, abs(sprite_height)); // add its sprite height
    }

}

obj_optimize_chunk Begin Step Event:

Code:
activeChunk = false;

for(var v = 0; v < global.split_screens; v++) {
   // If this chunk object crosses any views, set it to active
   if(!activeChunk && rectangle_in_view(x, y, x+width, y+height, v, 0)) {
       activeChunk = true;
       break;
   }
       
}
       
// checkAnyChunks is set to true initially and then set to false here
// to deactivate objects outside the view(s), or else they won't ever be deactivated
// until you reach those chunks in your view(s)   
if(activeChunk || checkAnyChunks) {
   checkAnyChunks = false;
   var optimizesSize = ds_list_size(optimizesID);
       
   // Loop through the instance data for this chunk
   for(var i = 0; i < optimizesSize; i++) {           
       var opID = ds_list_find_value(optimizesID, i);           
               
       var handledAlready = false;
           
       // Check if the optimized instance ID is already in a handled list in par_controller
       with(par_controller) {
           handledAlready = (ds_list_find_index(optimizerHandled, opID) != -1);
       }
               
       // Another chunk has already processed this instance data for this frame, move on to the next
       if(handledAlready) continue;
           
       // Get the instance data
       var deactivated = ds_list_find_value(optimizesDeactivated, i);
       var opX = ds_list_find_value(optimizesRealX, i);
       var opY = ds_list_find_value(optimizesRealY, i);
       var opW = ds_list_find_value(optimizesWidth, i);
       var opH = ds_list_find_value(optimizesHeight, i);
           
       var inAnyView = false;
           
       // Check if the instance's rectangle (x,y, x+width, y+height) is inside any views
       for(var v = 0; v < global.split_screens; v++) {
           if(rectangle_in_view(opX, opY, opX+opW, opY+opH, v, 0)) {
               inAnyView = true;
               v = global.split_screens;
           }
       }
           
       // If the instance is in any views...
       if(inAnyView) {
           // If it's deactivated and inside any view, reactivate it
           if(deactivated) {
               instance_activate_object(opID);
               ds_list_replace(optimizesDeactivated, i, false);
           }
       }
       // If the instance is not in any views...
       else {
           // If the instance is activated, deactivate it
           if(!deactivated) {
               instance_deactivate_object(opID);
               ds_list_replace(optimizesDeactivated, i, true);
           }
       }
           
       // Mark this instance data as handled for this frame
       // par_controller clears its optimizerHandled list in the End Step event
       with(par_controller) {ds_list_add(optimizerHandled, opID);}
           
       
   }
           
}

par_controller Create Event:

Code:
optimizerHandled = ds_list_create(); // Temporary list containing handled optimized instances for a frame

par_controller End Step Event:

Code:
ds_list_clear(optimizerHandled); // Clear the list, gets reset on every new frame

Why is my FPS dropping to such low levels despite the fact that there are a small number of objects (par_optimize) instances?
 

Amon

Member
It could be the overhead of using lists and iterating each list within the step event.
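
One list cost worth checking here: ds_list_find_index searches the list linearly, so the "already handled" test gets slower as more instances are handled in a frame. A minimal sketch of a keyed lookup instead, assuming a ds_map named optimizerHandledMap (a hypothetical name, not from the original code) replaces the optimizerHandled list on par_controller:

```gml
// par_controller Create Event
optimizerHandledMap = ds_map_create(); // hypothetical replacement for optimizerHandled

// obj_optimize_chunk Begin Step Event (inside the activeChunk/checkAnyChunks guard)
var handledMap = par_controller.optimizerHandledMap; // read once, outside the loop
var optimizesSize = ds_list_size(optimizesID);

for (var i = 0; i < optimizesSize; i++) {
    var opID = ds_list_find_value(optimizesID, i);

    // Keyed lookup instead of a linear ds_list_find_index search
    if (ds_map_exists(handledMap, opID)) continue;
    ds_map_add(handledMap, opID, true);

    // ... view test and activate/deactivate exactly as in the original loop ...
}

// par_controller End Step Event
ds_map_clear(optimizerHandledMap); // reset for the next frame
```

Whether this is the main bottleneck is something only the profiler can confirm; it only removes the repeated list search.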
 

Simon Gust

Member
I'd say there is definitely some slowdown from this being done every step.

I can give you some tips though:
- Deactivation / activation functions are slow by nature, so don't call them every step.

- Dot operators / with statements are also slow by nature.
Since you have a lot of both inside big loops, it's going to wreck your performance.
What you should do is write the data out to local variables first.

This code
Code:
// List of par_optimize objects in the chunk to avoid duplicates
optimizesID = ds_list_create();
optimizesDeactivated = ds_list_create();
optimizesRealX = ds_list_create();
optimizesRealY = ds_list_create();
optimizesWidth = ds_list_create();
optimizesHeight = ds_list_create();

checkAnyChunks = true; // Used to deactivate currently active par_optimize instances on the first frame in case they aren't within the active chunk or any views

var thisChunk = id;

// Go through all par_optimize instances
with(par_optimize) {
   var optimizeID = id;

   // If the sprite rectangle is inside of the chunk then we add its data to the relevant chunk lists
   if(rectangle_in_rectangle(cornerX, cornerY, cornerX+abs(sprite_width), cornerY+abs(sprite_height), thisChunk.x, thisChunk.y, thisChunk.x+thisChunk.width, thisChunk.y+thisChunk.height) > 0) {
       ds_list_add(thisChunk.optimizesID, optimizeID); // add its ID
       ds_list_add(thisChunk.optimizesDeactivated, false); // add whether or not it's deactivated
       ds_list_add(thisChunk.optimizesRealX, cornerX); // add the top left corner of its sprite x value
       ds_list_add(thisChunk.optimizesRealY, cornerY); // add the top left corner of its sprite y value
       ds_list_add(thisChunk.optimizesWidth, abs(sprite_width)); // add its sprite width
       ds_list_add(thisChunk.optimizesHeight, abs(sprite_height)); // add its sprite height
   }

}
becomes this
Code:
// List of par_optimize objects in the chunk to avoid duplicates
optimizesID = ds_list_create();
optimizesDeactivated = ds_list_create();
optimizesRealX = ds_list_create();
optimizesRealY = ds_list_create();
optimizesWidth = ds_list_create();
optimizesHeight = ds_list_create();

checkAnyChunks = true; // Used to deactivate currently active par_optimize instances on the first frame in case they aren't within the active chunk or any views

var thisChunk = id; // still needed: the rectangle test below reads the chunk's position and size
var oID = optimizesID;
var oD = optimizesDeactivated;
var oRX = optimizesRealX;
var oRY = optimizesRealY;
var oW = optimizesWidth;
var oH = optimizesHeight;

// Go through all par_optimize instances
with(par_optimize) {
   var optimizeID = id;

   // If the sprite rectangle is inside of the chunk then we add its data to the relevant chunk lists
   if(rectangle_in_rectangle(cornerX, cornerY, cornerX+abs(sprite_width), cornerY+abs(sprite_height), thisChunk.x, thisChunk.y, thisChunk.x+thisChunk.width, thisChunk.y+thisChunk.height) > 0) {
       ds_list_add(oID, optimizeID); // add its ID
       ds_list_add(oD, false); // add whether or not it's deactivated
       ds_list_add(oRX, cornerX); // add the top left corner of its sprite x value
       ds_list_add(oRY, cornerY); // add the top left corner of its sprite y value
       ds_list_add(oW, abs(sprite_width)); // add its sprite width
       ds_list_add(oH, abs(sprite_height)); // add its sprite height
   }
}
Not that it really matters in the Create event though.

ds_lists can be accessed from anywhere as long as you have their ID.
To get a list from another object, instead of using a with statement or a dot operator inside the loop, you can read the list ID into a local variable once before the main loop.
Code:
var controller_list = par_controller.optimizerHandled;
for(var i = 0; i < optimizesSize; i++) {           
      var opID = ds_list_find_value(optimizesID, i);
      var handledAlready = (ds_list_find_index(controller_list, opID) != -1);
}
instead of having to use a with statement just for one object.
And if you really need to reference another object but can't write the data to local variables first, keep in mind that, in my tests, a with statement becomes more efficient than dot operators after about 10 variable lookups in the other object.
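
Applied to the Begin Step loop from the first post, the same idea could look roughly like this. It is only a sketch: it assumes activeChunk has already been computed as in the original event, and rectangle_in_view is the original poster's own script, used here with the same arguments as in their code.

```gml
// obj_optimize_chunk Begin Step Event (sketch, after activeChunk is computed)
if (activeChunk || checkAnyChunks) {
    checkAnyChunks = false;

    // Read everything the loop needs once, before the loop
    var handled = par_controller.optimizerHandled; // single dot lookup
    var views = global.split_screens;
    var count = ds_list_size(optimizesID);

    for (var i = 0; i < count; i++) {
        var opID = ds_list_find_value(optimizesID, i);

        // Skip instances another chunk already processed this frame
        if (ds_list_find_index(handled, opID) != -1) continue;
        ds_list_add(handled, opID);

        var opX = ds_list_find_value(optimizesRealX, i);
        var opY = ds_list_find_value(optimizesRealY, i);
        var opW = ds_list_find_value(optimizesWidth, i);
        var opH = ds_list_find_value(optimizesHeight, i);

        // Is the instance's rectangle inside any view?
        var inAnyView = false;
        for (var v = 0; v < views; v++) {
            if (rectangle_in_view(opX, opY, opX + opW, opY + opH, v, 0)) {
                inAnyView = true;
                break;
            }
        }

        // Only call the activation functions when the state actually changes
        var deactivated = ds_list_find_value(optimizesDeactivated, i);
        if (inAnyView && deactivated) {
            instance_activate_object(opID);
            ds_list_replace(optimizesDeactivated, i, false);
        } else if (!inAnyView && !deactivated) {
            instance_deactivate_object(opID);
            ds_list_replace(optimizesDeactivated, i, true);
        }
    }
}
```

The behaviour is meant to match the original loop; the only change is that no with statement or dot operator runs inside it.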

Now, to suggest some other methods, I have 2 in mind.
My own, see here https://forum.yoyogames.com/index.php?threads/efficient-instance-activation-deactivation.40822/
It does not use chunks at all and it's kind of wacky.

Or you can use the straightforward method of
"deactivate everything that just went out of view and activate everything that just went into the view".
It's not easy on the math but it's super efficient.
 
GrandFree

Guest
Amon said:
It could be the overhead of using lists and iterating each list within the step event.

Would there be that much of a performance boost if I used arrays instead? Also I've now changed it so it's not done every step.

Simon Gust said:
I'd say there is definitely some slowdown from this being done every step.

Thanks a lot for your tips!

So I've swapped the with statements for local variables and made sure the Begin Step code only runs every 5 frames or so. I've also reduced the number of instances being processed by about 80%, because there were some "inner room" instances that didn't need to be processed if no one was in any of those rooms. This gave a huge performance boost... or so I thought.
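
For readers following along, running the check only every few frames can be sketched with a simple counter. The variable names optimizeInterval and optimizeTimer are hypothetical, not taken from the original project:

```gml
// obj_optimize_chunk Create Event
optimizeInterval = 5; // hypothetical: run the check once every 5 steps
optimizeTimer = 0;

// obj_optimize_chunk Begin Step Event
optimizeTimer++;
if (optimizeTimer < optimizeInterval) exit; // skip this step entirely
optimizeTimer = 0;

// ... chunk/view test and activation logic from the original event ...
```

If the chunks are all created at room start, their counters stay in sync, so the per-frame "handled" list on par_controller still sees every chunk in the same step.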

I'm getting around 300 frames on fps_real, but my actual FPS is dropping below the room speed.

I've tried sleep margins of 1, 10, 15, 16, and 20, and none of them seems to make a difference. I'm not sure why my FPS is going below 60 (the room speed).
If I set my room speed to 30, it also drops, to around 24-25 FPS.

Any idea?

Also, are you saying with() is slower than the dot operator or faster?
 

Simon Gust

Member
You can try out the debugger and see what exactly is eating your performance in the end.

It depends on which version of GameMaker and which compiler you use.
For me, dot operators are worth it on a single object until I have more than about 10 lookups, at which point a with statement costs less time.

The important thing, however, is that you try to keep both of these methods outside big loops.
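
Besides the profiler, a quick way to compare two variants on a given setup is to time the block directly with get_timer(), which returns microseconds since the game started. A minimal sketch, with a placeholder for the code being measured:

```gml
var t0 = get_timer(); // microseconds since the game started

// ... code being measured, e.g. the chunk update loop ...

var elapsed = get_timer() - t0;
show_debug_message("chunk update took " + string(elapsed) + " us");
```

Single measurements are noisy, so it is worth comparing the output over many frames before drawing conclusions.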
 
GrandFree

Guest
Simon Gust said:
You can try out the debugger and see what exactly is eating your performance in the end.

It depends on which version of GameMaker and which compiler you use.
For me, dot operators are worth it on a single object until I have more than about 10 lookups, at which point a with statement costs less time.

The important thing, however, is that you try to keep both of these methods outside big loops.

Got it, but the thing I don't understand is: why is my FPS going below 60 when fps_real is well above it? I'm getting around 300 FPS on fps_real.
 

Simon Gust

Member
GrandFree said:
Got it, but the thing I don't understand is: why is my FPS going below 60 when fps_real is well above it? I'm getting around 300 FPS on fps_real.

I tried looking around the forum for why that is, but I can't seem to find a direct and clear answer.
I can say, though, that it is like that for everyone: fps_real is always higher than fps.

Well, instead of trying to optimize like that, you could also tell us what exactly those 3000 instances are and what they're used for.
Do they really need to be instances? Are they just graphics?
 

rIKmAN

Member
GrandFree said:
Would there be that much of a performance boost if I used arrays instead?

Arrays should definitely give you a speed boost over lists, and as Simon said (no pun intended), use the debugger/profiler to see where it's choking; that will give you some insight into where you need to optimise.
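
As a rough illustration of the array variant, the chunk's Create event could fill local arrays inside the with block and store them on the chunk afterwards. This is a sketch under assumptions: the names optimizeCount, optimizeIDs and so on are hypothetical, and the count is tracked in its own variable so no version-specific array length function is needed.

```gml
// obj_optimize_chunk Create Event (sketch using arrays instead of ds_lists)
var n = 0;
var ids, xs, ys, ws, hs, flags;
ids[0] = noone; xs[0] = 0; ys[0] = 0; ws[0] = 0; hs[0] = 0; flags[0] = false;

// Cache the chunk rectangle in locals so the with block needs no dot operators
var cx1 = x;
var cy1 = y;
var cx2 = x + width;
var cy2 = y + height;

with (par_optimize) {
    var w = abs(sprite_width);
    var h = abs(sprite_height);

    // Same overlap test as the original, against the cached chunk rectangle
    if (rectangle_in_rectangle(cornerX, cornerY, cornerX + w, cornerY + h, cx1, cy1, cx2, cy2) > 0) {
        ids[n] = id;
        xs[n] = cornerX;
        ys[n] = cornerY;
        ws[n] = w;
        hs[n] = h;
        flags[n] = false; // not deactivated yet
        n++;
    }
}

// Store the results on the chunk instance
optimizeCount = n;
optimizeIDs = ids;
optimizeX = xs;
optimizeY = ys;
optimizeW = ws;
optimizeH = hs;
optimizeDeactivated = flags;
```

The step code would then loop from 0 to optimizeCount - 1 and index the arrays directly, for example optimizeIDs[i]. How much this gains over ds_lists depends on the runtime, so it is worth measuring.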
 

Smiechu

Member
GrandFree said:
Got it, but the thing I don't understand is: why is my FPS going below 60 when fps_real is well above it? I'm getting around 300 FPS on fps_real.

Because fps_real refers only to the CPU load and doesn't consider what is happening in the graphics pipeline.
Additionally, fps is an average, so you need to consider that your code can generate load "spikes". For example, if every 140 frames something big happens in the code and it takes as long as 4-5 frames, then the (average) fps will drop.

The only way to find out is the debugger; it will show you what is happening. In most cases the graphics pipeline is to blame.
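
To check for the kind of spike described above, one option is to log the longest frame in a window next to fps and fps_real. A minimal sketch, assuming a hypothetical debug object; delta_time is the time since the previous frame in microseconds:

```gml
// Create Event of a hypothetical debug object
worstFrame = 0; // longest frame in the current window, in microseconds
frameCount = 0;

// Step Event
worstFrame = max(worstFrame, delta_time);
frameCount++;

if (frameCount >= 120) {
    show_debug_message("fps: " + string(fps)
        + ", fps_real: " + string(fps_real)
        + ", worst frame: " + string(worstFrame / 1000) + " ms");
    worstFrame = 0;
    frameCount = 0;
}
```

A worst frame far above the frame budget (about 16.7 ms at room speed 60) while fps_real stays high would point to occasional spikes rather than a steady load.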
 
GrandFree

Guest
Smiechu said:
Because fps_real refers only to the CPU load and doesn't consider what is happening in the graphics pipeline.
Additionally, fps is an average, so you need to consider that your code can generate load "spikes". For example, if every 140 frames something big happens in the code and it takes as long as 4-5 frames, then the (average) fps will drop.

The only way to find out is the debugger; it will show you what is happening. In most cases the graphics pipeline is to blame.

I've opened the debugger and the only thing I can find relating to graphics is the "Graphics" tab, but nothing gets displayed there.

However, from profiling under "Others", it doesn't seem as though my draw events are taking up a large portion of the (CPU?) percentages at all. I don't believe I'm doing anything GPU-intensive.
 
GrandFree

Guest
The second link references something about clicking the refresh button but alas it's greyed out. Only thing I'm able to use is the profiler.

But alas, I'm not doing anything GPU-intensive at all, as also shown when using show_debug_overlay.

I've gotten it to go down to about 20-40 optimized instances activated at a time on average.

You can try out the
Code:
show_debug_overlay(true);
And then read this page
https://docs.yoyogames.com/source/dadiospice/002_reference/debugging/show_debug_overlay.html
I'm honestly not sure what's going on but it seems to be stable now, doesn't dip below 60 or 30 depending on the room speed I set it to.

Looked online and some other people had this issue too. Only answers that came up had to do with sleep margins in the IDE, though I tried changing that as well.

It was so strange; other people claimed it happened sometimes randomly for no reason. If I'm not doing anything GPU-intensive and fps_real is a representation of CPU usage, then it really shouldn't ever dip below 30 or 60 if I'm doing around 300 frames on average. It makes no sense, and I'm completely confused about why it was happening before and not now despite not really changing anything. Possibly something wrong with the runtime?
 

RangerX

Member
We don't know the nature of your game and what those instances are actually doing. Maybe there's another, more efficient way to "make" your game? Just throwing it out there since you couldn't optimise much so far.
And have you tried (if the nature of the game lets you) a pure "deactivate when offscreen / activate on screen" system without chunks and whatnot? Like each instance deactivating itself in an Outside View event and your main object or camera object activating a region corresponding to the view every couple of steps?
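
A rough sketch of that kind of region-based system, in a centralised variant where the controller deactivates everything and then reactivates what the views cover, rather than each instance deactivating itself. It assumes GMS2 camera functions (in GameMaker: Studio 1.4 the view_xview, view_yview, view_wview and view_hview arrays would be used instead) and a hypothetical regionTimer variable:

```gml
// Controller or camera object, Create Event
regionTimer = 0; // hypothetical counter

// Controller or camera object, Step Event
regionTimer++;
if (regionTimer >= 5) { // refresh every few steps, not every step
    regionTimer = 0;

    // Deactivate every par_optimize instance...
    instance_deactivate_object(par_optimize);

    // ...then reactivate whatever lies inside each active view
    for (var v = 0; v < global.split_screens; v++) {
        var cam = view_camera[v];
        instance_activate_region(
            camera_get_view_x(cam), camera_get_view_y(cam),
            camera_get_view_width(cam), camera_get_view_height(cam),
            true);
    }
}
```

Note that instance_activate_region reactivates any deactivated instance inside the region, not only par_optimize children, so it can conflict with instances that were deactivated for other reasons.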
 
GrandFree

Guest
RangerX said:
We don't know the nature of your game and what those instances are actually doing. Maybe there's another, more efficient way to "make" your game? Just throwing it out there since you couldn't optimise much so far.
And have you tried (if the nature of the game lets you) a pure "deactivate when offscreen / activate on screen" system without chunks and whatnot? Like each instance deactivating itself in an Outside View event and your main object or camera object activating a region corresponding to the view every couple of steps?

The game seems to be fairly optimized now with my chunking method. I've tried using the region functions before, but they aren't very efficient and cause quite a bit of FPS drop.

A lot of the performance drops I had before as covered in this thread had to do with overuse of with() and dot operator statements, running the optimization method every step rather than every few steps, and also not filtering what needs to be activated/deactivated properly.

My game basically consists of a large city that contains building objects which create inner room objects which contain objects within those rooms. Going inside a building/room essentially deactivates everything that isn't associated with the room and vice versa when you leave or go to another room. Filtering those objects out gave a huge performance boost.

One tiny problem I had before I did a bit more optimization earlier today was, I was getting around 100 - 300 CPU frames but despite setting my room speed to 60 the fps would drop to around 44-45 and if I set it to 30 it would drop to 24-25. Changed my sleep margins, no difference. Left it for a few hours, came back and it mysteriously went away. Was extremely strange, didn't change anything. GPU wasn't being used that much either. It seems like it's working a lot better now though.

Thanks to everyone for helping with my optimization! I'm getting around 800 - 1000 FPS now.
 

Nocturne

Friendly Tyrant
Forum Staff
Admin
GrandFree said:
One tiny problem I had before I did a bit more optimization earlier today was, I was getting around 100 - 300 CPU frames but despite setting my room speed to 60 the fps would drop to around 44-45 and if I set it to 30 it would drop to 24-25. Changed my sleep margins, no difference. Left it for a few hours, came back and it mysteriously went away. Was extremely strange, didn't change anything. GPU wasn't being used that much either. It seems like it's working a lot better now though.

When this happens, open Task Manager and check the CPU usage for GMS2. I've found that on some very rare occasions GMS2 suddenly starts using about 90% of the CPU, which means that any game you are testing shows a massive drop in FPS because the CPU is being hogged by GMS2. When this happens, save the project, close the IDE, and then reopen it, and it should be solved.
 