Optimizing the draw event for thousands of objects.

CraterHater · Jan 4, 2021

Hey,

I've got a game where ideally I need to draw many thousands of objects. Here is my draw event:

GML:

if(current_health != max_health){
    event_user(1);
}

draw_sprite_ext(sprite_index, image_index, draw_coord_x, draw_coord_y, draw_scale_x, draw_scale_y, 0, c_white, alpha);

This event alone takes up 1.721 milliseconds every tick for 600 objects. The profiler says the draw_sprite_ext function only takes up 0.274, and event_user(1) is never called because current_health always equals max_health in my tests.

Does anyone know why this takes up so much of the performance? I tried substituting the draw_sprite_ext function with a simple draw_self and this resulted in a drastic increase in performance but I still need the function. Any help would be massively appreciated!

EDIT: I've heard about vertex buffers but I am not sure how they work. Is that a good way to go for me?

Yal · Jan 4, 2021

Vertex buffers give massively better performance if both of the following are true:

They're fully static (i.e. things don't ever move around) so you can "freeze" them (which loads them onto the GPU once and then never touches them again, instead of you needing to send all the data over every frame)
Everything you draw uses the same texture

If these aren't true, you're stuck with having to optimize your current code - vertex buffers aren't the right solution for this.
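For reference, the case where both conditions hold could look roughly like the sketch below. It is only an illustration, not code from this thread: spr_rock and the controller object are made-up names, it assumes every instance shows frame 0 of the same sprite, that draw_coord_x / draw_coord_y are already set when the controller's Create event runs, and it ignores sprite origin, scaling and texture-page cropping.

```gml
/// Create event of a hypothetical controller object
// Vertex layout: position, colour, texcoord. The writes below must follow this order.
vertex_format_begin();
vertex_format_add_position();
vertex_format_add_colour();
vertex_format_add_texcoord();
vformat = vertex_format_end();

// Assumption: one shared sprite and frame (spr_rock is a made-up name)
var _spr = spr_rock;
var _uvs = sprite_get_uvs(_spr, 0);   // [0..3] = left, top, right, bottom
var _w   = sprite_get_width(_spr);
var _h   = sprite_get_height(_spr);
var _vb  = vertex_create_buffer();

vertex_begin(_vb, vformat);
with (parent_drawableEntity) {
    // Quad corners for this instance (origin, scale and cropping ignored for brevity)
    var _x1 = draw_coord_x, _y1 = draw_coord_y;
    var _x2 = _x1 + _w,     _y2 = _y1 + _h;

    // Triangle 1: top-left, top-right, bottom-left
    vertex_position(_vb, _x1, _y1); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[0], _uvs[1]);
    vertex_position(_vb, _x2, _y1); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[2], _uvs[1]);
    vertex_position(_vb, _x1, _y2); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[0], _uvs[3]);

    // Triangle 2: top-right, bottom-right, bottom-left
    vertex_position(_vb, _x2, _y1); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[2], _uvs[1]);
    vertex_position(_vb, _x2, _y2); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[2], _uvs[3]);
    vertex_position(_vb, _x1, _y2); vertex_colour(_vb, c_white, 1); vertex_texcoord(_vb, _uvs[0], _uvs[3]);
}
vertex_end(_vb);
vertex_freeze(_vb);                   // upload once; the buffer is read-only from here on

vbuff     = _vb;
batch_tex = sprite_get_texture(_spr, 0);

/// Draw event of the same controller
vertex_submit(vbuff, pr_trianglelist, batch_tex);

/// Clean Up event of the same controller
vertex_delete_buffer(vbuff);
vertex_format_delete(vformat);
```

Once the buffer is frozen, nothing in it can move or change frame, which is why this only pays off for fully static content.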

One thing that might impact performance is that you interpret code for 600 objects, and more interpreted code is slower. If everything you want to draw is a child of the same parent object, you could get away with having one interpreted loop in a control object:

GML:

with(parent_drawableEntity){
  if(current_health != max_health){
    event_user(1);
  }
  draw_sprite_ext(sprite_index, image_index, draw_coord_x, draw_coord_y, draw_scale_x, draw_scale_y, 0, c_white, alpha);
}

Note that interpreted loops are slow too, so this should only give a noticeable speed benefit if you can do it all in one with loop. Multiple with loops (e.g. if you have several independent object categories instead of one big overarching parent) might be enough to eradicate the benefits, but you should probably profile this to make sure.

CraterHater · Jan 4, 2021

Yal said:
Vertex buffers give massively better performance if both of the following are true:

They're fully static (i.e. things don't ever move around) so you can "freeze" them (which loads them onto the GPU once and then never touches them again, instead of you needing to send all the data over every frame)

Everything you draw uses the same texture

If these aren't true, you're stuck with having to optimize your current code - vertex buffers aren't the right solution for this.

One thing that might impact performance is that you interpret code for 600 objects, and more interpreted code is slower. If everything you want to draw is a child of the same parent object, you could get away with having one interpreted loop in a control object:

GML:

with(parent_drawableEntity){
  if(current_health != max_health){
    event_user(1);
  }
  draw_sprite_ext(sprite_index, image_index, draw_coord_x, draw_coord_y, draw_scale_x, draw_scale_y, 0, c_white, alpha);
}

Note that interpreted loops are slow too, so this should only give a noticeable speed benefit if you can do it all in one with loop. Multiple with loops (e.g. if you have several independent object categories instead of one big overarching parent) might be enough to eradicate the benefits, but you should probably profile this to make sure.

Okay thanks. So if Vertex Buffers are not the solution for this and the Interpreted Loop only gives a tiny improvement how would one ever go about drawing say 2000 objects? Is this simply not possible with GMS? Thanks.

EDIT: It looks like such an interpreted loop causes issues with layers. How would I fix those?

Simon Gust · Jan 4, 2021

CraterHater said:
Okay thanks. So if Vertex Buffers are not the solution for this and the Interpreted Loop only gives a tiny improvement how would one ever go about drawing say 2000 objects? Is this simply not possible with GMS? Thanks.

2000 should be possible with a simple draw call such as a single draw_sprite(). If that is all that is in the draw event, then you should be fine.
You say 1.7 milliseconds for 600 objects? That's expected. Don't worry so much about performance where you can't really optimize. You have 16.67 milliseconds available before dropping below 60 fps.
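To put those numbers side by side, here is a rough back-of-the-envelope sketch. It assumes the draw cost scales linearly with instance count, which is an assumption rather than something measured in this thread, so treat the result as an estimate only.

```gml
// Figures quoted in the thread
var _measured_ms   = 1.721;   // draw event total for 600 objects
var _measured_objs = 600;
var _target_objs   = 2000;

// Assumption: cost grows linearly with the number of instances
var _per_obj_ms = _measured_ms / _measured_objs;   // roughly 0.0029 ms per object
var _target_ms  = _per_obj_ms * _target_objs;      // roughly 5.7 ms for 2000 objects
var _budget_ms  = 1000 / 60;                       // roughly 16.67 ms per frame at 60 fps

show_debug_message("Estimated draw cost: " + string(_target_ms) + " ms of " + string(_budget_ms) + " ms");
```

Under that assumption 2000 objects would use about a third of the frame on drawing, leaving the rest for step events and everything else.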

CraterHater · Jan 4, 2021

Simon Gust said:
2000 should be possible with a simple draw call such as a single draw_sprite(). If that is all that is in the draw event, then you should be fine.
You say 1.7 milliseconds for 600 objects? That's expected. Don't worry so much about performance where you can't really optimize. You have 16.67 milliseconds available before dropping below 60 fps.

Apparently I was also using the VM compiler. After switching to the YYC compiler the performance is still good even at 1500 objects with complex events and all. True saving grace that was haha.
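For anyone else profiling this, a small check along these lines can confirm which runtime a build is actually using before comparing timings. It is just a sketch; where it goes (for example the Create event of a controller object) is an assumption.

```gml
// code_is_compiled() returns true for YYC builds and false when running on the VM
if (code_is_compiled()) {
    show_debug_message("Running a YYC (compiled) build");
} else {
    show_debug_message("Running a VM (interpreted) build");
}
```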
