Question - Code vertex_submit creating massive unknown overhead

DukeSoft

Member
Hey everyone,

I'm working on a little voxel engine, and I'm running into trouble here. The debugger is not helping me much (actually making things worse).

I have a system that creates chunks filled with blocks - every chunk has its own mesh generated from the blocks.

When rendering 100 of these chunks, the game speed drops down drastically and I do not know why.

Here's the profiler:
upload_2017-10-29_17-2-29.png

Somewhere in the DrawTheRoom there is a process taking up well over 6.4 milliseconds.

The game is running 50fps on a 1060 + 7700k - that shouldn't be happening.

Here's what it looks like:


And here it's drawn as a line list (to show how few vertices there actually are):


I do know the vertex batch count is at 104, but I feel like this is something I cannot bring down any lower, as vertex_submit has to be called for every chunk.

Where is the massive CPU + GPU overhead coming from? What am I missing? There is just 1 texture being used (2048x2048).

Surely I must be missing something.
 

DukeSoft

Member
I tried combining all chunk meshes into 1 mesh, thus calling only 1 vertex_submit - this was just as slow:


What am I missing?
 
Multimagyar

Guest
I'm trying to say something smart here, but if you say it's all combined in one buffer, I kind of have to doubt that it's the vertex_submit, unless you still have some funny business going on in the background.
 

DukeSoft

Member
"what are you doing in the step event of the chunks??"
Nothing, as you can see in the debugger.

The only thing that's happening is the vertex buffer being submitted. It's frozen, and it has 126840 vertices in it.
The entire model uses 1 2048x2048 texture with UVmapping per block.

Here's the entire profiler of the game running:

upload_2017-10-29_20-45-51.png
upload_2017-10-29_20-46-5.png
upload_2017-10-29_20-46-23.png

And it looks even weirder in the GML-only overview:
upload_2017-10-29_20-46-55.png

There is something that is not GML that's eating up my CPU and GPU :/
 

DukeSoft

Member
The obj_chunk draw event had 1 line in it (a comment) - removing that gained 10fps. (was 50, now is 60. still waaaay too slow)

EDIT:
This is the mesh drawing function, initially it draws the entire mesh, if I press space it creates 1 big mesh and the vertex buffer drops to 5 (in the debug window), FPS remains the same.

Code:
if (mainchunk == -1) {
    with (obj_chunk) {
        vertex_submit(mesh, pr_trianglelist, global._blockTexture);
    }
} else {
    vertex_submit(mainchunk, pr_trianglelist, global._blockTexture);
}
 

Roa

Member
Any way you can avoid the "with" statement? I'm not entirely sure if that's what's slowing it down.


nvm, PM me a copy if you want or something, I'll see if I can dig at it. I'm not seeing it here.
 

DukeSoft

Member
The with statement is not slowing it down for sure, also there is no speed difference when not using the with().

mainchunk will become a vertex buffer once I press space, so the with is never even executed. Also the with isn't really a slow feature :)
 

Mike

nobody important
GMC Elder
Vertex submit pretty much just sends the buffer to the hardware, so there's not a lot we can be doing to get in the way.

Have you frozen the buffers?
 

DukeSoft

Member
oh my god, I did not know instances (WITHOUT STEP / DRAW EVENTS, ONLY CREATE AND USER0) draw THAT much freaking power.

Guess what a with (obj_block) {instance_destroy()} did?


They were empty objects, only holding some information about the block. I'm moving over to some data store for block storage instead of instances :O
 

Roa

Member
How many instances? :p And yeah, that red bar implies there was a lot of step event cycling regardless of whether you used it for code. Just having objects lying around adds to that pile of CPU time spent checking whether things need doing. And user events are slow too, I believe.
 

Mike

nobody important
GMC Elder
empty (as in no events/code) - deactivated - instances should draw no time. You can't find them using a with() and they are designed for storage only. So perhaps deactivate anything you want to only store data in.
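
A minimal sketch of the deactivation approach described above, not a definitive implementation. It assumes the storage object is named obj_block (as in the earlier post), that it carries a hypothetical variable block_type, and that the project uses GMS2's instance_create_depth (GMS 1.x would use instance_create instead):

```gml
// Create one storage instance and remember its ID.
// "inst" and "block_type" are hypothetical names used for illustration.
var inst = instance_create_depth(0, 0, 0, obj_block);
inst.block_type = 3;

// Deactivate every obj_block instance so it is no longer processed as active.
instance_deactivate_object(obj_block);

// A with (obj_block) loop no longer finds these instances,
// but the stored ID can still be used to read and write their variables.
var t = inst.block_type;
inst.block_type = t + 1;
```

The IDs have to be kept somewhere (a variable, array, or data structure), since a deactivated instance cannot be looked up through with().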
 

DukeSoft

Member
10x10x8x8x8

That's 10x10 chunks, each holding 8x8x8 instances. That would be 51,200 instances (roughly 51k).

The weird thing is they have no sprite, no visibility, no collision mask, no step events, no nothing. Just user0 and create. I thought GM didn't execute step events for those kinds of instances. Turns out it does :D But it doesn't show in the debugger - probably because the game loop system just has to loop through all 51k instances and check if they have a step event..

Also I am quite sure that user events are not slow at all. I'm quite sure they behave just as fast as scripts.
 

DukeSoft

Member
Mike said: "empty (as in no events/code) - deactivated - instances should draw no time. You can't find them using a with() and they are designed for storage only. So perhaps deactivate anything you want to only store data in."
Aha. I thought deactivated instances are no longer reachable.
- Are the Instance ID pointers still available as accessors for deactivated instances?
- Does instance_exists() return true for deactivated instances?

I'll check it out.

Still find it remarkable that 50k instances with just a create event draw that much processing power.
 

Mike

nobody important
GMC Elder
As long as you keep the ID of it lying around, you can still reference it like "INST.VAR=???" etc.

It won't take any time in events, as instances simply aren't included in the lists for events they don't have, but if you're doing a "with" then it'll find all of them... because they're active.

Not sure on instance_exists(). If you use the ID, I'd say yes... if you use the object... not so sure, you'll have to check to see.
 

DukeSoft

Member
Thanks for the information, Mike. It is indeed still reachable, and I don't know either whether instance_exists() works.

I ended up going with arrays and using array indices as the "ID". That's going to be a lot faster and save a lot of memory. It's a little less clear than using instances, but it's OK for now.
Can't wait until GMS2 supports the "lightweight objects" that are on the roadmap!
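
A rough sketch of what array-based block storage could look like, assuming one flat array per 8x8x8 chunk. This is not the actual code from the project; chunk_size, blocks, and the block type values are hypothetical names and numbers chosen for illustration:

```gml
// Create event of obj_chunk (hypothetical layout).
chunk_size = 8;
// One array slot per block; 0 stands for "empty" in this sketch.
blocks = array_create(chunk_size * chunk_size * chunk_size, 0);

// Local block coordinates inside the chunk, each in the range 0..7.
var lx = 3, ly = 1, lz = 5;

// Flatten the 3D coordinate into a single index; this index acts as the block "ID".
var idx = lx + ly * chunk_size + lz * chunk_size * chunk_size;

blocks[idx] = 2;                 // store a hypothetical block type
var block_type = blocks[idx];    // read it back
```

With this layout, 10x10 chunks need 100 arrays of 512 entries each rather than 51,200 instances.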
 