
Legacy GM — Can you use a shader, and the GPU, to do calculations that are then used by the CPU? (Studio 1.4)

I know that using a shader allows you to offload mathematical processing onto the GPU, so that vertices can be translated, scaled, rotated and so on. The point of doing this is that it takes the strain of the graphics work off the CPU.

But what I'm wondering is whether that information, having been calculated by the GPU, can be returned in any way to the CPU, for processes that are not graphical or that can't necessarily be implemented on the GPU.

As an example:
I am filling an mp_grid whose cells are one pixel each, and it will store the path of an object, including the area the object's size will cover as it travels. Some trigonometry is used, then some basic maths. Nothing super advanced, but applied to every object it adds up to being moderately costly.

Having experimented with other ways of doing this (like drawing the shapes all of the paths make to a surface, putting that onto an object, and then the object into the grid), this is the least costly way I've found. It seems like the kind of thing the GPU could handle, since it's just maths calculations, but I can't mix and match the two sides in GLSL (?)

1) Could a shader actually set cells on the grid? I'm not aware of any GML-specific commands working within Studio's implementation of GLSL.
2) If it's not accessible directly, whether because they're different languages or because the GPU can't communicate with the CPU in that fashion, can the data calculated by the GPU be returned to the CPU in any way? Maybe as an array holding the positions of all the cells to be marked as occupied?

The cost of my method is heavy enough that, if I can get any part of it handled elsewhere, the result would be very useful. Can it be done?
 


Member
Could a shader actually set cells on the grid? I'm not aware of any GML-specific commands working within Studio's implementation of GLSL.
Nope.
If it's not accessible directly, whether because they're different languages or because the GPU can't communicate with the CPU in that fashion, can the data calculated by the GPU be returned to the CPU in any way? Maybe as an array holding the positions of all the cells to be marked as occupied?
Yep... as an image! I'm using this method for Perlin noise but, honestly, GPGPU doesn't really make sense in GML for various reasons (starting with how slow it can be to read the result back from said image).
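For anyone unfamiliar with the round-trip being described, a minimal GML sketch (GMS 1.4 style) might look like the following. The shader name `shd_compute` and the dimensions `results_w`/`results_h` are placeholders, not anything from this thread:

```
// Run a shader into a surface, then read the results back on the CPU.
var surf = surface_create(results_w, results_h);
surface_set_target(surf);
shader_set(shd_compute);
// The fragment shader runs once per covered pixel, writing results as colours.
draw_rectangle(0, 0, results_w, results_h, false);
shader_reset();
surface_reset_target();

// Read-back is the slow part: surface_getpixel forces the CPU to wait
// for the GPU to finish, then pulls the data back one pixel at a time.
for (var i = 0; i < results_w; i++)
{
    for (var j = 0; j < results_h; j++)
    {
        var col = surface_getpixel(surf, i, j); // decode your result from the RGB
    }
}
surface_free(surf);
```

The per-pixel read-back loop is exactly where the speed advantage tends to evaporate, which is the point being made above.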
 

altarar

Guest
I used to use NVIDIA's CUDA for GPU programming, and there are two ways to manipulate any kind of data: a) allocate unified memory that can be shared by both CPU and GPU, or b) allocate memory for each device separately and copy the result across. Since GMS has no functions that provide such functionality, I guess there is no way to use the GPU to solve CPU-side problems. You could try to write a DLL, but you would be limited to using only buffers.
 

NightFrost

Member
Technically, you could set up some type of maths in the shader and have it draw to a surface. Then on the GMS side you'd poke at individual pixels and try to interpret the RGBA values as results. This pixel-poking sounds awfully slow though (not that I've tried it), so it probably cancels out any speedup you'd get from the GPU's parallel processing. Not to mention you'd have to jump through some hoops setting the whole thing up, as the only thing that differentiates one pixel (and maths process) from the next would be its position.
 
Yep... as an image! I'm using this method for Perlin noise but, honestly, GPGPU doesn't really make sense in GML for various reasons (starting with how slow it can be to read the result back from said image).
If what you're getting at is essentially sweeping across a surface while checking the pixels for colour (black is full, white is empty, or some such), then yeah, I know that is super slow, and my current way is better. The grid is 500 by 500, so looping over a surface of that size and getting each pixel's colour is too costly.

mp_grid_add_rectangle is the cheapest option for entering the info. By calculating the area the object's mass will cover as it travels (its bounding box, at least), I can do a single loop that fills in that line of positions, moves up or down a pixel, and repeats. That's just two or three bits of trig, then some repetition of basic maths inside a repeat loop.
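One way to read that description as GML (a hedged sketch, not the poster's actual code; `grid` and `path_len` are placeholder names, and the grid is assumed to have 1×1-pixel cells):

```
// March along the path one pixel at a time, stamping the bounding box
// into the mp_grid at each step.
var dx = lengthdir_x(1, direction); // unit step along the travel direction
var dy = lengthdir_y(1, direction);
var hw = sprite_width div 2;        // half the bounding box
var hh = sprite_height div 2;

var px = x;
var py = y;
repeat (path_len)
{
    // mark every cell the bounding box covers at this position
    mp_grid_add_rectangle(grid, px - hw, py - hh, px + hw, py + hh);
    px += dx;
    py += dy;
}
```

This matches the cost profile described above: two trig calls up front, then only additions and one mp_grid_add_rectangle per pixel of path.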

Based on my tests it is about four times faster than anything involving an image when inputting the data, and even with the cost of then looping through the 500 by 500 positions to see whether they are free, it is still much less expensive. That check would still have to happen after reading the surface anyway, so it all gets much too costly (surfaces being volatile might need redoing, and clearing an mp_grid seems cheaper; there are other hidden costs and reasons).

It really has to be simpler data, and more quickly accessed, than that. But I guess from your response and altarar's that it isn't possible.

@altarar
Unfortunately writing a DLL is somewhat beyond my capabilities. Unless anyone else knows of a way to do this using the GPU (so you can send it an array, but you can't rewrite it and send it back to the CPU?), then I suppose other means must be explored.
 
Technically, you could set up some type of maths in the shader and have it draw to a surface. Then on the GMS side you'd poke at individual pixels and try to interpret the RGBA values as results. This pixel-poking sounds awfully slow though (not that I've tried it), so it probably cancels out any speedup you'd get from the GPU's parallel processing. Not to mention you'd have to jump through some hoops setting the whole thing up, as the only thing that differentiates one pixel (and maths process) from the next would be its position.
Thanks. I have tried reading surfaces and it is a lot slower than the pure maths approach. It's a bit annoying that I have the means to offload the maths, but apparently not the means to get it back :(
 

sp202

Member
While I doubt using a shader for this is a good idea, there are other ways you can optimize. Consider doing the calculations for only a few instances each step.
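A hedged sketch of that staggering idea in GML (the list `enemy_list`, the script `recalc_path()`, and `next_index`, assumed initialised to 0 in the Create event, are all placeholder names):

```
// Each step, recalculate paths for only a few enemies, cycling
// round-robin through the list so everyone gets updated eventually.
if (ds_list_size(enemy_list) > 0)
{
    var per_step = 5; // tune: how many enemies to process per step
    repeat (per_step)
    {
        var inst = ds_list_find_value(enemy_list, next_index);
        if (instance_exists(inst)) with (inst) recalc_path();
        next_index = (next_index + 1) mod ds_list_size(enemy_list);
    }
}
```

The trade-off is latency: with 30 enemies and 5 updates per step, each enemy's path is at most 6 steps stale, which is often acceptable for avoidance behaviour.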
 
While I doubt using a shader for this is a good idea, there are other ways you can optimize. Consider doing the calculations for only a few instances each step.
Are you saying it's not a good idea because any way of returning the data to the CPU would require so much effort to "translate" it that any gains would be lost? You don't seem to be saying it's impossible... is that what an earlier comment referred to as GPGPU? You can only send data back to the CPU as modified texture data?

So far I've been told it's either not doable or impractical (my take on your comment), and yet I've read that there are compute shaders. The name seems to suggest exactly what I'd like to happen (working solely on transforming data), though that may be me misunderstanding what they do, or GMS may be incapable of using them, which renders their existence a moot point.

I just find it vexing that there is a piece of hardware designed for these processes, supposedly much more capable of doing them than the CPU, and I am using software that makes use of that hardware, but it apparently lacks the functionality. I get what you're saying about staggering my method, though having to do that while there is plenty of untapped power in the GPU going to waste would be a disappointing result.
 

sp202

Member
GM doesn't have access to compute shaders; shaders in GM can only communicate back to the CPU through textures. Also, GPUs aren't inherently better at maths: they're better at repetitive maths that can be done in parallel, while a CPU is generally faster at sequential maths. What you're doing might not necessarily be suited to that.
 


Member
GPGPU = G(eneral) P(urpose) computing on the G(raphics) P(rocessing) U(nit): using the GPU to perform non-specialized calculations that would typically be handled by the CPU.
It can be done in multiple ways; if it goes through the shader pipeline, that's a compute shader, but it can also be done through a specialized API like OpenCL. Anyway, none of this is usable natively in GMS, and it would be a LOT of work to implement.
Believe you me: you'd be better off spending that time finding a workaround for your problem, or writing a DLL to offload your calculation to a second, non-blocking thread.
 
@sp202
Sorry to keep picking your brain. Because some of this (well, most of this) is beyond my current understanding, I will just add a bit more detail to what I want to do:

It's a top-down shooter where enemies want to find places that are not in collision with other enemies, or in the path those enemies are travelling (the whole area they cover as they move, plus the dimensions of the object looking for a free space).

I'm using an mp_grid for two reasons:
(1) I can understand the maths needed to figure out which cells to fill, and doing so is less costly than any alternative involving collision detection or the other approaches I've tried.
(2) Checking whether a cell is free, by looping through the area covered by the grid, tested faster than doing the same with an array holding the same data.

My tests may be flawed, but the grid gave much better results than storing the occupied positions in any other kind of data structure. It's the quickest way I've found, though it's still not quick. Off the top of my head, and bearing in mind this is as "optimized" as I can make it, it can do this for maybe 6 enemies a second, and there will be 30 or so of them. I'd rather it didn't take 5 seconds for them all to respond, hogging everything else on the CPU while they're at it, which is presumably why I see regular small drops from 60 fps even though this process is currently all that's running.

Is this what you referred to as "sequential", in that it handles one enemy at a time and can only perform the maths for that one before moving on to the next? I'm not very clued up, but is the fact that it writes this data to one place (the grid) what limits it to being sequential? Like: it could be done in parallel, but only if the data returned from each instance were written to separate storage (texture data for each one?) and then collated afterwards?

I'm not wedded to my current method, but can only stick to what I know how to do, and shaders are not what I know how to do :)

I just quoted you because I can't figure out how to @ your user name :)
Just wanted to say thanks for your input :)
 

NightFrost

Member
One way GPUs gain speed is by processing a bunch of pixels in parallel; they are literally processed at the same time in the GPU's units. A sequential process is one where the next step depends on the result of the previous step. Obviously that can't be parallelized, and the GPU would have to be invoked once for every step of the sequence.

As for the problem you're trying to solve, it sounds like you're trying to create collision avoidance so your enemies don't stack. Doing this per pixel is highly inefficient. You should look into steering behaviours, which are one way of adding dynamic avoidance to pathfinding. A single-sentence explanation: they calculate forces that influence the movement of entities, where the target pulls them and obstacles repel them based on relative distance. Pathfinding can be further assisted with a flow field if the map has lots of obstacles, especially if the map is grid-based.
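A minimal GML sketch of that pull-plus-repel idea, for an enemy's Step event (everything here is a placeholder assumption: `obj_enemy`, `target_x`/`target_y`, `max_speed`, and the tuning numbers):

```
// Attraction: a unit-length pull toward the target.
var fx = 0;
var fy = 0;
var d = point_distance(x, y, target_x, target_y);
if (d > 0)
{
    fx = (target_x - x) / d;
    fy = (target_y - y) / d;
}

// Separation: every other enemy within 48px pushes us away,
// more strongly the closer it is. Locals stay visible inside with().
with (obj_enemy)
{
    if (id != other.id)
    {
        var dd = point_distance(x, y, other.x, other.y);
        if (dd > 0 && dd < 48)
        {
            fx += (other.x - x) / dd * (48 - dd) / 48;
            fy += (other.y - y) / dd * (48 - dd) / 48;
        }
    }
}

// Blend the combined force into the current motion and cap the speed.
hspeed += fx * 0.5;
vspeed += fy * 0.5;
speed = min(speed, max_speed);
```

Note this needs no grid and no path precomputation at all: each enemy only looks at the current positions of its neighbours, which is why it tends to scale much better than marking swept areas per pixel.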
 
One way GPUs gain speed is by processing a bunch of pixels in parallel; they are literally processed at the same time in the GPU's units. A sequential process is one where the next step depends on the result of the previous step. Obviously that can't be parallelized, and the GPU would have to be invoked once for every step of the sequence.
Okay, thanks! I'm beginning to understand it now.

As to your other suggestion: I have seen examples of steering behaviours on the forums, but didn't think they were what I'm looking for. However, I never actually tested them.

I should point out that the only obstacles on the "map" are other enemies, bullets and projectiles. Calculating the paths of these things as they travel is the only way I could think of to let enemies move somewhere they won't be in collision. Cells on the grid are one pixel wide and high, so it is only a container for this data and the most convenient way for me to handle it.

I'll take a look at your suggestion and see if it has better performance. Cheers :)
 