• Hey! Guest! The 37th GMC Jam will take place between May 28th, 12:00 UTC and June 1st, 12:00 UTC. Why not join in! Click here to find out more!

SOLVED Optimizing terrain z value calculations

Bart

WiseBart
Hi all.

The last couple of days I've been working on an optimized way to calculate several z values on a terrain.
For this I'm using ds_grids since they give you two free loops that are implicit and incredibly fast.
Each instance that needs a z value at a specific (x, y) coordinate gets an index in the grid. An instance in this case is not a GM instance but rather just one (x, y) input that needs the resulting z value as the output.

The basic calculation is working and already gives a great speed improvement compared to looping manually and doing a couple of dot_product_3d's and the likes.
But I'd like to try optimize this even further yet at the moment I reached a bottleneck and I'm not sure if this is something that is even possible to find a solution for.

Right now I execute the following script in the Begin Step event for all instances:
GML:
/// @func phy_terrain_instance_set_position(index, x, y)
/// @desc Update a given instance
#macro grid_height 9

var index = argument0;
var _x = argument1 * one_over_tile_size,
    _y = argument2 * one_over_tile_size;

var dx = frac(_x), dy = frac(_y);
var tri_index = (dx >= dy);                    // Faster than grid/array lookup!
var start = grid_height*tri_index;

var grd_values = arr_terrain_data[ _x, _y];
ds_grid_set_grid_region(grd_instance_data, grd_values, 0, start, 0, start+(grid_height-1), index, 0);
grd_instance_data[# index, 1] = dx;
grd_instance_data[# index, 2] = dy;
After doing this for all instances, the controller calculates all z values using a couple of ds_grid_add_grid_region's and ds_grid_multiply_grid_region's in the Step Event.
In the End Step event, each instance can read its z value.

The goal here is to move all of the above code into the ds_grid calculations, optimizing the thing even further.
Basically just put in (x, y) (not multiplied by one_over_tile_size) in the ds_grid for each instance, lookup the right grid data in arr_terrain_data and let the ds_grid calculations handle the rest.
In the end, the modified phy_terrain_instance_set_position script should look like this:
GML:
/// @func phy_terrain_instance_set_position(index, x, y)
/// @desc Update a given instance
#macro grid_height 9

var index = argument0;
var _x = argument1 * one_over_tile_size,
    _y = argument2 * one_over_tile_size;

var grd_values = arr_terrain_data[ _x, _y];    // One lookup, grd_values contains data of both triangles (the calculation turns into something like "terrain.z + 1 * effect_of_tri0 + 0 * effect_of_tri1", etc.)
ds_grid_set_grid_region(grd_instance_data, grd_values, 0, 0, 0, grid_height-1, index, 0);
grd_instance_data[# index, 1] = _x;            // Obviously these values must be set somewhere
grd_instance_data[# index, 2] = _y;
I can get the multiplication by one_over_tile_sizeright, get the integer and fractional parts of x and y using the ds_grid functions but the main issue is obviously getting the triangle index, since it requires some way of rounding/flooring the calculated values (I can do dx+(-1*dy) but that value still needs to turn into 0 or 1 somehow).

I've been breaking my head over this the last couple of days but I don't see a solution.

The easiest thing would be if the grid functions were extended with some ds_grid_script_grid_region that allowed the execution of a custom script for each ds_grid cell/region but I don't think that's something that would be added quickly.

I hope I've been able to explain this a bit clearly :)

Anyone who has ideas or suggestions?

EDIT - Potential solution: This may actually turn out to be more straightforward than I thought.
The only values that are really needed are the x and y position and the values of the triangle that position is in.
Now there is simply no way to round values once we're doing the calculations using the ds_grid so everything needs to be done in the array lookup.
Currently I multiply the x and y coordinates by one_over_tile_size so the coordinates in one single square end up ranging from 0 => 1.
But that doesn't help to get the triangle index right, since we now need the fractional parts of the coordinates (performance cost: 2 calls to frac).

What if we don't scale the coordinates and increase the size of arr_terrain_data?
For each world (x, y) position we can now link the correct ds_grid. One grid for all indices of the top-right triangle, one grid for all indices of the bottom-left triangle.
The (x >= y) condition becomes implicit this way.

GML:
// Assume that grid index 9 contains the top-right triangle's values
// and grid index 8 contains the bottom-right triangle's values
// Non-integer values of x and y may give issues near the diagonal
 x 0 1 2 3
y
0  9 9 9 9
1  8 9 9 9
2  8 8 9 9
3  8 8 8 9
Now there's one edge case, being the diagonals. The squares there belong to both the bottom-left and top-right triangle.
We may still be using the values of the bottom-left triangle when we crossed the diagonal and are in the top-right triangle.
But I guess that could be solved by adding a 'precision' value so the actual array size may be a multiple of the terrain dimensions.

Going to try implement this!
 
Last edited:

Bart

WiseBart
All right. A small update on this.

I haven't continued working on that previous idea since it makes calculating the right values near the diagonals harder and requires a lot more memory for the array that stores the coefficients.
The current version is therefore quite optimized already.

Another idea that I've been considering is moving the calculations to the GPU.
The condition is that it should work in GLSL ES to keep things compatible with all platforms.
When doing it this way, there are likely two options:
  • Use textures to store the information. Use buffer_get_surface/buffer_set_surface to set values and retrieve values after doing the drawing.
    GM doesn't support floating point textures, however, so not too sure if that's a doable option.
  • Assume that we don't actually need the z coordinate on the CPU and make it a purely visual effect.
    Send in the terrain z values using a uniform array (GM doesn't support vertex texture fetch or how's it called) and work out the offset for all vertices in the shader.
    Thing here is that the size of a uniform is limited. So terrain size and number of instances will be pretty limited.
I consider the second GPU idea to be the most useful addition. It could work well for those things on screen that don't need any interaction with the rest of the world aside from the terrain.

I'll mark this post as solved since I think these are about all the possible options there are.
 

Joe Ellis

Member
Hi, I know this is solved but I wanted to share this method with you in case it's more efficient:

Create a grid of the terrain, where each cell contains the two triangles of each quad of the terrain
Each "triangle" is an array containing the xyz of the 3 vertices, and the normal
Get the cell the instance is currently in
Check which of the two triangles the instance is in
Then get the z point


Code:

GML:
cell = grid[clamp(floor(x / grid_res), 0, grid_width), clamp(floor(y / grid_res), 0, grid_height)]
t = cell[0]
if !point_in_triangle(x, y, t[_tri.x1], t[_tri.y1], t[_tri.x2], t[_tri.y2], t[_tri.x3], t[_tri.y3])
{t = cell[1]}
z_pos = z + (dot_product_3d(t[_tri.nx], t[_tri.ny], t[_tri.nz], t[_tri.x1] - x, t[_tri.y1] - y, t[_tri.z1] - z) / t[_tri.nz])
Broken down in steps:

Get the dot product of the triangle's normal and the vector from the instance to the 1st vertex of the triangle:

d = dot_product_3d(t[_tri.nx], t[_tri.ny], t[_tri.nz], t[_tri.x1] - x, t[_tri.y1] - y, t[_tri.z1] - z)

This value gives the distance from the instance to the nearest point on the triangle,
which is perpendicular to the triangle's surface like it's normal,
so then you can multiply the triangle's normal by this distance to project the instance's position onto the triangle:

pos_on_triangle = instance_pos - (triangle_normal * distance)

This normally works fine for spherical collision,
but if the triangle is sloped it will not give a position that's directly under the instance
and so the z coordinate will not be on the triangle's surface.

So you need to make up for the extra distance, which you can do by increasing the distance based on how steep the triangle is,
which you can do by simply dividing the distance by the triangle's z component of it's normal

So then you can do:

z_on_triangle = instance_z + (distance / triangle_normal_z)

and that's it,
This is a really light collision method
and if each instance only has to do this once per step it'll be incredibly light

You could make it even lighter by precalculating the reciprocal of the triangle's normal z (1 / nz)

and also, you might be able to pre calculate the edge vectors of the triangle to make a faster point in triangle method,
but I haven't tested whether that's faster than the built in point_in_triangle.

But yeah, just wanted to share this cus it might be helpful, aswell as for other people reading this thread
 

Bart

WiseBart
That does indeed look like a great way to do it. Thanks for sharing!

I've been pre-calculating the slopes (z2-z1) and (z3-z1) for each triangle and multiplied those by the x and y values (the offset from the top-left corner of the square) to get the result.

My main idea is to use ds_grids to calculate everything since they're really good for calculations.
But the real bottleneck of this method appears to be the grid reads/writes. Those are terribly slow, while the actual calculations are incredibly fast.

You can remove those calls to floor since GM internally floors the values you pass when indexing data structures.

Any specific reason why you're using point_in_triangle instead of a check like (dx >= dy)?
 

Joe Ellis

Member
Yeah I was just using point_in_triangle cus I hadn't worked out the method you're using,
so thanks for pointing that out, and for the thing with not needing to floor the values.

I haven't really got any other ideas to do with the grids, I try not to use grids or 2d arrays anymore and use one long 1d array with [x + (y * grid_height)], I started doing that cus you can then initialize a large "grid" in one go and it's way faster than initializing each column separately, but ds grids might be generated like that anyway.
 
Top