GameMaker Custom A* pathfinding + variations (HPA*, JPA etc)

vdweller

Member
Hi all,

This is mainly a discussion/presentation of ideas thread. There are lots of good pathfinding concepts and I am not pretending to know all about them. Hopefully this thread will grow to a meaningful exchange of knowledge. I will begin my approach from the basics and move up to more complicated concepts in future posts.

The past few days I have began prototyping a grid-based pathfinding system.The idea originated from Rimworld, which uses a fairly large map and tens of agents, however I have added some more goals to spice things up:
  • Multiple grids of variable size and relative locations to each other. Agents will be able to "jump" from grid to grid if they are close enough. Imagine something like jumping from an asteroid to a nearby one with your jetpack.
  • Portals/teleporters and/or muti-floor (like subterranean levels below ground)
  • Variable-cost terrain
  • Dynamically changing obstacles
  • Speed-efficient. Ideally, agent pathfinding search should be queued (ie only a certain agent(s) during a step can calculate a new path)
  • 8-directional movement
I began with implementing the basic A*. Since speed is more of an issue than memory, I opted for creating a node for each grid cell and precalculating its neighbors. Re-adjusting node neighbors when adding/removing an obstacle is easy and cheap to do in real-time. I tried some other approaches in order to minimize hand-written iteration code. For this purpose, I took the following measures:
  • Each node has a "list" flag indication which list it belongs to: Open, Closed or None.
  • Used a priority list along with a normal list for the Open nodes. In this sense, the Open list is probably redundant but it helps when checking which list a node belongs to, since lists have always different indices, plus the Open list contents can be used later for cleanup/resetting explored nodes. I haven't tested if it will be as fast to cleanup from a priority list but I have my doubts.
With these steps, hand-written iteration is nonexistent (save for cleaning up which has a minimal impact in total run time anyways).

The basic node creation code is this:
EDIT: Code updated Jul 14 2019.
Code:
/*
Argument0: grid id
Argument1: grid i
Argument2: grid j
*/

var array;
array[c_pfcost]=1;
array[c_pfgrid]=argument0;
array[c_pfi]=argument1;
array[c_pfj]=argument2;
array[c_pfg]=0;
array[c_pfh]=0;
array[c_pfparent]=-1;
array[c_pflist]=-1;
array[c_pfneighbor]=ds_list_create();
array[c_pfneighbordist]=ds_list_create();

ds_list_add(global.list_node_created,array);

return array;

And the pathfinding code is this:
Code:
/*
Argument0: path list id
Argument1: start node
Argument2: end node
Argument3: (optional) left border
Argument4: (optional) top border
Argument5: (optional) right border
Argument6: (optional) bottom border
*/

var list_inode=argument[0];
var node_start=argument[1];
var node_end=argument[2];
var border_left=-infinity;
var border_right=infinity;
var border_top=-infinity;
var border_bottom=infinity;
if (argument_count>3) then {
    border_left=argument[3];
    border_top=argument[4];
    border_right=argument[5];
    border_bottom=argument[6];
}
var stack_cleanup=global.stack_cleanup;
var plist_open=global.pf_plist_open;
var endx=node_end[c_pfi];
var endy=node_end[c_pfj];
var list_child=global.pf_list_nodechild;
var list_childdist=global.pf_list_nodechilddist;
var tt=0,t1,t2;
var i,nodelist;
var node,node_current;
var size,tg,h;
var psize=1;

node_start[@c_pflist]=c_openlist;
ds_stack_push(stack_cleanup,node_start);
ds_priority_add(plist_open,node_start,0);
 
while (psize>0) {
    node_current=ds_priority_delete_min(plist_open);
    psize+=-1;
    if (node_current==node_end) then {
        while (node_current<>-1) {
            ds_list_insert(list_inode,0,node_current);
            node_current=node_current[c_pfparent];
        }
        pf_find_cleanup();
        exit;
    }
 
    node_current[@c_pflist]=c_closedlist;

    //pf_getneighbors(node_current,border_left,border_top,border_right,border_bottom);
    list_child=node_current[c_pfneighbor];
    list_childdist=node_current[c_pfneighbordist];

    var xx,yy;
    size=ds_list_size(list_child);
    for (i=0;i<size;i+=1) {
        node=list_child[|i];
        xx=node[c_pfi]; yy=node[c_pfj];
        if (xx>=border_left && xx<=border_right && yy>=border_top && yy<=border_bottom) then {
            nodelist=node[c_pflist];
            if (nodelist==c_closedlist) then {
                continue;
            } else {
                tg=node_current[c_pfg] + (node[c_pfcost]*list_childdist[|i]);
                if (nodelist==c_openlist) then {                
                    if (tg<node[c_pfg]) then {
                        node[@c_pfg]=tg;
                        node[@c_pfparent]=node_current;
                        //h=max(abs(xx-endx),abs(yy-endy));
                        //ds_priority_change_priority(plist_open,node,(tg+h));
                    }
                } else {
                    node[@c_pfg] = tg;
                    h=max(abs(xx-endx),abs(yy-endy));
                    node[@c_pfh] = h;
                    node[@c_pfparent] = node_current;
                    node[@c_pflist]=c_openlist;
                    ds_priority_add(plist_open,node,(tg+h));
                    ds_stack_push(stack_cleanup,node);
                    psize+=1;
                }
            }
        }
    }
}

pf_find_cleanup();

Node resetting script (pf_find_cleanup):
Code:
var stack_cleanup=global.stack_cleanup;
var plist_open=global.pf_plist_open;
var node;

while ds_stack_size(stack_cleanup)>0 {
    node=ds_stack_pop(stack_cleanup);
    node[@c_pflist]=-1;
    node[@c_pfparent]=-1;
}

ds_priority_clear(plist_open);

I have used the Euclidean heuristic. The square root may seem slow, but contrary to what certain websites state, using the "square root free" version can actually botch the path since g and h will then be expressed in different metrics. In addition, the slow part will always be the large amount of explored nodes, calculating a cheaper heuristic doesn't make much of a difference. Finally, I want the very same algorithm to be used in HPA* (discussed later), where the Euclidean heuristic works best due to the spreading of abstract nodes.

upload_2019-3-13_23-40-20.png

All tests are run in YYC. As you can see, the algorithm takes ~67 milliseconds on a 100x100 grid to reach a faraway point. The turquoise grid and blue dots you see in the screenshot are used in HPA*, ignore them for now. Also, The path list omits the last (goal) node for now, I will fix that later.

Obviously, 67 milliseconds for just a 100x100 map seems like a lot of time, it's definitely prohibitive for realtime calculation. Even if we split that up in multiple steps, it would still take some frames for an agent to start moving. I am trying to avoid such an approach if possible.

EDIT (Jul 14 2019): Same path is calculated in ~29 ms with the new code.

So I will leave you with this first post and add some questions for discussion:
  • Time-wise, is the algorithm I wrote efficient? 67 milliseconds looks like a lot, I suspect that e.g. in C++ it would be a lot less. Is my assumption true? Is YYC inherently slower, or have I made some undetectable mistake?
  • Do you have to propose other optimisation concerning the "Vanilla" A* algorithm? I am not talking (yet) about other hybrid methods like hierarchical A* or Theta or JPS, I am still talking about the "bare-bones" A* code.
  • If you have written your own A* pathfinding code, what were your own times, and how did you achieve them? Again, since this post covers only the A* approach, try to stick with A* for now.
 
Last edited:

TheSnidr

Heavy metal viking dentist
GMC Elder
I'm working on my own implementation of A* as well, and your code isn't very different from mine. I'm far from an expert in this field, I've pretty much only used the Wikipedia page on A* to make my solution, but I think I've found a few ways to make the algorithm faster.
Our implementations are similar, but the differences I can spot are as follows:
  • You remove any info you've created during the pathfinding, after the algorithm is done. My implementation only clears the necessary information before starting the algorithm, potentially saving a lot of processing. You only really have to reset the parent, gscore and fscore of the first node you add to the open priority data structure. Previous gscore only matters when a node has already been added to the open set, and the gscore is overwritten when writing the node to the open set, so the previous value before it was in the open set doesn't really matter.
  • You use a ds_list to figure out which set the node belongs to. I use a ds_map for the closed set. If the node exists in the closed ds_map (ie. if closedSet[? nodeID] is not undefined), the node has already been checked and the loop can continue. Checking whether or not a value exists in a ds_map is very fast. Doing it this way will also make it easier to use the method in the previous point, since you won't have to manually reset the nodes afterwards. You can simply clear the ds_map instead.
  • I don't entirely understand what pf_getneighbors does, but that seems like it could slow down the algorithm considerably. If memory management isn't critical, I'd rather recommend keeping a list of available neighbours in the node structure itself.
Since our maps differ vastly, and we use different computers, this cannot be used for direct comparison, but it took the YYC 21 milliseconds to find this path:


When using the square of the distance, rather than the actual distance, for the heuristic, the pathfinding is sped up by a factor of four, and a similar but slightly more erratic path is found in 5 milliseconds:



You could post a benchmarking executable of your algorithm so that I could test how the speed of your implementation.

I'm planning to release my A* implementation soon, but for now, here's my pathfinding script in its entirety:
Code:
/// @description navmesh_find_path(navMesh, x0, y0, z0, x1, y1, z1, exact)
/// @param navMesh
/// @param x0
/// @param y0
/// @param z0
/// @param x1
/// @param y1
/// @param z1
/// @param exact
/*
    An implementation of the A* algorithm
    Returns an array containing the 3D positions of all the pathpoints, including the start and end positions.
 
    The start and end positions will be clamped to their respective nearest triangles.
    This assumes that the player can move in a straight line from the starting point to the first node,
    as well as in a straight line from the final node to the destination.
 
    Returns: Array if successful, -1 if not
 
    Script created by TheSnidr
    www.TheSnidr.com
*/
var startTime = current_time;
var navMesh = argument0;
var x0 = argument1;
var y0 = argument2;
var z0 = argument3;
var x1 = argument4;
var y1 = argument5;
var z1 = argument6;
var exact = argument7;

var nodeList = navMesh[eNavMesh.NodeList];

//////////////////////////////////////////////////////////////////////////////////////////////
//Clear necessary data structures
ds_priority_clear(global.NavMeshOpenSet);
ds_map_clear(global.NavMeshClosedSet);

//////////////////////////////////////////////////////////////////////////////////////////////
//Find starting and destination quads.
var startQuad    = navmesh_find_nearest_quad(navMesh, x0, y0, z0);
var endQuad        = navmesh_find_nearest_quad(navMesh, x1, y1, z1);
if (!is_array(startQuad) || !is_array(endQuad))
{
    show_debug_message("navmesh_find_path: Could not find start or end point. Returning -1.");
    return -1;
}

//////////////////////////////////////////////////////////////////////////////////////////////
//Clamp the start and end coordinates to their respective triangles
x0 = startQuad[4];
y0 = startQuad[5];
z0 = startQuad[6];
x1 = endQuad[4];
y1 = endQuad[5];
z1 = endQuad[6];

//////////////////////////////////////////////////////////////////////////////////////////////
//Check if the start and end triangles contain the same corner points. If they do, we can move in a straight line from start to finish.
var i, j;
for (i = 0; i < 4; i ++){
    for (j = 0; j < 4; j ++){
        if (startQuad[i] == endQuad[j]){break;} //Check if this node exists in both the start and end point
    }
    if (j == 4){break;} //If the node does not exist in both sets, break the loop early
}
if (i == 4){
    //show_debug_message("navmesh_find_path: Start and end points are in the same convex shape (triangle or quad), and we can move in a straight line.");
    return [x0, y0, z0, x1, y1, z1];}

//////////////////////////////////////////////////////////////////////////////////////////////
//Add the gscore and fscore of the nodes in the starting quad
var node, fScore;
var heuristic = 0;
for (i = 0; i < 4; i++)
{
    node = startQuad[i];
    heuristic = sqr(x1 - node[0]) + sqr(y1 - node[1]) + sqr(z1 - node[2]);
    node[@ eNavNode.CameFrom] = undefined;
    node[@ eNavNode.Gscore] = point_distance_3d(node[0], node[1], node[2], x0, y0, z0);
    fScore = (exact ? sqrt(heuristic) : heuristic);
    ds_priority_add(global.NavMeshOpenSet, node[eNavNode.ID], fScore);
}

//////////////////////////////////////////////////////////////////////////////////////////////
//Start looping through all reachable points
var current, nbArray, dtArray, nbNum, gscore, tent_gscore, old_gscore, not_in_open_set, ID, nbID, dx, dy, dz;
while (ds_priority_size(global.NavMeshOpenSet) > 0)
{
    //Find the node in the open set with the lowest f-score value
    ID = ds_priority_delete_min(global.NavMeshOpenSet);
    current = nodeList[| ID];
    if (current == endQuad[0] || current == endQuad[1] || current == endQuad[2] || current == endQuad[3]){break;}
    global.NavMeshClosedSet[? ID] = true;
 
    //Loop through the neighbours of the node
    gscore  = current[eNavNode.Gscore];
    nbArray = current[eNavNode.NbArray];
    dtArray = current[eNavNode.DtArray];
    nbNum = array_length_1d(nbArray);
    for (i = 0; i < nbNum; i ++)
    {
        node = nbArray[i];
        nbID = node[eNavNode.ID];
   
        //Continue the loop if the node is in the closed set
        if !is_undefined(global.NavMeshClosedSet[? nbID]){continue;}
   
        //Check whether or not this is a better path
        tent_gscore = gscore + dtArray[i];
        not_in_open_set = is_undefined(ds_priority_find_priority(global.NavMeshOpenSet, nbID));
        if !not_in_open_set and tent_gscore >= node[eNavNode.Gscore]{continue;} //This is not a better path
   
        //This path is the best until now. Record it!
        heuristic = sqr(x1 - node[0]) + sqr(y1 - node[1]) + sqr(z1 - node[2]);
        node[@ eNavNode.CameFrom] = current;
        node[@ eNavNode.Gscore] = tent_gscore;
        fScore = tent_gscore + (exact ? sqrt(heuristic) : heuristic);
        if not_in_open_set
        {
            ds_priority_add(global.NavMeshOpenSet, nbID, fScore);
        }
        else
        {
            ds_priority_change_priority(global.NavMeshOpenSet, nbID, fScore);
        }
    }
}

//////////////////////////////////////////////////////////////////////////////////////////////
//If the last node is not in the end triangle, we couldn't find any valid paths
if (current != endQuad[0] && current != endQuad[1] && current != endQuad[2] && current != endQuad[3])
{
    show_debug_message("navmesh_find_path: Could not find any valid paths, returning -1.");
    return -1;
}

//////////////////////////////////////////////////////////////////////////////////////////////
//Read all the path points from the ds_map backwards
var backwardspath = ds_list_create();
while !is_undefined(current)
{
    ds_list_add(backwardspath, current);
    current = current[eNavNode.CameFrom];
}
var num = ds_list_size(backwardspath);

//////////////////////////////////////////////////////////////////////////////////////////////
//Initialize the path point array
var size = num * 3 + 6;
var path = array_create(size);
path[0] = x0;
path[1] = y0;
path[2] = z0;
for (var i = 0; i < num; i ++)
{
    node = backwardspath[| num-i-1];;
    path[i * 3 + 3] = node[0];
    path[i * 3 + 4] = node[1];
    path[i * 3 + 5] = node[2];
}
path[size - 3] = x1;
path[size - 2] = y1;
path[size - 1] = z1;

//////////////////////////////////////////////////////////////////////////////////////////////
//Clean up
ds_list_destroy(backwardspath);

show_debug_message("navmesh_find_path: Found shortest path in " + string(current_time - startTime) + " milliseconds.");
return path;
It's made to work in 3D as well:
 
Last edited:

vdweller

Member
@TheSnidr

Interesting post! I have a potato PC with an Intel Quad core Q9550. I have uploaded my .exe here, let me know how much time (3rd line on the right of the screen, in microseconds) it takes you to go from the bottom-left of the map to the upper right.
Shift-Lclick to set start point, Shift-Rclick to set end point, press Space to run.

I improved my originally posted algorithm by finally removing the open list altogether, this made the algorithm run 10-15% faster. Without turning a blind eye to your observation about cleanup, so far upon benchmarking it's like 1/25 of the entire run time, so I was looking for other ways to shave off more time from other parts of the code. Also in my version neighbors are pre-calculated. pf_getneighbors() just returns the neighbor list, the reason it exists is in order to get any extra neighbors in the future (portals, nearby separate grids etc, as per the goals I set in my first post). Inlining its content has almost zero impact on speed.

EDIT: Simply using the "standard" node neighbors list, there is a 40% increase in speed. Perhaps "special" node neighbors like the ones I suggested above must be pre-inserted too.

My next presentation was going to be about HPA*. It's a simple yet effective way to speed up pathfinding on large maps. The only issue I have so far is that it takes a lot of time to reconstruct edge nodes when adding/removing an obstacle on an edge area. Then I got caught with Jump Point Search. I had mixed results with it. The out-of-the-box version was 2-10x slower than vanilla A*. With some preprocessing (the kind that is easy to recalculate online if needed) and taking into account Harabor et al. suggestion of intermediate node pruning, I managed to get it to run 2-10x faster than vanilla A*. Introducing weighted maps proved to be a bit problematic for JPS, in the sense that it kills any speed gains by having pre-identified points of interest on the map.

Ultimately, I suspect that I may have introduced several major bottlenecks in my code. First of all, nodes are actually lists. Each element corresponds to a different property (cost,x,y,neighbors list etc). I don't think that Game Maker likes frequent list lookups. I tried to only read from a list only when absolutely necessary, even used bit compression for Jump Point Search, but it looks like if at any given grid square one does more than a single lookup, speed goes down the drain. I am really not sure how to approach this. It is easy to make a quick prototype where things are laid out flat (you can do A* even without a "node" structure), but in a real complex game you got to have some sort of data hierarchy or else things get too complicated too soon.

EDIT2: I use boundary checks on my algorithm. They are useful in HPA* where each A* has to be run within certain bounds. Obviously, this adds some extra slowdown. Of course there could be a "boundless" hierarchical A* and a bounded one when necessary.
 
Last edited:

TheSnidr

Heavy metal viking dentist
GMC Elder
HPA* sounds interesting! Haven't heard of that method before, seems like it has the potential to speed up pathfinding drastically.

Running your project, the time for generating a path from lower left to upper right seems to vary wildly between 35 and 80 milliseconds. Pressing space repeatedly, the lowest I got was 28 milliseconds, which is starting to become nice.
I also did a test with my own pathfinding system using your map. It calculated this path in 34 milliseconds:


And when using the square of the distance, it calculated this path in 4 milliseconds:


I'm honestly a bit surprized that using the square of the distance is that much faster, but I guess it makes sense, since the overshooting of the f-score makes it so that you don't have to check that many alternative paths. Of course though, it most likely won't return the shortest path, just a valid path.
 

vdweller

Member
Thanks for checking it out! By immediately getting a node's neighbors list instead of repopulating a list (in case any "special" neighbors are found), and by treating nodes as arrays, my algorithm does that path consistently at 30 milliseconds on my PC, even with boundary checks. So I guess we both got it as good as it gets, heh.

Don't be surprised about the rootless heuristic: Any cost f = w + εh, where ε>1, will have a similar effect. You can try for example an euclidean direction heuristic multiplied by 100. So it's not avoiding the square root that speeds up the process but rather using a more "relaxed" heuristic. On modern computers square root isn't much of a big deal anyways. However on hierarchical approaches like HPA* I found that using a relaxed heuristic is largely useless: For short distances no significant difference has the time and space to kick in.

The problem is that I keep reading on forums like stackexchange that, for example, in C++ an A* algorithm only takes 25 milliseconds in the worst case on a 300x300 map... Granted, maybe the guy has a really fast cpu but even an online javascript example does an 150x150 map in something like 5 milliseconds.The test map I sent you is 96x96... I may have been looking at this the wrong way, but I have doubts if YYC is indeed serious business. There are obvious speed gains compared to VM but it still falls behind other languages by a lot.
 

vdweller

Member
I downloaded this C# project from codeproject.org by CastorTiu and ran it. It does this path
upload_2019-7-14_19-53-55.png
in 0,4 milliseconds.

Exactly the same map, in GameMaker, my code, with YYC:
upload_2019-7-14_19-54-49.png
Time: ~2,3 milliseconds

Almost 6 times slower. This is bad...

What did they guy do? Space magic? He kinda explains certain optimizations in his article but I already use some of them and others I can't understand or aren't fully explained (like, what's that "calculation grid" he's talking about?) Any opinions on this?
 

vdweller

Member
Sadly, I managed to verify my suspicions.

I wrote the same pathfinding algorithm in C++. Same map (256x256), same start and end points. GameMaker code super-optimized, all speed-critical data structures are arrays (which, in a real game scenario poses problems, as other data structures are even slower). All code is available on request.

On GameMaker Studio (YYC): 346 milliseconds

On C++: 12 milliseconds

As you can see, a C++ implementation wipes the floor with YYC. And while I could understand a 3-4x speed downscale from pure C++ to YYC, we are talking here about an 28x speed downscale.
 
I experimented a little bit with the YYC with my own implementation of my custom pathfinder, its not A* but returns the same path on normal mode and in optimized pre loop-unrolled mode its virtually the same path.

Here are 2 screenshots of the grid you tested in C#, at the top you will see the stats
it took, 94 microseconds or 0.09 milliseconds to generate the grid and 0.19 milliseconds to generate the path back total: 0.28 milliseconds
Optimized mode but path is SLIGHTLY different


Unoptimized same path as A* demo
0.12 milliseconds to generate and 0.18 for the path back, total: 0.3 milliseconds *note the optimization is in the generation of the grid, both versions use the same pathing back script


and finally where it really shines! multiple units, here is it path finding 2000 units
in this case the screen cap wasn't the most optimal but i made a rule when taking these screenshots to not cherry pick
0.43 for the grid generation and 75.17 milliseconds for the unit paths for a combined total of 75.6 milliseconds for all 2000.


I also took the liberty of duplicating the 100x100 test map with all its obsticals, here are the results!
Grid gen: 0.89 milliseconds, Path: 0.33 milliseconds for a total of 1.22 milliseconds :) same path achieved!



and with 1002 units
4.37 milliseconds for the grid and 147.97 for the pathing total: 152.34 milliseconds

Its not A* but if you guys are interested ill post the .gmz file here as well :)
Demo exe: https://mega.nz/file/Qd0wjQJL#bv4knQe4I4zIgB8N9qK4Sm5yfJVAOmORUfrnw-qPv-U

Controls:
PRESS SHIFT while placing goal to use the optimized speed mode

Left-Click: Set goal, (SHIFT)+ Left-Click - Set goal OPTIMIZED
Right-Click: Add Unit, CTRL: Show Numbers Cell
Scroll: Zoom, Middle-Mouse: Move View
Enter: Show unit count
Space: Follow Path
 
Last edited:

vdweller

Member
I experimented a little bit with the YYC with my own implementation of my custom pathfinder, its not A* but returns the same path on normal mode and in optimized pre loop-unrolled mode its virtually the same path.

Here are 2 screenshots of the grid you tested in C#, at the top you will see the stats
it took, 94 microseconds or 0.09 milliseconds to generate the grid and 0.19 milliseconds to generate the path back total: 0.28 milliseconds
Optimized mode but path is SLIGHTLY different


Unoptimized same path as A* demo
0.12 milliseconds to generate and 0.18 for the path back, total: 0.3 milliseconds *note the optimization is in the generation of the grid, both versions use the same pathing back script


and finally where it really shines! multiple units, here is it path finding 2000 units
in this case the screen cap wasn't the most optimal but i made a rule when taking these screenshots to not cherry pick
0.43 for the grid generation and 75.17 milliseconds for the unit paths for a combined total of 75.6 milliseconds for all 2000.


I also took the liberty of duplicating the 100x100 test map with all its obsticals, here are the results!
Grid gen: 0.89 milliseconds, Path: 0.33 milliseconds for a total of 1.22 milliseconds :) same path achieved!



and with 1002 units
4.37 milliseconds for the grid and 147.97 for the pathing total: 152.34 milliseconds

Its not A* but if you guys are interested ill post the .gmz file here as well :)
Demo exe: https://mega.nz/file/Qd0wjQJL#bv4knQe4I4zIgB8N9qK4Sm5yfJVAOmORUfrnw-qPv-U

Controls:
PRESS SHIFT while placing goal to use the optimized speed mode

Left-Click: Set goal, (SHIFT)+ Left-Click - Set goal OPTIMIZED
Right-Click: Add Unit, CTRL: Show Numbers Cell
Scroll: Zoom, Middle-Mouse: Move View
Enter: Show unit count
Space: Follow Path
Hi, it would be good to post some code as well so that we have a general idea on how your pathfinding method works.
 
Last edited:
Hi, it would be good to post some code as well so that we have a general idea on how your pathfinding method works.
Well here you all go :), Version 1, Ill be releasing more updates to this on here before releasing it as a free asset pack.
any ideas to improve anything in terms of speed let me know :) trying to make it as fast as possible before release for the general community.
Ive noticed some improvements i can make with the pathing back to speed things up and yesterday also worked on an Early exit feature for the grid flood fill
It currently includes all scripts including testing and failed experiments tests. Built for GMS 1.4 by the way.

The optimized flood fill mode is currently still not 100% and has some cases where it cannot flood to a part of a winding map. Ill fix this soon

Edit*
controls: just realized I forgot to update the messagebox at the beginning that gives you the controls soooo
pressing 0 will hide the text that tells you what each number does to effect pathfinding
pressing 1 will enable the optimized flood fill mode for the next placed goto command, its usually about 30% faster then the regular flood fill, but is ultimately broken untill its fixed.
pressing 2 will enable the early path exit mode, this takes all the units on the screen and will do the regular flood fill but stop when all units have been reached. when units are close by this really speeds up
the flood fill process, however with alot of units and if they are placed far away from the goal the overhead of checking maybe more expensive then just a brute force flood.

pressing 3 will enable the optimized pathing I began experimenting, rather then having the next point in the path generated at a fixed switch case, statistically a unit moving in one direction will tend to move
in that direction for a while, moving the script arround so that the switch case statment will be more likely to be near the first checked.

having 1,2 and 3 off enables normal mode with not early exits and the standard path generator.


Pressing CTRL will show you the cell numbers
Pressing Enter will display the number of units in the level
Pressing and holding space will make the units follow the path

Zooming and moving is done by the middle mouse button and scrolling
Holding left click will recalculate the whole path process as fast as possible over and over
Holding Right Click will add a unit once per frame.


After 350 units, the flood fill gets converted into a flowfield, this process has not been optimized much yet and is next on my list and i have a few sneaky ideas to speed that part up.
this step is a bit of a trade off, it has to loop over the entire world and check every cells neighbours to find what direction the next cell will be to the goal. doing this allows each unit to only have to check one cell
per path generation step instead of 8, so it becomes a balancing act to work out exactly when which mode should be used to maximize performance.

anyway guys alot of info here, have a little experiment.



Please let me know what times you end up pulling in YYC mode :)


Enjoy!
 
Last edited:

vdweller

Member
Well here you all go :), Version 1, Ill be releasing more updates to this on here before releasing it as a free asset pack.
any ideas to improve anything in terms of speed let me know :) trying to make it as fast as possible before release for the general community.
Ive noticed some improvements i can make with the pathing back to speed things up and yesterday also worked on an Early exit feature for the grid flood fill
It currently includes all scripts including testing and failed experiments tests. Built for GMS 1.4 by the way.

The optimized flood fill mode is currently still not 100% and has some cases where it cannot flood to a part of a winding map. Ill fix this soon

Edit*
controls: just realized I forgot to update the messagebox at the beginning that gives you the controls soooo
pressing 0 will hide the text that tells you what each number does to effect pathfinding
pressing 1 will enable the optimized flood fill mode for the next placed goto command, its usually about 30% faster then the regular flood fill, but is ultimately broken untill its fixed.
pressing 2 will enable the early path exit mode, this takes all the units on the screen and will do the regular flood fill but stop when all units have been reached. when units are close by this really speeds up
the flood fill process, however with alot of units and if they are placed far away from the goal the overhead of checking maybe more expensive then just a brute force flood.

pressing 3 will enable the optimized pathing I began experimenting, rather then having the next point in the path generated at a fixed switch case, statistically a unit moving in one direction will tend to move
in that direction for a while, moving the script arround so that the switch case statment will be more likely to be near the first checked.

having 1,2 and 3 off enables normal mode with not early exits and the standard path generator.


Pressing CTRL will show you the cell numbers
Pressing Enter will display the number of units in the level
Pressing and holding space will make the units follow the path

Zooming and moving is done by the middle mouse button and scrolling
Holding left click will recalculate the whole path process as fast as possible over and over
Holding Right Click will add a unit once per frame.


After 350 units, the flood fill gets converted into a flowfield, this process has not been optimized much yet and is next on my list and i have a few sneaky ideas to speed that part up.
this step is a bit of a trade off, it has to loop over the entire world and check every cells neighbours to find what direction the next cell will be to the goal. doing this allows each unit to only have to check one cell
per path generation step instead of 8, so it becomes a balancing act to work out exactly when which mode should be used to maximize performance.

anyway guys alot of info here, have a little experiment.



Please let me know what times you end up pulling in YYC mode :)


Enjoy!
Well bro I took a look at your project and I couldn't get much out of it haha. There are many esoteric names like changes2 (a grid) and some variables are instance variables and I am not sure they have to be. It appears to be some sort of flow field algorithm. Although I doubt I will have any personal use for it, I think it would be interesting if you elaborated a bit on the general principles behind it. Like, maybe do a point-to-point description on how each component works, starting from the basic, vanilla, unoptimized version or maybe throw some pseudocode or an explanatory image here and there :) Also if your work is based on a paper or some other source, you could add a link too! But then if you don't have time for all that (like I don't have the time to trace and decipher it), I totally understand!
 
Well bro I took a look at your project and I couldn't get much out of it haha. There are many esoteric names like changes2 (a grid) and some variables are instance variables and I am not sure they have to be. It appears to be some sort of flow field algorithm. Although I doubt I will have any personal use for it, I think it would be interesting if you elaborated a bit on the general principles behind it. Like, maybe do a point-to-point description on how each component works, starting from the basic, vanilla, unoptimized version or maybe throw some pseudocode or an explanatory image here and there :) Also if your work is based on a paper or some other source, you could add a link too! But then if you don't have time for all that (like I don't have the time to trace and decipher it), I totally understand!
As I said, I wanted to finish it entirely before I released it changing any crazy variable names that were for testing, but I have alot of code to write for my actual job tomorrow so I wont have a chance to do anything with it the next few days. I tried to make as many variables as I could local rather then instance, however due to the nature of scripts that wasn't possible without having extra arguments or using global's, and after testing various variable scopes that was by far the fastest combo ive come up with, this also involved looking at the YYC compilled C++ code to work out what operations were using the least amount of processing power.

The code needs to be commented and I will write a white paper or some info on how to use it and explaining how each step works, however I wanted you all to have a look since you expressed interest and I cant work on it for a few days. It seems unreadable as almost every division and multiplication has been simplified to bitshifts and stripped down as fast as I could make it, every addition and subtraction was timed and multiple methods were tested to find the fastest method of doing said mathematical operation. the works is based on my time making solutions to problems while ive worked at my job as a C# developer. 5 months ago my friend luke and I spend alot of time writing our own competing path finders and we shared multiple ideas for it, he built his in C++ and mine in YYC GML, he ultimately won by x2.2 speed increase but he said from the beginning if i can get within x10 speedI should be declared the winner. So im happy with the speeds.

I am writing up part of the white paper in word at the moment with explainations and cell expansion diagrams included :)

If I may ask what speeds did you get for the path on the 100x100, the same one you did in your orriginal post? try it with the early exit mode (option 2) and let me know just curious, it will be the Path flood milliseconds + the Units milliseconds :), even if i place the the goal close to the unit like 6 cells away with no obsticals, it still outperforms A* "Using early exit mode", which is where A* should beat a flood fill as it litterally paths straight to it rather then flooding evenly like mine does. 0.07 milliseconds to 0.13



If you dont want to ever use it thats fine but may i ask why you think it wont be suitable? in normal mode or early exit mode it will always return the most optimal path, and it used by just calling one generation script with a target destination, and then simply calling the get path script on what ever unit you want to go there?

let me know any uncertanties you have about it and ill be happy to explain how any parts of it work.

For the record the changes grid is a list of new cells to be expanded, Much faster then appending to a new list. :)
once the white paper explains it most of it will make sense, The code could be "Tidied up" but alot of the way its built is built that way on purpose, especially on the grid flood fill and pathing back scripts.
 
Last edited:

vdweller

Member
OK from what I've gathered you have implemented a version of Goal-Based Vector Field Pathfinding (VFP). On my potato PC it takes 4-5 ms to fill the grid and on my Final Boss PC it takes about 2 ms. Note I'm using GMS2.

VFP is used mainly in RTS games and it is actually a quite good choice for moving lots of units. Every pathfinding algorithm has its own strengths and weaknesses and the more extra features you try to include (ie things not initially considered in the introductory implementation of the algorithm), the more it loses its edge over others.

For example, Jump Point Search (JPS) is a skip-ahead A*, where it prunes nodes in the Open list based on certain points of interest around obstacles. JPS can, and will blow A* out of the water in most scenarios *. However, JPS was initially conceived for a weightless grid... While it is possible to expand its functionality for a weighted grid (essentially you also treat any terrain change as yet another point of interest), one may find that on maps with frequent terrain changes JPS loses its power very, very quickly.

The reason VFP is used so commonly in RTS games is that in these games you usually have a uniform map representation (grid/hexagons), the obstacle grid is usually static and also terrain cost is often uniform. I am not so sure how well VFP scales when modified to work on a dynamic, weighted terrain, and since it can potentially fill the entire map, it may scale even worse on really huge maps compared to other solutions. So the claim that it is "faster than A*" in general may not be entirely true for all scenarios. And in scenarios like the ones highlighted in my first post, like multiple grids or teleporters, VFP is probably not capable of delivering at all, but a region-based Dijkstra for example can and will deliver.

* Did you know? In GML, a vanilla JPS approach can be slower than a vanilla A* approach! The reason behind this is that, in the usual programming languages, it is cheaper to look at a bunch of grid values than maintain a huge-ass priority list and this is why JPS is faster. However, due to the way things are implemented in the YYC, looking for example at 1000 grid values can be slower than maintaining a ds_priority with 1000 values, since the latter is written in...erm...more immediate C++ compared to C++ - converted GML code. Really classy, YoYo.
 
i have since made a new final release version, this is simplified down and removes the buggy optimized version. it is all script based and is called really simply, not sure if i could make it any more simple now :)

It also handles obstical movement as each obstical is removed and readded each time the flood fill is done :)
wrappering it all into scripts probably loses a few microseconds here and there but i think the ease of use payoff is worth it., I might also try and implement teleporters and other things into it as features for v2

it also has early exits so it will only fill the map in worse case scenarios :)

 
Top