
Multithreading / Extension Support in GMS

xenoargh

Member
This is, unfortunately, going to be pretty vague / open-ended, so I'm apologizing in advance.

For one of my previous projects, in Java, I utilized a multithreaded approach to certain drawing and AI functions. My game's currently running, stable and largely stays in the 300-500 FPS range right now, with most of the major features in place, but there are occasional heavy events that require serious horsepower, and I'd like to explore what can be multithreaded or pushed to pure C#. So, here are some newbie Q's. Again, sorry if these seem woefully ignorant.

1. My limited understanding is that GMS can be extended via DLLs or their OS-specific equivalent. These would (typically) be built in <language of choice> in something like Visual Studio, then used as Extensions in GMS. There isn't much documentation about this, other than the basics, and so I have some serious questions about the utility of doing these things.

For example, one of the things I'd really like to do is accelerate some code that deals with large collections of points. I don't see anything in the documentation suggesting that I can send complex data structures (ds_lists, for example) via a call to an Extension. How is this typically worked around?

2. My previous experience with multithreading was via the Lombok library for Java. If using C# instead, does anybody recommend a library that provides (relatively) simple access to thread-safe methods, locking, etc.?

3. If passing multithreaded code back to GMS, what happens if it can't return during the current frame? Will GMS's main frame be halted awaiting results? Will it crash? Be garbage-collected?

4. If I just want to squeeze a little more optimization out of my game for certain math-heavy operations, are there other ways I should explore this, like writing Extension code to handle very specific issues?

5. What about memory management? GMS is already a bit alarming in how it handles certain data types; with Extensions, this seems like it might be one of the big challenges. Is it typical, for example, in developing Extensions for GMS, to set up memory allocations via statics at runtime, so that there aren't leaks, which means variables have to be scoped carefully and overflows of, say, arguments consisting of many floating-point numbers will cause a crash? Do I have to be wary about creating temporary variables, beyond the usual care one requires with any language?

Anyhow, sorry for the vagueness. I'm staring at GMS's documentation right now, and the limit to "4 arguments, unless they're all the same type", "don't send data structures", "numbers are always doubles", etc. is all a little confusing / intimidating, and there's pretty much zero information in the documentation about dealing with thread safety in state machines, etc.

Here's a typical piece of code I'd like to explore accelerating, because it's so inherently slow in GMS:
GML:
            if(isCircle == false){
                var leftNum, rightNum;
                for(var i = 0; i < aura_shadow_points; i++){
                    a_tx[i] = ((x + aura_shadow_x[i])  - a_ox) + a_rad;
                    a_ty[i] = ((y + aura_shadow_y[i])  - a_oy) + a_rad;
                    a_dir = point_direction(a_tx[i], a_ty[i], a_rad, a_rad) + 180;
                    if(i == 0){
                        rightX = a_tx[i];
                        rightY = a_ty[i];
                        leftX = a_tx[i];
                        leftY = a_ty[i];                                
                        rightAngle = a_dir;
                        leftAngle = a_dir;
                        leftNum = 0;
                        rightNum = 0;
                    } else {
                        if(angle_difference(a_dir,rightAngle) > 0){
                            rightX = a_tx[i];
                            rightY = a_ty[i];    
                            rightAngle = a_dir;
                            rightNum = i;
                        }
                        if(angle_difference(a_dir,leftAngle) < 0){
                            leftX = a_tx[i];
                            leftY = a_ty[i];    
                            leftAngle = a_dir;
                            leftNum = i;
                        }                                
                    }
                }
            } else {
                var offX = (x - a_ox) + a_rad -1.0;
                var offY = (y - a_oy) + a_rad -1.0;
                var centerAngle = point_direction(offX,offY,a_rad,a_rad)+180;
                var rightPAng = centerAngle + 90;
                var leftPAng = centerAngle + 270;
                var size = (abs(sprite_width)*0.5)-1;
                rightX = (offX) + lengthdir_x(size,rightPAng);
                leftX = (offX) + lengthdir_x(size,leftPAng);
                rightY = (offY) + lengthdir_y(size,rightPAng);
                leftY = (offY) + lengthdir_y(size,leftPAng);    
                rightAngle = point_direction(rightX,rightY,a_rad,a_rad)+180;
                leftAngle = point_direction(leftX,leftY,a_rad,a_rad)+180;                            
            }
            //Now that we have the furthest-right and furthest-left, let's draw.
            //First, we need to get the far-away points on the triangles.
            rightFarPointX = rightX + lengthdir_x(a_rad_big,rightAngle);
            rightFarPointY = rightY + lengthdir_y(a_rad_big,rightAngle);
            leftFarPointX = leftX + lengthdir_x(a_rad_big,leftAngle);
            leftFarPointY = leftY + lengthdir_y(a_rad_big,leftAngle);
        
            if(!canMoveEver){
                ds_map_add(myRightX,obstacleID,rightX);
                ds_map_add(myRightY,obstacleID,rightY);
                ds_map_add(myRightFarPointX,obstacleID,rightFarPointX);
                ds_map_add(myRightFarPointY,obstacleID,rightFarPointY);            
                ds_map_add(myLeftX,obstacleID,leftX);
                ds_map_add(myLeftY,obstacleID,leftY);
                ds_map_add(myLeftFarPointX,obstacleID,leftFarPointX);
                ds_map_add(myLeftFarPointY,obstacleID,leftFarPointY);        
            }
None of the steps are particularly slow, but the processing required is quite heavy when we're talking about many Instances being evaluated; there's a lot of trig involved (and a fair amount before we get to this code, doing distance checks and so forth).

However, I'd have to send all the data being acted upon as doubles, then send it back (and it's not entirely clear how one returns data; can I return an array of doubles, or am I limited to exactly one double, or a string I'd have to laboriously convert back into an array?). If I'm limited to a return value of one double, there's really no point in any of this being in an extension, of course. Honestly, the more I look at this, the less pleased I am; what use is something that only returns one value, and, if not GML, cannot call GM internal functions directly? Or is there some way to do that, and documentation of the functions somewhere?
 

Gaijin

Guest
1) Via buffers.
2) No idea, never used C++ before.
3) Crash.
4) Yes, but just be sure to benchmark the differences. Small short repeated calls will tend to end up slower.
5) Usual language specific memory management rules apply.

If I'm limited to a return value of one double, there's really no point in any of this being in an extension, of course. Honestly, the more I look at this, the less pleased I am; what use is something that only returns one value, and, if not GML, cannot call GM internal functions directly? Or is there some way to do that, and documentation of the functions somewhere?
Buffers are what you want (ty_string). ty_string is actually a pointer, so it doesn't have to refer to a string at all. Its name is a little misleading.
 

Padouk

Member
Mmmmmmmmmmmmmmmaaaaan, your question is vague. I love it! It gives room for long answers and I'm bored right now.

--

I read your loop and your overall text a couple of times and I've reached a conclusion: you are trying to run before you can even crawl.
I feel your frustration. I've been there, and a lot of other hardcore programmers have been there as well.
YYG is making a great effort to hide the complexity of creating games while still providing some very interesting functionality and portability across devices.

In order to do that, they have to wrap and work around way more issues than you'd expect.
Extensions are an open door for adding what they forgot, more than for improving where they failed.

--

I read your loop a couple of times... My short answer is that you won't gain much by replicating that same code inside an extension. Performance will be pretty much the same.
Unless you are going for experimental purposes like wrapping some Intel Performance Primitives or doing some OpenCV image processing, you won't see much benefit from just computing points in native C++ as opposed to GML.

I'd encourage you to learn about the impact of conditional branching (if statements) in loops, and about how games typically use Shaders and Matrices.

For massive point processing, I'd refer you to Shaders and the Matrix API: https://docs2.yoyogames.com/source/_build/3_scripting/4_gml_reference/matrices/index.html

--

Explore what can be multithreaded or pushed to pure C#
I'm assuming you mean C++?
DLLs produced with C# are "managed DLLs" as opposed to "native DLLs". You won't like the experience of mapping C#.
Then again, maybe you really meant C# and you really intend to use something like Xamarin to transpile it into a native DLL. That's a 2.5G library offered with Visual Studio Pro+.

For example, one of the things I'd really like to do is accelerate some code that deals with large collections of points.
Again, pointing you to Shaders and Matrices

I don't see anything in the documentation suggesting that I can send complex data structures (ds_lists, for example) via a call to an Extension. How is this typically worked around?
For that, use buffers.
Your extensions can only receive two types: double or string. The "string" is polymorphic and can be used to receive either a GML string or a buffer_get_address(bid).

From inside your extension you can only call back into GMS using an "event" and a special type of ds_map. (For more info on that: https://help.yoyogames.com/hc/en-us...es-From-An-Extension-Asynchronously-GMS-v1-3-)

GML:
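// `points` is a placeholder array of structs/instances that have x and y.
bid = buffer_create(1024*1024, buffer_fixed, 1);
for (var i = 0; i < array_length(points); i++) {
  var p = points[i];
  buffer_write(bid, buffer_f32, p.x);
  buffer_write(bid, buffer_f32, p.y);
}
// Pass the raw address plus the number of bytes written so far.
your_cool_extension(buffer_get_address(bid), buffer_tell(bid));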


And on the c++ side:
Code:
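extern "C" __declspec(dllexport) double your_cool_extension(void* buffer, double buffer_size) {
  //Pretty sure you know what to do here.
  return 1.0; //exported functions hand a double (or a string) back to GMS
}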
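To make the "you know what to do here" part a bit more concrete, here is a minimal sketch of reading that buffer on the native side. It assumes the layout from the GML snippet above (two buffer_f32 values per point, tightly packed). The names `Bounds` and `bounds_of_buffer` are made up for illustration; a real extension would wrap this in an exported function returning a double.

```cpp
#include <cassert>
#include <cstddef>

// Result type for this sketch only (hypothetical name).
struct Bounds {
    float minX, minY, maxX, maxY;
};

// Walks the raw bytes handed over by GMS and returns the bounding box of
// the points stored in them. Assumes interleaved x, y floats.
Bounds bounds_of_buffer(const void* buffer, double byteCount) {
    assert(buffer != nullptr);
    assert(byteCount >= 2 * sizeof(float));  // at least one point

    const float* f = static_cast<const float*>(buffer);
    const std::size_t count =
        static_cast<std::size_t>(byteCount) / (2 * sizeof(float));

    Bounds b{f[0], f[1], f[0], f[1]};
    for (std::size_t i = 1; i < count; ++i) {
        const float x = f[2 * i];
        const float y = f[2 * i + 1];
        if (x < b.minX) b.minX = x;
        if (x > b.maxX) b.maxX = x;
        if (y < b.minY) b.minY = y;
        if (y > b.maxY) b.maxY = y;
    }
    return b;
}
```

Reading plain floats rather than casting to a struct keeps the sketch independent of any struct padding rules.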

2. My previous experience with multithreading was via the Lombok library for Java. If using C# instead, does anybody recommend a library that provides (relatively) simple access to thread-safe methods, locking, etc.?
C# comes with locks, concurrent structures, task scheduling, thread pooling... it's all built into the base framework, or available and maintained by Microsoft if you're on .NET Core.

Depending on your real needs, you shouldn't require any NuGet packages outside of what Microsoft offers.
If your intent is to host a C# server and call it from GMS, then yeah, you won't need much more than the core components.
If your intent is to transcompile your C# into native code, you will need the 2.5G worth of Xamarin setup provided with Visual Studio Pro+.

Here are a few keywords to help you better scope your next question:
Code:
static object _somelock = new object();

//Locks (a monitor / mutex rather than a counting semaphore)
void somefunction() {
  lock(_somelock) {
    //This code is protected by the lock; only one thread at a time can execute it.
  }
}

//Thread Safe Collections
ConcurrentDictionary<string, CoolClass> threadSafeDictionary = new ConcurrentDictionary<string, CoolClass>();
threadSafeDictionary.TryAdd(key, value); //This call is thread safe.


//Parallel Linq: https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/introduction-to-plinq
List<CoolClass> yourList = new List<CoolClass>();
yourList.AsParallel().Where(p => p.IsCoolEnough).Max(p => p.Value); //The filter runs in parallel; Max is combined across threads (map-reduce style) if necessary.



//For Long running tasks (Thread Pooled Calls): https://docs.microsoft.com/en-us/dotnet/api/system.threading.threadpool?view=net-5.0
ThreadPool.QueueUserWorkItem(ThreadProc);



//For asynchronous tasks... man, the list goes on: have a look at https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/
3. If passing multithreaded code back to GMS, what happens if it can't return during the current frame? Will GMS's main frame be halted awaiting results? Will it crash? Be garbage-collected?
You can't really. GMS is shielding you from mistakes like that.
GMS's main thread (game loop) will call your extension and wait until your function call completes, locking the main thread for that duration.
If at that point you fork or start a thread, your only way back into GMS is through an "async event": your thread queues a ds_map for GMS to consume.
Those queues are processed by GMS's main thread in due time, which avoids any such concurrency issues.
Of course, this assumes you are not hacking the provided code in any way to bypass the safeguards they put in place...


Note: this mechanism is more suitable for async calls than for parallel processing. I personally only see its value when some I/O is involved, like file opening or network access... anything not CPU-bound.

The overhead of copying data from the various GMS structures into a buffer and down into the extension is usually hard to make up for with any kind of parallel processing.
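For what the "return immediately, collect the result later" shape looks like on the native side, here is a rough sketch in plain C++. It deliberately does not use GMS's async event API; `poll()` stands in for whatever hands the result back (for instance an exported function the game calls once per step). The class name `BackgroundSum` and the summing workload are placeholders.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// One background job at a time: start() returns right away, poll() never blocks
// while the job is still running.
class BackgroundSum {
public:
    ~BackgroundSum() {
        if (worker_.joinable()) worker_.join();
    }

    // Copies the input so the caller's data can change while the job runs.
    void start(const std::vector<double>& values) {
        assert(!worker_.joinable() && "one job at a time in this sketch");
        {
            std::lock_guard<std::mutex> guard(mutex_);
            done_ = false;
        }
        worker_ = std::thread([this, values] {
            double sum = 0.0;
            for (double v : values) sum += v;
            std::lock_guard<std::mutex> guard(mutex_);
            result_ = sum;
            done_ = true;
        });
    }

    // Returns false while the job is running; true (and fills `out`) once done.
    bool poll(double& out) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (!done_) return false;
            out = result_;
        }
        if (worker_.joinable()) worker_.join();  // work is finished, reclaim the thread
        return true;
    }

private:
    std::thread worker_;
    std::mutex mutex_;
    double result_ = 0.0;
    bool done_ = false;
};
```

Note that the input is copied before the thread starts, which is exactly the copying overhead mentioned above; for small workloads that copy can cost more than the work itself.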



4. If I just want to squeeze a little more optimization out of my game for certain math-heavy operations, are there other ways I should explore this, like writing Extension code to handle very specific issues?
Define math-heavy operations? I mean... atan2's performance will be pretty close in GML and in C++... the extension overhead might actually slow you down.
If you want to apply "THE SAME" operation to "MANY POINTS", forget about extensions... you can't really beat the hardware, and GML already provides Shaders and Matrices.

5. What about memory management? GMS is already a bit alarming in how it handles certain data types; with Extensions, this seems like it might be one of the big challenges. Is it typical, for example, in developing Extensions for GMS, to set up memory allocations via statics at runtime, so that there aren't leaks, which means variables have to be scoped carefully and overflows of, say, arguments consisting of many floating-point numbers will cause a crash? Do I have to be wary about creating temporary variables, beyond the usual care one requires with any language?
Games usually run in a closed loop, meaning it's easy to estimate how much memory your extension will need right from the beginning.

I like to use the rule of thumb that no memory should be allocated by the extension without an explicit request from GMS.
For that, I usually add a _create and a _destroy function binding, responsible for allocating or freeing memory when requested by the game.

I also like to manage only one extern (what you refer to as static)... that's a me thing, but I usually create a C++ class and use a single pointer to it for all memory management.

Code:
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x; float y; };  //two buffer_f32 values per point

class MyExtensionContext {
public:

  void action(Point* points, int count);  //define this with your actual processing

private:
  Point fixedWorldState[256];  //I prefer to work with predefined limits
  std::vector<Point> relaxedWorldState; //Some people prefer to have them more flexible
};

//This is that single static memory managed item
//(a plain global; "extern" with an initializer only earns you a compiler warning)
MyExtensionContext* p_staticContext = NULL;

//assert() takes a single expression, so the message is attached with &&
extern "C" __declspec(dllexport) double your_coolextension_create() {
  assert(p_staticContext == NULL && "Singleton context already initialized. Don't call your_coolextension_create more than once");
  p_staticContext = new MyExtensionContext();

  return 1.0; //success
}

extern "C" __declspec(dllexport) double your_coolextension_action(void* buffer, double buffer_size) {
  assert(0 < buffer_size && buffer_size < 0xFFFFFFFF && "You will start to have floating points precision issues after that depending on the platform");
  assert(p_staticContext != NULL && "You forgot to call your_coolextension_create");

  Point* points = reinterpret_cast<Point*>(buffer);

  int count = static_cast<int>(buffer_size / sizeof(Point));

  assert(count * sizeof(Point) == buffer_size && "Incomplete buffer.. dude don't try to fool me that's not what I was expecting");

  p_staticContext->action(points, count);

  return 1.0;
}

extern "C" __declspec(dllexport) double your_coolextension_destroy() {
  assert(p_staticContext != NULL && "Don't call your_coolextension_destroy more often than your_coolextension_create");

  delete p_staticContext;
  p_staticContext = NULL;

  return 1.0; //success
}
Anyhow, sorry for the vagueness. I'm staring at GMS's documentation right now, and the limit to "4 arguments, unless they're all the same type", "don't send data structures", "numbers are always doubles", etc. is all a little confusing / intimidating, and there's pretty much zero information in the documentation about dealing with thread safety in state machines, etc.
No more than 4 arguments: That's a true statement, and to me that's more than enough!
Only double and string: That's a sad story... most of us would like more.
Numbers are always doubles: Yeah! I would even say they are not "true doubles". I personally don't trust any value over 32 bits anymore. That might be due to me mixing JavaScript and C++. I sometimes forget where the real limit is...
There's pretty much zero information: Take a good breath, you are shooting everywhere! Crawl, walk, run, bud... Make a simple extension based on this one: https://help.yoyogames.com/hc/en-us...es-From-An-Extension-Asynchronously-GMS-v1-3- then come back to the forum, that's where the real documentation is... 'Cause yeah, YYG is hiding the complexity, so the complex stuff is not so well documented.
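On the "where is the real limit" point: assuming the value reaches the extension as a standard IEEE 754 C++ double (53-bit significand), whole numbers are exact up to 2^53, so anything that fits in 32 bits is safe. A small sketch to check this; both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `value` converts to double and back without changing.
bool survives_double_round_trip(std::uint64_t value) {
    const double asDouble = static_cast<double>(value);
    // 2^64 does not fit in uint64_t, so reject it before converting back.
    if (asDouble >= 18446744073709551616.0) {
        return false;
    }
    return static_cast<std::uint64_t>(asDouble) == value;
}

// Hypothetical guard for a size or count passed from GML as a double.
std::uint64_t checked_size(double size) {
    assert(size >= 0.0 && size <= 9007199254740992.0 &&
           "outside the range where doubles hold whole numbers exactly (2^53)");
    return static_cast<std::uint64_t>(size);
}
```

With these, `survives_double_round_trip(0xFFFFFFFF)` holds, and the first whole number that fails is 2^53 + 1.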

Here's a typical piece of code I'd like to explore accelerating, because it's so inherently slow in GMS:
Have a look at physics fixtures. You might find some interesting stuff.
Have a look at Matrices (if you haven't already)
And have a look at buffers (if you haven't already)
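If you still want to benchmark a native version against the GML loop before deciding, the non-circle branch could be sketched roughly like this. Big caveats: `point_direction_deg` and `angle_difference_deg` are my approximations of the GML built-ins (degrees, y axis pointing down, signed difference in [-180, 180)), not the engine's own code, and the points are assumed to be already offset into the light's local space and packed as interleaved floats.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for GML's point_direction: degrees in [0, 360), y axis pointing down.
static double point_direction_deg(double x1, double y1, double x2, double y2) {
    const double kRadToDeg = 180.0 / 3.14159265358979323846;
    const double deg = std::atan2(y1 - y2, x2 - x1) * kRadToDeg;
    return deg < 0.0 ? deg + 360.0 : deg;
}

// Stand-in for GML's angle_difference: signed shortest difference in [-180, 180).
static double angle_difference_deg(double a, double b) {
    const double d = std::fmod(a - b, 360.0);
    return std::fmod(d + 540.0, 360.0) - 180.0;
}

struct Extremes {
    int rightIndex;
    int leftIndex;
    double rightAngle;
    double leftAngle;
};

// xy holds `count` interleaved (x, y) pairs; (cx, cy) is the light centre
// (a_rad, a_rad in the GML loop).
Extremes find_extremes(const float* xy, int count, double cx, double cy) {
    assert(xy != nullptr && count > 0);
    Extremes e{0, 0, 0.0, 0.0};
    for (int i = 0; i < count; ++i) {
        const double dir =
            point_direction_deg(xy[2 * i], xy[2 * i + 1], cx, cy) + 180.0;
        if (i == 0) {
            e.rightAngle = dir;
            e.leftAngle = dir;
            continue;
        }
        if (angle_difference_deg(dir, e.rightAngle) > 0.0) {
            e.rightAngle = dir;
            e.rightIndex = i;
        }
        if (angle_difference_deg(dir, e.leftAngle) < 0.0) {
            e.leftAngle = dir;
            e.leftIndex = i;
        }
    }
    return e;
}
```

Time this against the GML version with realistic point counts, including the cost of filling the buffer, before committing to it.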

However, I'd have to send all the data being acted upon as doubles, then send it back (and it's not entirely clear how one returns data; can I return an array of doubles, or am I limited to exactly one double, or a string I'd have to laboriously convert back into an array?).
You can create an "event" which asynchronously returns a ds_map... I'd recommend keeping it small.
You can also modify the buffer received by your function "in place", so there's no need to return anything else.
Try it, you'll see.

GML:
var bid = buffer_create(4096, buffer_fixed, 1);
buffer_write(bid, buffer_f32, 100);
buffer_write(bid, buffer_f32, 100);
buffer_write(bid, buffer_f32, 100);
buffer_write(bid, buffer_f32, 100);
buffer_seek(bid, buffer_seek_start, 0);
your_coolextension_action(buffer_get_address(bid), 4096);


var p0_x = buffer_read(bid, buffer_f32);
var p0_y = buffer_read(bid, buffer_f32);
var p1_x = buffer_read(bid, buffer_f32);
var p1_y = buffer_read(bid, buffer_f32);
Code:
extern "C" __declspec(dllexport) double your_coolextension_action(void* buffer, double buffer_size) {

  Point* points = reinterpret_cast<Point*>(buffer);
  points[0].x += 8;
  points[1].y += 9;

  return 1.0;
}
 

xenoargh

Member
First off, thanks to both of you re: Buffers; I thought when they said, "String" in the documentation, they meant just that. I don't think I've used a Buffer anywhere, because I didn't see what I'd use it for, yet. So, write to a Buffer via a Shader, eh? That's an interesting idea.

I'm just kind of looking at options to give myself more headroom, honestly, which is probably pointless; it's already doing most of the visual tricks I need it to, and it's running fast, in most circumstances; I just always want more speed, especially for stuff that gets crunchy. Probably a fool's errand until I'm absolutely sure I know where my hot-spots are and know I'm no longer making changes to the GML side.
 