
GameMaker Lag-Friendly Real-Time Networking Tips

Anixias

Member
I'm able to write a client-server system fairly easily, and all of my past attempts (over 40 different projects, I'm sure) used client-server networking topologies (is "topologies" the right word here?).

However, I almost always resort to weird workarounds instead of making a direct real-time multiplayer game, for reasons I know how to fix in theory but not in practice. Basically, I don't sync enemies/obstacles, and I don't make the server authoritative over client positions: each player just sees other players wherever those players say they are, with no fact-checking (this is the system I used in an obstacle-course multiplayer game I made).

In my game Sona's Tower, my most well-polished and content-filled game thus far (none of my games are online, so you won't find it), I used a system where enemies were client-side, and you could see other players but not interact with them in any way. The obstacle-course game worked the same way, except that if a player died there, other players could touch them on their screen to revive them within a certain time limit, before that player automatically used one of a limited number of self-revives.

So basically, my past games usually avoid having to directly sync stuff in real-time multiplayer because of these common issues I faced:
  • Players who weren't hosting would get jerked around on their screen at any latency, because when the server received player input, I didn't rewind the physics by the amount of latency, apply the input, and then re-simulate forward; I just tacked the input onto current positions, ignoring latency.
    Why I didn't fix this: I have no idea how to properly rewind "x frames" (where x is almost guaranteed to be a floating-point number), apply the input, and then re-simulate multiple frames' worth of input. Just... how would I even go about doing that?
  • Lag from syncing enemies' positions, targets, states, graphics, stats, etc.
    Why I didn't fix this: I don't know the best way to efficiently keep up to 50 objects, with their physics, AIs, and stats, all synced in a fast-paced real-time networking system.
  • Strange glitches and bugs usually crept in during playtesting, and I couldn't effectively track down the source, because *.exe GM games have unusual error messages that give no help at all.
    Why I didn't fix this: I actually did usually just make my friend host, then connect from the source-code project myself, so I could see errors directly along with the lines of code they occurred on, but it didn't always help.
So, what are some tips (with explanations or examples; I don't necessarily need code, as I can usually take what you say and turn it into GML myself) that you can give to me and others trying to make real-time multiplayer games in GMS2? Thank you so much.
 

MishMash

Guest
So generally, the networking architecture around your entire project can play a huge part in the practicality of systems like this. Sometimes the reason things become hard to implement is that everything is "too" locked down due to wobbly design. I can't go into great detail right now (it would take pages and pages to explain), but my project uses a few design patterns to abstract away the complexities of networking and make everything more manageable. In the future, when I'm less busy, I'm going to write a really thorough set of tutorials on this stuff, but for now, a general overview of what the system does:

- A special function replacing instance_create that creates automatically synced objects by allocating each one a unique ID, with a parent object used to coordinate this along with other properties. Basically: if the object exists on the server, it exists on the client too, under a unique network-safe identifier (see the sketch after this list).

- These objects have special user events for both writing their data to packets and reading it back, all abstracted away so that you only have to worry about the writing/reading itself. Updates are triggered by an update function the object calls on itself at some point.

- Optional flags to only send updates to certain players

- A special control-transference scheme. Clients can be given control of objects so that they are responsible for processing them. Functions exist to get the ID of the player responsible for an object, and that player's instance is the one that sends packet updates.
-- Note that there is also a function to check whether you are the server (server and client are the same app), so the server can still override things if it wants to.

- A special function used to perform interactions on networked instances. This lets a client send some data to the server and have that instance trigger an event there (much like a POST request in HTTP). The syntax makes it easy for the server to perform some action, and since objects can also run interactions on themselves (for consistency), many things programmed within the constraints of instance-to-instance interactions instantly work whether gameplay is local or networked, with no additional work (e.g. an inventory system where adding/removing items is implemented as a series of interaction events will immediately work with the server-side design).
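
To make that concrete, here is a minimal GML sketch of what such a layer might look like. This is not MishMash's actual code; every name here (net_instance_create, net_instance_update, net_interact, global.net_buffer) is hypothetical, and the packet sending itself is stubbed out. It assumes a parent object whose children implement User Event 0 (write state to a buffer) and User Event 1 (read state back).

    /// Hypothetical sketch of the architecture above (GMS 2.3+ syntax).
    global.net_next_id   = 0;               // server-side unique ID counter
    global.net_instances = ds_map_create(); // net_id -> instance

    // Replacement for instance_create_layer: allocates a network-safe ID.
    // If the object exists on the server, it will exist on the client
    // under the same ID (the "create" broadcast is stubbed out here).
    function net_instance_create(_x, _y, _layer, _obj) {
        var _inst = instance_create_layer(_x, _y, _layer, _obj);
        _inst.net_id = global.net_next_id;
        global.net_next_id += 1;
        global.net_instances[? _inst.net_id] = _inst;
        // ...broadcast (object index, net_id, x, y) to all clients here...
        return _inst;
    }

    // Called by an instance when it wants to push an update. User Event 0
    // is where the child object writes its own fields to the buffer.
    function net_instance_update(_inst, _buffer) {
        buffer_write(_buffer, buffer_u16, _inst.net_id);
        global.net_buffer = _buffer;  // so the user event can reach it
        with (_inst) event_user(0);
    }

    // Receiving side: route an incoming update to the right instance.
    // User Event 1 is where the child object reads its fields back.
    function net_instance_receive(_buffer) {
        var _id   = buffer_read(_buffer, buffer_u16);
        var _inst = global.net_instances[? _id];
        global.net_buffer = _buffer;
        if (!is_undefined(_inst)) with (_inst) event_user(1);
    }

    // Client -> server "interaction" (akin to an HTTP POST): ask the
    // server-side copy of an instance to run one of its handlers.
    function net_interact(_net_id, _interaction_id, _args) {
        // ...write (_net_id, _interaction_id, _args) into a packet and
        // send it; the server looks the instance up by _net_id and runs
        // the matching interaction handler...
    }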


So the reason I have outlined a rough architecture design is because I think it is important to have one of these in place in order to be able to efficiently and easily implement things that optimise the network experience. Using the architecture outlined above, these are some of the things I can do to solve the problems you present:

  • Player position/jerking sync consistency: We give the player "control" of their instance, so they are responsible for propagating updates. All of these updates still pass through the server, and the server still maintains the "true" version of the player instance; however, the server acts like a client plus a mediator. It takes the input, approximates the player's location, and accepts the position the client reports as fact. The server will, however, force-reset the position if it deems the motion to have been invalid (rubber-banding the player back to where they were). This isn't done against a series of previous frames; it just compares where the client was before the packet came in with where it claims to be now (sketched below).
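
    A sketch of that check, as it might run in the server's copy of the player object when a position packet arrives (max_speed, last_update_time, client_socket, and send_position_reset are assumptions, not MishMash's real variables):

        // Server side, on receiving a position update for this instance.
        var _new_x = buffer_read(_buffer, buffer_s16);
        var _new_y = buffer_read(_buffer, buffer_s16);

        // How far could the player legally have moved since the last
        // accepted packet? (assuming 60fps steps; 25% slack for jitter)
        var _frames   = (current_time - last_update_time) / (1000 / 60);
        var _max_dist = max_speed * max(_frames, 1) * 1.25;

        if (point_distance(x, y, _new_x, _new_y) <= _max_dist) {
            x = _new_x;             // accept the reported position as fact
            y = _new_y;
        } else {
            // invalid motion: rubber-band the client back to the server's
            // last known-good position (hypothetical helper)
            send_position_reset(client_socket, net_id, x, y);
        }
        last_update_time = current_time;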

    With regard to the network design stated above: in my game, we do give clients control over their own inventories so that responses are instant; however, if you wanted to be extra safe about it, components of the player could be server-sided, and the player would simply run interaction events on them to perform an action. Interestingly, given how interactions work in our project, a client's inventory can be either server-controlled or player-controlled; functionally it's the same and network-consistent. When it is server-controlled, though, it incurs a slight delay. That delay does become noticeable when interacting with other inventories such as chests, but at 200ms it's mostly negligible.

    There is also client-side approximation for inventories. If the client performs an action such as changing an item in the inventory, the item changes visually right away, but the server maintains the true state regardless, and upon successful completion of an item addition/move/removal it propagates an update for that inventory. It doesn't matter if the client's visual representation was briefly wrong; in most cases the update goes through just fine, and only occasionally, when two players race each other, does it get corrected. The important thing is that, because everything is backed by the server, items never get lost.

  • General lag with enemies/state: The question here is whether this is true lag from too many packets being sent, or just the product of latency and delays between packets. The main trick I use is to implement the enemies so that their behaviour is deterministic given their current position/state/motion; they will generally behave the same way on every machine and won't diverge much. This means sending slightly more information when you update state (any properties the enemy needs: targets, animation, motion) so the client can run an accurate client-side approximation of the enemy. The only things that need to be consistently synced are health bars and the like. Generally, you should have an idea of what needs frequent updates and what doesn't.

    Similarly, you don't need to send updates for enemies that are off-screen. The server should be able to deduce what each player can see (plus a little padding) and thus know which enemies are near that player (a culling sketch follows below). You also don't need to sync everything all the time. Generally, state changes and animation changes are the most visually important things; the underlying physics state and AI variables that the client doesn't actually use are less important. Most of the variables used by the server, such as timers for state changes, AI decisions, and tracking, don't matter to the client, which only needs to see the end results and final decisions.
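
    As a sketch of that culling, assuming the server tracks each player's view size and keeps players in a global.players array (both assumptions), and reusing the hypothetical net_instance_update from the earlier sketch:

        // Is this enemy within (or near) what the player can see?
        function enemy_visible_to(_enemy, _player) {
            var _pad = 128; // padding so enemies don't pop in at the edge
            return point_in_rectangle(_enemy.x, _enemy.y,
                _player.x - _player.view_w * 0.5 - _pad,
                _player.y - _player.view_h * 0.5 - _pad,
                _player.x + _player.view_w * 0.5 + _pad,
                _player.y + _player.view_h * 0.5 + _pad);
        }

        // When an enemy changes state, send the snapshot the client needs
        // for its approximation (state/target/motion/animation), and only
        // to players who can actually see the enemy.
        function enemy_broadcast_state(_enemy) {
            for (var _i = 0; _i < array_length(global.players); _i++) {
                var _p = global.players[_i];
                if (enemy_visible_to(_enemy, _p)) {
                    net_instance_update(_enemy, _p.out_buffer);
                }
            }
        }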

  • Understand your data budget: You would be surprised how much you can actually get away with sending. That's not to say optimisation is unimportant, but modern network bandwidth supports a lot of traffic. Just opening Overwatch now and playing, it receives 22kb/s and sends 4kb/s. To put that in practical terms, 22kb/s is 5500 four-byte integers per second. That gives you quite a lot of wiggle room, especially given that most values don't require full integers. You can also optimise values that are too big for a short but don't need a full integer by doing a little extra CPU work: send a factor in a byte and use a short as an offset, packing the value into 3 bytes instead of 4 (sketched below). It is worth looking at how much data your game actually uses (via resource monitor). A server sending out ~20kb/s per player is workable; note that games like Minecraft can be as high as 500kb/s per player.
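
    The 3-byte trick might look like this (a sketch; buffer_write_u24 and buffer_read_u24 are not built-in GML functions, and this version handles unsigned values up to 16,777,215):

        // Split a value that's too big for a u16 but doesn't need a full
        // u32 into a one-byte "factor" plus a two-byte offset.
        function buffer_write_u24(_buf, _value) {
            buffer_write(_buf, buffer_u8,  _value div 65536); // factor
            buffer_write(_buf, buffer_u16, _value mod 65536); // offset
        }

        function buffer_read_u24(_buf) {
            var _factor = buffer_read(_buf, buffer_u8);
            var _offset = buffer_read(_buf, buffer_u16);
            return _factor * 65536 + _offset;
        }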

    The way you send data often matters more than the data itself. While the underlying OS will do some work to batch packets together, you can make its job easier by managing your own packet batching and sending out updates at less frequent intervals, for example limiting sends to once every 5 frames (each player can have its own send-out timer). This doesn't reduce the amount of data at all, but it reduces the overhead of packet headers (especially GM's own packet header, if you aren't using the RAW setting). It also avoids clogging up the game with lots of small packet events firing off; sometimes the lag isn't caused by too much data but by too high a volume of packets.

    If you want to get smart, you can make your batching even smarter with a form of packet consolidation: if you head each entity update with the entity ID, and you have a series of updates for the same entity chained together, you can potentially write the entity ID once and follow it with the whole series of updates, rather than giving each update its own entity ID header. (This is something you can set up automatically if you create a good architecture as outlined above; see the sketch below.)
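
    A sketch of batching and consolidation together, assuming a hypothetical global.update_queue (a ds_map of net_id -> ds_list of pending updates) and a write_update helper; the flush runs in a controller object's Step event:

        send_timer += 1;
        if (send_timer >= 5) {              // flush once every 5 frames
            send_timer = 0;
            var _buf = buffer_create(1024, buffer_grow, 1);
            var _id  = ds_map_find_first(global.update_queue);
            while (!is_undefined(_id)) {
                var _updates = global.update_queue[? _id];
                var _count   = ds_list_size(_updates);
                if (_count > 0) {
                    buffer_write(_buf, buffer_u16, _id);   // entity ID once
                    buffer_write(_buf, buffer_u8, _count); // ...then a count
                    for (var _i = 0; _i < _count; _i++) {
                        write_update(_buf, _updates[| _i]); // hypothetical
                    }
                    ds_list_clear(_updates);
                }
                _id = ds_map_find_next(global.update_queue, _id);
            }
            network_send_packet(client_socket, _buf, buffer_tell(_buf));
            buffer_destroy(_buf);
        }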

  • Finally, the topic of strange bugs: Networking is something you can easily get lost in. There are packet-size traps, where you send more data than is read and the next packet handler reads incorrectly; or everything is just slapped together without coherence, with lots of repeated code and inconsistency. Having a good architecture and network design pattern in place alleviates all of these things. If you ever feel like you are hacking code, or cheating your own systems, then the back-end system you have designed needs to change. Ideally, everything you need to achieve with networking should be provided to your game by the networking systems you have created, whether that's syncing instances, connecting/disconnecting players, managing bandwidth, or managing latency.

    For me, the main thing I needed was an architecture that localised the networking code for instances inside the instances themselves, plus the ability for instances to perform interactions on each other. Once I had that, every system could be programmed as a series of interactions, which meant I KNEW the code was guaranteed to be server-sided, safe, and consistent, with no chance of lost data or unexpected behaviour.

    A really good example of this is the inventories. All inventories only move items between each other, and an item can never exist outside of some form of inventory. A transfer from inventory A to inventory B runs as a series of interactions:
    1. Inventory A instigates a "transfer request" with a number of properties (such as the slot it wants to send to and information about the item being transferred), and locks its own slot so no further action can be taken on it.
    2. Inventory B receives this request as an interaction and evaluates whether it can validly receive the item. If it can, it places the item in its own inventory, locks that cell, and sends a success response back to inventory A; if there was no space, it sends a failure response instead.
    3. Upon receiving the response (also as an interaction): on failure, inventory A simply unlocks its cell, keeping the item. On success, inventory A attempts to remove the item from its own inventory.
    4. If for whatever reason the item cannot be removed (or something has changed since the initial request), A sends a failure response back to inventory B, which removes the locked item it had added after the initial request. If removal succeeds, A sends a "transfer complete" to inventory B, which finally unlocks the cell holding the new item.

    ^ Note this exchange works for items; naturally, if you were dealing with something as sensitive as real-world money, the process would be more secure, but it follows a similar series of exchanges and responses to those in e-commerce transactions.
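
    As a sketch, that handshake boils down to five interaction message types and a small handler on each side (all names hypothetical; can_receive, place_and_lock, unlock_slot, and remove_item stand in for real inventory code, and net_interact is the hypothetical function from the earlier sketch):

        enum TRANSFER {
            request,   // A -> B: "can you take this item in this slot?"
            accept,    // B -> A: item placed and locked on B's side
            reject,    // B -> A: no room; A unlocks its slot, item stays
            complete,  // A -> B: item removed from A; B unlocks the new item
            abort      // A -> B: removal failed; B removes what it added
        }

        // Inventory B, handling an incoming interaction:
        switch (_msg_type) {
            case TRANSFER.request:
                if (can_receive(_slot, _item)) {
                    place_and_lock(_slot, _item);
                    net_interact(_from_id, TRANSFER.accept, _slot);
                } else {
                    net_interact(_from_id, TRANSFER.reject, _slot);
                }
                break;
            case TRANSFER.complete: unlock_slot(_slot);  break;
            case TRANSFER.abort:    remove_item(_slot);  break;
        }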

    Another place where this design shows up is in interfaces. For things like NPC chat, all of the content is generated server-side and sent to the player; the player does no local processing other than sending interactions to the server. When they press a button, such as clicking a chat response or buying an item from an NPC shop, they send an interaction event to the server, and the server handles the request and generates a new "page" to send back to the client (very similar to how webpages work). In the case of shops, the whole transaction, such as deducting money from the player's money value, is evaluated on the server side; the item also gets added to the player's inventory on the server side, and the update gets propagated back to the player.


    The player is never given control to do anything other than that, but in terms of implementation it is all rather straightforward: there is nothing crazy complicated going on, it is just making sure that code runs in the right place and things happen in the right order.

Like any design pattern associated with model-view-controller, you can treat the client as a "view" into the world of the server, where the server does all of the heavy lifting. The client just sees the results and never "guesses" at anything or does its own processing. The client shouldn't really be telling the server what happened and having the server state react, aside from a very limited set of scenarios such as player movement; and because that is a controlled case, the server can perform secondary verification. In much the same way, in an FPS like Counterstrike, at some point the player has to tell the server what direction they are facing, because the server could never perfectly reconstruct that from mouse movements alone; but for anti-cheat purposes, the server can evaluate changes in aim to verify whether a change was "reasonable". (These pieces of data are also often sent at a much higher resolution.)

Also note that I'm not making assumptions about your ability as such; you probably understand all of this well. I'm just trying to be consistent and paint the way I approach networking. Once you get into good habits with networking, everything just starts flowing nicely. I really love the feeling, now, of adding a new feature and just knowing it will instantly work in multiplayer, because I programmed it while adhering to good design practices, modelling the solution as a series of interactions between objects rather than having code immediately do something then and there.
 

Anixias

Member
What a great response! Thank you. I will attempt to make an abstracted networking architecture that supports adding new features without much hassle. If I have trouble, I’ll ask about more specific stuff! Is it cool if I PM you my questions (if/when I have them)?
 

MishMash

Guest
Yeah sure, though as Nocturne would say, if it's possible to ask here, then it may also serve to help other people. (That's partly why I try to make my posts as detailed (#waffly) as possible, so that they cover the general case and might be a primer for someone else.) :)
 

meseta

Member
Hi Anixias, there are really two schools of thought when it comes to dealing with lag/syncing in netcode:
1. Make everyone lag equally (broadly speaking this is what input delay is in Lockstep)
2. Allow lag, but compensate for its effects in terms of hit checking (broadly speaking, this describes lag compensation in async server-client topologies)

Which one you pick depends a lot on how your game works. I'll try to describe both in brief.

Lockstep
In lockstep (which is generally used in P2P topology without an authoritative host), the netcode ensures that each client is processing the same game step at the same time, so that no client is seeing a different version of the game state from each other. This is pretty much essential for certain games, like fighting games, in which you don't want lag to cause one of the players to become an easy target for the other player.

The way lockstep deals with lag is primarily to bake that network latency in, in the form of input delay. Players' inputs are deliberately delayed so that they are interpreted in a later netcode frame. This gives the data packet enough time to reach its destination before it's needed; the netcode doesn't have to wait for transmission time, and can process previously received input frames. The controls end up feeling slightly laggy, but at least players are no longer teleporting around the screen as much as they were.
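
A minimal GML sketch of input delay (hedged: INPUT_DELAY, send_input_packet, all_inputs_received, and simulate_step are all hypothetical names, and inputs are packed into a bitmask so they're cheap to send and compare):

    #macro INPUT_DELAY 3   // frames of baked-in latency (tune to your ping)

    // Each netcode frame, on every client: sample input, but schedule it
    // for a frame INPUT_DELAY steps in the future.
    var _input = keyboard_check(vk_left)
               | (keyboard_check(vk_right) << 1)
               | (keyboard_check(vk_space) << 2);
    inputs[local_player][frame + INPUT_DELAY] = _input;
    send_input_packet(frame + INPUT_DELAY, _input); // hypothetical

    // The simulation only advances once every player's input for `frame`
    // has arrived; thanks to the delay, it normally already has.
    if (all_inputs_received(frame)) {   // hypothetical
        simulate_step(frame);           // advance the deterministic sim
        frame += 1;
    }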

However, on the off chance that the network experiences a lag spike, the game has to freeze to wait for the data to catch up. To deal with this, lockstep games can implement a prediction step: if input frames haven't been received, the game predicts what the inputs might be. But this creates a desync risk, where something happens with the predicted state on one player's screen that differs from another player's screen, and since there usually isn't an authoritative host in lockstep, that difference can never be reconciled. To avoid this, when the real inputs eventually arrive, the game rolls back the game state, re-simulates forward using the real inputs instead of the predicted ones, and updates the game state with the difference. This is the basis of lockstep-rollback netcode, and it's what the popular GGPO uses.
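
The rollback itself, sketched in GML (save_state, load_state, and simulate_step are hypothetical, and they are the hard part: they require the whole simulation to be deterministic and snapshottable):

    // When the real input for an old frame finally arrives:
    function on_real_input(_frame, _input) {
        if (_input != inputs[remote_player][_frame]) { // prediction wrong
            load_state(saved_states[_frame]);          // rewind to that frame
            inputs[remote_player][_frame] = _input;
            for (var _f = _frame; _f < frame; _f++) {  // re-simulate forward
                simulate_step(_f);
                saved_states[_f + 1] = save_state();
            }
        }
    }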

I have a demo implementation of deterministic lockstep-rollback on my itch page that you can poke around with: https://meseta.itch.io/lockstep. It implements input delay as well as the rollback mechanism, and decouples GM's step event from netcode frames, allowing the netcode to run at a separate FPS from the main game/animations. I also explain these concepts in more detail in Part 3 of my blog series on netcode concepts: https://medium.com/@meseta/netcode-concepts-part-3-lockstep-and-rollback-f70e9297271. Parts 1 and 2 are worth a read as well.

Async server-client
In async server-client (which I think describes your scenario, though yours would count as a client-host setup), lag compensation is usually achieved by having the server maintain a history of player locations. When a player performs a hit check, the server looks back in its history of player locations by an amount equal to the hit-checking player's network latency, to estimate what that client was seeing when they did the hit check, and determines from those rewound positions whether the hit was valid.
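
A GML sketch of that history lookup on the server (hedged: hit_radius, target, the ring-buffer layout, and the 60 ticks/sec rate are all assumptions):

    #macro HISTORY_LEN 30   // at 60 ticks/sec, half a second of history

    // Each server tick, every player records where it was:
    history[tick mod HISTORY_LEN] = { hx : x, hy : y };

    // When a shooter reports a hit, rewind the target by the shooter's
    // latency and test against where the target *was* back then:
    var _ticks_back = clamp(round(_shooter_latency_ms / (1000 / 60)),
                            0, HISTORY_LEN - 1);
    var _idx = (tick - _ticks_back + HISTORY_LEN) mod HISTORY_LEN;
    var _old = target.history[_idx];
    if (point_distance(_shot_x, _shot_y, _old.hx, _old.hy) < hit_radius) {
        // the hit counts: this is roughly what the shooter saw
    }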

I explain these concepts in Part 4 of my blog series, but that hasn't been released yet. I'm happy to share the draft page for it, though: https://medium.com/@meseta/netcode-...s-server-client-lag-compensation-2e5a98e4105f

I hope this is some useful background reading for your work on netcode. My aim here is to demystify netcode for Gamemaker devs, and hopefully encourage more people to get into it. The blog series is on hiatus at the moment while I work on some extensions and contracts, but I'll get back into it soon.
 