UDP peer-to-peer networking issue

Fanatrick

Member
Hey guys,

For the last few weeks I've been trying to set up everything needed to get online multiplayer going for my project. So far I've been able to set up a relay server that takes care of UDP hole-punching for the clients. It properly manages a connection between 2 peers, but anything more fails to work.

Here's what happens:
  • John, Tony and Lucy want to play a deathmatch game online
  • John creates a lobby and connects with Master
  • Master logs his IP-PORT(J)
  • Tony tries joining John's lobby, sends his IP-PORT(T) to Master
  • Master sends John IP-PORT(T) and Tony IP-PORT(J)
  • John and Tony send each other packets on their respective endpoints and the connection is established
  • NOW, Lucy wants to join John's lobby, connects to Master, and Master sends John and Lucy each other's IP-PORT (J and L)
  • She connects with John, then John sends her Tony's IP-PORT(T) and sends Tony her IP-PORT(L)
  • Lucy and Tony send each other packets on IP-PORT(T) and IP-PORT(L) respectively, but ultimately can't establish a connection
I always thought just a single port needs to be punched and then that specific player can receive packets from every other player on that single port. Is this not how OSI handles UDP?

In my experience from playing online p2p games, I recall that even though I couldn't connect to friend_A, we could still establish the connection by just joining another friend_B's lobby and still play the game together peer-to-peer.

In case I'm wrong, what if 2 players in the lobby can't hole-punch a mutual connection between them but can do so with everyone else? This never happened to us while playing games on Steam (Rocket League, No Heroes Here, Worms WMD), as we could always find at least one player that could host us all properly in a peer-to-peer game.

I'm almost certain this is not the case, but do battle-royale kind of games hole-punch each and every player? I was under the impression that each peer needs just a single global port to receive data through from any player, but so far my implementation hints that I've got something wrong.

This is what happens summed up:
(1.1.1.1:5000) A: Hosting at 1.1.1.1:5000
(2.1.1.1:6000) B: Joins by sending to (1.1.1.1:5000) A
(1.1.1.1:5000) A: Accepts by sending to (2.1.1.1:6000) B
(2.1.1.1:6000) B: Handshake with A succeeded (10/10), peer added
(Lobby players: A, B)
(3.1.1.1:7000) C: Joins by sending to (1.1.1.1:5000) A
(1.1.1.1:5000) A: Accepts by sending to (3.1.1.1:7000) C
(3.1.1.1:7000) C: Handshake with A succeeded (10/10), peer added
(Lobby players: A, B, C)
(1.1.1.1:5000) A: Broadcasts a list of players to everyone connected to it
Sends (2.1.1.1:6000) B to (3.1.1.1:7000) C
Sends (3.1.1.1:7000) C to (2.1.1.1:6000) B
(3.1.1.1:7000) C: Handshake with B failed (0/10), peer removed
(2.1.1.1:6000) B: Handshake with C failed (0/10), peer removed
Final state of the lobby:
(Lobby players A, B, C)
A can communicate with B and C on their ports, but B and C can only communicate with A
The lobby is sad.

I've been fiddling with this for days to no avail. The hosting peer basically needs to send every other peer the IP and port of any new player that joins the lobby, but apparently the peers can't then communicate with each other through those same ports. Since my approach seems to be flawed, any suggestions as to what needs to change to get this to work?

Thanks in advance.
 

Tsa05

Member
Does your handshake involve setting up a socket?
Wondering because if so, you'd be able to connect A-B and A-C, but unable to connect B-C or C-B because a socket already exists on that port (for connecting to A). In theory, you'd just broadcast over the existing socket (C has a socket created for connecting to A, B broadcasts to the IP/port of C that it receives from A, C picks up the broadcast on the socket it created for A).
 

Fanatrick

Member
I'm using the very same socket for each peer object. The handshake is a simple exchange of packets to sync the connection, check whether it's stable, calculate ping, etc. Each peer is an object that handles only the data going to and from one specific player; a server or client object (depending on whether we're hosting or joining) passes its punched socket to the peer object when creating it. This works for the player hosting the game, but clients can't communicate with each other even though I'm positive they're sending to the very same IP-PORT combo as the host. I logged everything I could, a couple of my friends even streamed so I could double-check, and we confirmed through Wireshark that data is being sent but not received.

If you can confirm that I'm not doing anything wrong in theory, I'll return to debugging.
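
For readers following along, here is a minimal sketch of the "one UDP socket, many peer objects" idea discussed above. It is written in Python rather than GML so it can be run on its own, and `Peer` and `PeerTable` are hypothetical names, not anything from the poster's project. Incoming datagrams are routed to a peer object purely by their source (IP, port) pair, which is what `socket.recvfrom` would report in a real program.

```python
class Peer:
    """Holds the state for one remote player."""

    def __init__(self, name):
        self.name = name
        self.inbox = []


class PeerTable:
    """Routes datagrams from one shared socket to per-player objects."""

    def __init__(self):
        self.by_addr = {}

    def add(self, addr, name):
        peer = Peer(name)
        self.by_addr[addr] = peer
        return peer

    def dispatch(self, addr, data):
        """Deliver data to the peer registered for addr; False if unknown."""
        peer = self.by_addr.get(addr)
        if peer is None:
            return False
        peer.inbox.append(data)
        return True


table = PeerTable()
host = table.add(("1.1.1.1", 5000), "A")
other = table.add(("3.1.1.1", 7000), "C")

print(table.dispatch(("1.1.1.1", 5000), b"state"))   # known sender
print(table.dispatch(("3.1.1.1", 7001), b"hello"))   # same IP, unexpected port
```

Note that the second datagram is rejected only because its source port differs from the registered one, so a table like this silently hides packets that arrive from an unexpected port.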
 

Fanatrick

Member
Follow up to the previous post:

[Attached screen captures not preserved]

It must be something wrong with my overall implementation. At this point I'm just looking for confirmation D:

(All IPs are from an older test and are no longer valid.)
 

Simon Gust

Member
Not exactly sure I follow your description.
But here's how I have it, and the last time I tested it, it worked well with 3 or more clients. I only tested it with all clients on the same IP because I had no friends to test with.
For this I just opened additional IDEs that compile the same game.

In my networking event I have 2 major code blocks. One for the host and one for the client.

When you are the host and someone types in the host's IP to join:
Code:
case netplay.request: // someone joined your ip 
{
    show_debug_message("net: request");
   
    /*
    client requested invitation to this ip, they want to join. 
    This code fires...
    */
   
    // return the message, and recommend all already existing clients in the message
    var buffer = buffer_create(1024, buffer_fixed, 1);
    buffer_write(buffer, buffer_u8, netplay.answer);
    buffer_write(buffer, buffer_u8, my_id); // for testing purposes (because same ip)
   
    // write the host's data to the packet
    buffer_write(buffer, buffer_string, my_ip);
    buffer_write(buffer, buffer_u16, my_port);
    buffer_write(buffer, buffer_u8, my_id); // for testing purposes (because same ip)
   
    // fill all already existing client's data to the packet
    with (obj_client)
    {
        buffer_write(buffer, buffer_string, headless_ip);
        buffer_write(buffer, buffer_u16, headless_port);
        buffer_write(buffer, buffer_u8, headless_id); // for testing purposes (because same ip)
    }
    buffer_write(buffer, buffer_string, "null"); // terminator
   
    // send packet to incoming client
    var packet = network_send_udp(my_socket, net_ip, net_port, buffer, buffer_tell(buffer));
    if (packet == -1) show_debug_message("packet send failed, check server port and ip");
   
    // cleanup
    buffer_delete(buffer);
   
    ///////////////////////////////////////////////////////////////////////////////////////////////////
    /*
    now that the new client is informed about the host and every already existing client,
    all the existing clients need to be informed about the newly joined client
    */
   
   
    // tell every client to spawn the new client instance
    var buffer = buffer_create(1024, buffer_fixed, 1);
    buffer_write(buffer, buffer_u8, netplay.answer);
    buffer_write(buffer, buffer_u8, my_id); // for testing purposes (because same ip)
   
    // fill data of the new client
    buffer_write(buffer, buffer_string, net_ip);
    buffer_write(buffer, buffer_u16, net_port);
    buffer_write(buffer, buffer_u8, net_id); // for testing purposes (because same ip)
    buffer_write(buffer, buffer_string, "null"); // terminator
   
    // send buffer to all already existing clients
    var my_sock = my_socket;
    with (obj_client) // loop through every client instance
    {
        var packet = network_send_udp(my_sock, headless_ip, headless_port, buffer, buffer_tell(buffer));
        if (packet == -1) show_debug_message("packet send failed, check server port and ip");
    }
   
    // cleanup
    buffer_delete(buffer);
   
    ///////////////////////////////////////////////////////////////////////////////////////////////////
   
    /*
    now, create the new client instance to store ip and port and stuff in them,
    with that, everyone can communicate with each other very easily
    */
   
    // finally, create a headless player of your own
    var inst = instance_create(0, 0, obj_client);
        inst.headless_ip = net_ip;
        inst.headless_id = net_id; // for testing purposes (because same ip)
        inst.headless_port = net_port;
}
break;
When you are one of the clients already in the lobby, or the new client:
Code:
case netplay.answer: // return connect
{
    /*
    either you're the new guy and are informed about what clients already exist in the lobby
    to create them and store their info
   
    or...
   
    you are already in the lobby and get the data of the new client that just joined.
   
    in both cases, client instances are created with ips and ports and stuff
    */
   
    show_debug_message("net: answer");
    while (true) // packet size is unclear -> infinite loop through buffer until "null" string is reached
    {
        var ip = buffer_read(net_buffer, buffer_string);
        if (ip == "null") break;
        else
        {
            show_debug_message("net answer: client object created");
            var port = buffer_read(net_buffer, buffer_u16);
            var nid = buffer_read(net_buffer, buffer_u8);
            var inst = instance_create(0, 0, obj_client);
                inst.headless_ip = ip;
                inst.headless_port = port; 
                inst.headless_id = nid; // for testing purposes (because same ip) 
        }
    } 
}
break;
When someone joins, their data is stored in fake player objects that act a bit like sockets.
Each one stores the IP and port and reacts to updates from the real client.

This way I don't need any sort of data structure, while still being able to loop through all clients easily.
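
As an illustration of the packet layout used in the code above, here is a rough Python sketch of a "null"-terminated peer list. It assumes null-terminated ASCII strings, a little-endian 16-bit port and an 8-bit id per entry; the function names are made up for this example, and the leading message-type and sender-id bytes are left out.

```python
import struct

TERMINATOR = b"null\x00"  # sentinel string that ends the list


def encode_peers(peers):
    """Pack (ip, port, peer_id) tuples, then append the terminator."""
    out = bytearray()
    for ip, port, peer_id in peers:
        out += ip.encode("ascii") + b"\x00"       # null-terminated string
        out += struct.pack("<HB", port, peer_id)  # u16 port, u8 id
    out += TERMINATOR
    return bytes(out)


def decode_peers(payload):
    """Read entries until the "null" sentinel string is found."""
    peers, pos = [], 0
    while True:
        end = payload.index(b"\x00", pos)  # raises ValueError if truncated
        ip = payload[pos:end].decode("ascii")
        pos = end + 1
        if ip == "null":
            return peers
        port, peer_id = struct.unpack_from("<HB", payload, pos)
        pos += 3
        peers.append((ip, port, peer_id))


lobby = [("1.1.1.1", 5000, 0), ("2.1.1.1", 6000, 1)]
print(decode_peers(encode_peers(lobby)))
```

One thing the sketch makes visible: a sentinel-terminated loop has to fail cleanly on a truncated packet. Here `payload.index` raises instead of looping forever; a count prefix at the start of the list would be an alternative design.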
 

PNelly

Member
Two thoughts to start with.

First, it's not clear to me why you desire to establish peer to peer connections between the game session clients. It runs counter to the way that multiplayer games are typically designed - to minimize the complexity of maintaining the game state over the network, the game session host (John in the case of your example) should be the only machine that will be connected to all session members. All other session members will only have connections with John. The session members will send their updates (or "requested" updates) of their actions to John's machine. John's machine will then compute the new game state, and share that new game state with all of the game session clients, who accept it without question.

If all of the session clients are connected to each other and sharing information about the game state, it generates enormous confusion about what's really going on in the game. Each P2P connection will experience differing amounts of drift because their latencies and bandwidth will differ, not to mention hardware level differences like the same floating point calculations producing different results on different clients. Without putting one machine "in charge" of deciding what reality is, it becomes next to impossible to present a coherent picture of what's going on to all the players at once.

That said, there are legitimate reasons to establish peer to peer connections between the UDP clients. I have my own implementation where I've done this to facilitate a session host fail over, so that in case John's machine poops out or he decides to rage quit, the game can continue uninterrupted with a new session host (but without John of course). The clients don't share any game state information, they just initiate a connection and keep it alive in case it's needed later for a new host.

Second, and more towards the nuts and bolts of your problem, I do have a suspicion but I could be wrong. If you really are sharing the correct port and IP information of Tony and Lucy, then they should connect to one another without issue. This is pretty sticky because the correct port and IP info is actually contextual to the network topology between Tony and Lucy. If Tony and Lucy are separated over the public internet and are behind separate NATs (routers), then the exchange of public port and IP tuples from the master server ought to work. However, if Tony and Lucy reside behind the same NAT (are co-located in the same private network), then the public port/IP information will only work for Tony and Lucy if that network's NAT/router supports a feature called UDP hairpinning, which allows the router to recognize the public IP as its own IP and route the packets to the correct internal destination. Unfortunately you can only count on this feature being present in newer routers, and many people are running older stuff. So, if Tony and Lucy reside on the same private network, then it's best to use their local IPs and ports to connect to one another (this will work for older and newer routers).
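
A small sketch of that fallback, in Python for the sake of a runnable example. It assumes each peer reports its local endpoint alongside the public one the master observed; `Endpoint`, `PeerInfo` and `pick_endpoint` are hypothetical names and the addresses are documentation-range placeholders.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int


class PeerInfo(NamedTuple):
    public: Endpoint  # as observed by the master server
    local: Endpoint   # as reported by the peer for its own LAN interface


def pick_endpoint(my_public_ip: str, peer: PeerInfo) -> Endpoint:
    """Prefer the LAN endpoint when the peer shares our public IP."""
    if peer.public.ip == my_public_ip:
        return peer.local
    return peer.public


tony_public_ip = "203.0.113.7"
lucy = PeerInfo(Endpoint("203.0.113.7", 52274), Endpoint("192.168.0.12", 7000))
john = PeerInfo(Endpoint("198.51.100.4", 6510), Endpoint("192.168.1.5", 5000))

print(pick_endpoint(tony_public_ip, lucy))  # same NAT: use the local endpoint
print(pick_endpoint(tony_public_ip, john))  # different NAT: use the public one
```

A shared public IP is only a hint (two peers behind a carrier-grade NAT can share one without sharing a LAN), so a sturdier variant would probe both endpoints and keep whichever answers.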

If I were in your position I would run a sanity test by spinning up the master, John, Tony, and Lucy all on the same computer. If you can establish all the desired connections then it indicates that there's some networking nuance you haven't addressed correctly (like the one I described above, or maybe something else). If you cannot establish all the desired connections when all are on the same machine it would indicate a different problem with your implementation.

Best of luck, you're in the jungle now.

- Pat
 

Fanatrick

Member
Thanks for the insight fellows!

On the first point: peer-to-peer connections between clients are nothing new or unusual in shipped games. There are multiple reasons to off-load at least some data directly to a "client" peer, the most obvious being that it removes a whole trip's worth of latency between clients. A less obvious one is that the authoritative approach of establishing consensus in real time introduces input lag and/or rubber banding, which lets the host wipe the floor with the opposition in most cases and breaks the experience overall in a fast-paced action game like the one I'm developing. The most notorious recent examples of this were the Super Smash Bros. games and For Honor. I'll be focusing on mitigating these imperfections with interpolation and prediction methods, to the point where the differences in sync are close to negligible, rather than creating a flawed experience. Under good conditions, input lag is much more noticeable than front-end interpolation/prediction errors.

This isn't to say this approach isn't flawed in other areas, just like you mentioned. This is just what I believe is the best of both worlds. There's no such thing as a magic bullet when it comes to these things and I am well aware of that.

On the second point: thanks for confirming this. Our testing machines are all behind their own NATs. The system in question has a private-network fallback in place, which has worked without issues so far. I've done so much sanity testing that testing my own sanity would be more viable, which is why I created this topic in the first place.
Still unsure what might be causing this. Appreciate the effort though!
 

PNelly

Member
I gotta get to bed so I'll be a little less long-winded this time.

I just re-read your original post a little more carefully:

"I always thought just a single port needs to be punched and then that specific player can receive packets from every other player on that single port. Is this not how OSI handles UDP?"
If I'm understanding what you're saying here, then no, that isn't the case. Each peer-to-peer connection needs to be hole-punched individually. This is because the router makes note of the external destination of outbound packets passing through it, so it knows to allow inbound packets coming from that same place. If you set up a hole punch from Tony to John, Tony is not set up to communicate with Lucy, because Lucy's endpoint appears different to Tony's NAT, and Tony's router will drop any packets that are inbound from Lucy.

Fortunately, if that is in fact the problem it shouldn't be very hard to solve. It sounds like you've already shared around everyone's public facing port and ip info, so it should just be a matter of timing simultaneous hole-punch messaging between Tony and Lucy.
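
To make the per-pair behaviour concrete, here is a toy Python model of it. It assumes a NAT that filters inbound traffic by remote address and port and keeps each client's public endpoint stable; not every NAT behaves this way (a NAT with no inbound filtering would accept anyone once a mapping exists). The class and function names are invented for the example, and the endpoints are the placeholder ones from the log earlier in the thread.

```python
class PortRestrictedNat:
    """Toy NAT: inbound packets pass only from endpoints we already sent to."""

    def __init__(self):
        self.allowed = set()

    def note_outbound(self, remote):
        self.allowed.add(remote)

    def accepts(self, remote):
        return remote in self.allowed


def deliver(src, src_nat, dst, dst_nat):
    """Send one packet src -> dst; return True if dst's NAT lets it in."""
    src_nat.note_outbound(dst)
    return dst_nat.accepts(src)


JOHN, TONY, LUCY = ("1.1.1.1", 5000), ("2.1.1.1", 6000), ("3.1.1.1", 7000)
tony_nat, lucy_nat = PortRestrictedNat(), PortRestrictedNat()

tony_nat.note_outbound(JOHN)                    # Tony already punched to John
print(tony_nat.accepts(LUCY))                   # False: no hole for Lucy yet
print(deliver(TONY, tony_nat, LUCY, lucy_nat))  # False: Lucy's NAT drops it
print(deliver(LUCY, lucy_nat, TONY, tony_nat))  # True: Tony has sent to Lucy
print(deliver(TONY, tony_nat, LUCY, lucy_nat))  # True: both holes are open
```

The first Tony-to-Lucy packet is lost even though it opens Tony's side, which is why both peers need to keep sending for a while rather than giving up after one attempt.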

Let me know whether I'm interpreting correctly.

Also, if you intend to go with the peer to peer game state management, more power to you if you can successfully implement it. I'm still not sure where I'd start trying to resolve disagreements about state if there isn't going to be an authoritative arbiter. If you know of any reading material on that I'd like to see it.
 

Fanatrick

Member
Everyone's public ip-port combo is sent successfully to all peers, those peers then send packets to each other's destinations in hopes of establishing a handshake, but it fails.

Here's how it's set up:

- Alice creates a lobby
- Instantiates objServer. This object creates TCP and UDP sockets
Code:
socket_tcp = network_create_socket(network_socket_tcp);
socket_udp = network_create_socket(network_socket_udp);
- TCP socket is used to communicate with a public master server that keeps all the lobby data and serves as a relay for UDP punch-through
- UDP socket sends packets to master so we can identify the port. Master saves this port and sends it to incoming clients
- The lobby is successfully created
- Bob wants to join Alice, instantiates objClient. This object creates TCP and UDP sockets the same way as server (Alice)
- Bob finds Alice in lobby list supplied by master and sends UDP packet to master so we can identify the port
- Bob also sends a request via TCP for master to join him with Alice
- Master sends necessary ports and IPs and the two start sending each other arbitrary UDP packets in hopes of punching
- The punch is successful and connection is established. Both parties create objNetPlayer objects to which they pass all the necessary connection data (UDP socket id, IP, Port, username, etc). This now functions as a peer object meant only for that respective player
- Carol wants to join the lobby to play with Alice and Bob
- Instantiates objClient, does everything the same as Bob and the connection between Alice and Carol is established
- Alice received a new player in the lobby (Carol), so she needs to notify Bob
- Since Bob is already connected, Alice also needs to notify Carol
- Alice sends Carol's IP/Port to Bob and Bob's IP/Port to Carol
- They receive each other's IP/Port data, create each other's objNetPlayer peer objects and immediately start sending UDP packets to each other in hopes of establishing a handshake
- They time-out due to no data being received from their objNetPlayer peers
- Alice did the same for Bob and Carol, as Master did for Alice and Bob, yet Bob can't receive data from Carol and vice-versa
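
The introduction and punching steps above can be sketched roughly as follows, in Python so it runs standalone. `introductions` and `punch` are hypothetical helpers; `send_probe` and `poll_reply` stand in for whatever actually sends a UDP packet and checks for a reply, and the limit of 10 attempts simply mirrors the "10/10" handshake counts in the earlier log.

```python
def introductions(existing, newcomer):
    """List the (recipient, peer_to_punch) notices the host has to send."""
    notices = []
    for peer in existing:
        notices.append((peer, newcomer))  # tell the old peer about the new one
        notices.append((newcomer, peer))  # and the new peer about the old one
    return notices


def punch(send_probe, poll_reply, max_attempts=10):
    """Probe until the peer answers; return attempts used, None on timeout."""
    for attempt in range(1, max_attempts + 1):
        send_probe()
        if poll_reply():
            return attempt
    return None


print(introductions(["Bob"], "Carol"))

# Fake transport: the peer only becomes reachable after our third probe,
# as if the earlier probes were dropped while the other side was not ready.
sent = []
print(punch(lambda: sent.append("probe"), lambda: len(sent) >= 3))
```

Both sides would run `punch` towards each other at roughly the same time, since a probe sent before the other side has started sending may simply be dropped.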

Hope this clarifies everything. Thank you very much for helping out!
 

PNelly

Member
The way your events are ordered makes sense and, as far as I can tell, ought to work if you're sharing the right connection data. Could you explain a little more about your specific network setup? Where are the different peers located, and are they behind their own NATs? Where did you do the Wireshark captures?

I'm seeing these as the public-facing IPs and ports from the successful connections to the host player:
Daki (session host) - 89.172.243.104:6510
Ichiban (session client) - 213.149.61.212:52274
Pizda (session client) - 79.227.156.73:49668

When I look at your two screen captures of Wireshark data, I believe I'm seeing Pizda's local IP and port as 192.168.0.177:49668. If that's the case then I think something is amiss, because Pizda is recorded as having port 49668 as both his private and public-facing endpoint. You would usually expect those two numbers to be different if Pizda is communicating through a NAT and the Wireshark capture happened within Pizda's local network.

(Woohoo post number 250!)
 

Fanatrick

Member
Sorry for the late reply, I've managed to fix all the issues over the past couple of days. Some of it was one of our NATs being *outdated*, which limited what we could scan, and some of it I can't seem to pinpoint. The game was properly receiving packets from a peer, but from a different port than the one that was broadcast to it. This only occurred on that specific node, and what I did was simply keep the overwritten port and receive data from there. I also tinkered with other things, like keeping the master connection open in order to sync the peers properly, and trying a brute-force approach starting from the first broadcast port (in our tests, the port we could actually communicate on was usually the broadcast port + n, with 0 < n < 100). So far we haven't been able to break it, and it works for all of us no matter who the host is. I still need to test this further, and hopefully we can try other NAT configurations soon.
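
The two workarounds described here could look roughly like this in Python. This is a sketch under assumptions, not the actual fix: `candidate_ports` and `learn_endpoint` are made-up names, and matching a sender by IP alone is a simplification.

```python
def candidate_ports(advertised, span=100):
    """The advertised port first, then advertised + n for 0 < n < span."""
    return [p for p in range(advertised, advertised + span) if p <= 65535]


def learn_endpoint(known, packet_source):
    """Adopt the port a peer's packets really come from, if it differs."""
    known_ip, known_port = known
    src_ip, src_port = packet_source
    if src_ip == known_ip and src_port != known_port:
        return (src_ip, src_port)  # keep the port we actually heard from
    return known


ports = candidate_ports(49668)
print(ports[0], ports[-1], len(ports))

peer = ("79.227.156.73", 49668)
print(learn_endpoint(peer, ("79.227.156.73", 49671)))
```

Trusting any packet from the right IP is fragile when two players share a public address, so carrying a session token in the handshake and matching on that would be a safer way to decide whose port has changed.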

Thanks for the help!
 