
boost::asio causing seemingly random crashes

Started by May 21, 2017 09:13 PM
15 comments, last by JackOfCandles 7 years, 5 months ago

I would also recommend inspecting the code very carefully for bad use of memory and pointers. Just from those screenshots above there are far more 'new' and 'delete' statements than I am used to seeing in modern programs, which implies this is quite dangerous code. I certainly wouldn't expect 2 'new' calls in an input-polling function. Consider whether these can be changed to local or member variables, and if they do have to be dynamically allocated, consider using standard library containers such as vectors, or using smart pointers to manage the lifetime.
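As a rough illustration of that advice, here is a minimal sketch of an input-polling function that returns its results by value in a std::vector, plus a std::unique_ptr for the cases where dynamic allocation really is needed. The names InputEvent, pollInput and makeEvent are made up for this example; the real code in this thread was only shown in screenshots, so this is not a rewrite of it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical event type, invented for this sketch.
struct InputEvent {
    int key;
    bool pressed;
};

// Returns the events by value. The vector owns its storage, so there is
// no 'new' or 'delete' and nothing to leak on an early return.
std::vector<InputEvent> pollInput(const std::vector<int>& rawKeys) {
    std::vector<InputEvent> events;
    events.reserve(rawKeys.size());
    for (int key : rawKeys) {
        events.push_back(InputEvent{key, true});
    }
    return events;
}

// If an object must live on the heap, unique_ptr frees it automatically
// when the owning pointer goes out of scope.
std::unique_ptr<InputEvent> makeEvent(int key, bool pressed) {
    auto ev = std::make_unique<InputEvent>();
    ev->key = key;
    ev->pressed = pressed;
    return ev;
}
```

Callers simply keep the returned vector or unique_ptr for as long as they need the data; there is no matching cleanup call to forget.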

I just wanted to update this, in case anyone comes across it in the future looking for answers. I'm pretty sure I know what the problem was, though I was unable to verify it for certain. I was using the same port number for both the client and the server. So when the client and the server were running on the same machine, it would sometimes be writing to both at the same time, causing them to step on each other. I believe this would explain all of the symptoms I was experiencing: the fact that it only happened when running both on the same machine, and the fact that it only happened during the initial synchronization process (all other types of communication happen in a more ordered manner). As I said, I wasn't able to verify this for certain, because I discovered it in the process of switching from boost::asio to SDL_net, and I would have had to spend a ton of time reverting a lot of changes (my reasoning was that SDL_net is more lightweight, and I didn't really need the advanced functionality offered by boost::asio). To think, such a minor, stupid mistake is probably the cause of such a massive headache. Ugh...

 

Thank you for all of the advice offered in this thread, it is much appreciated!

 

Quote

I was using the same port number for both the client and the server. So when the client and the server were running on the same machine, it would sometimes be writing to both at the same time, causing them to step on each other.

 

That doesn't sound quite right to me.

If you use REUSEADDR and/or REUSEPORT, more than one socket can send from, and receive from, the same port number. The kernel will allocate incoming data to one of the sockets in some way that is not usually deterministic. However, nothing will "step on" anything else in this setup. It's totally supported by the kernel API and system libraries.
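For reference, the options mentioned above are the SO_REUSEADDR and SO_REUSEPORT socket options in the BSD/POSIX sockets API, set with setsockopt() before bind(). The following is a minimal sketch assuming a POSIX system and plain UDP sockets; the helper names openSharedUdpSocket and boundPort are invented for this example, and the exact sharing rules differ between operating systems.

```cpp
#include <cassert>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Opens a UDP socket on 127.0.0.1:port with port sharing requested.
// Returns the descriptor, or -1 on failure. Port 0 lets the kernel choose.
int openSharedUdpSocket(unsigned short port) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    int on = 1;
    // Both options must be set before bind() to have any effect on it.
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
#ifdef SO_REUSEPORT
    // Not available everywhere, so a failure here is ignored in this sketch.
    (void)setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
#endif

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Reports which local port a bound socket ended up on (0 on error).
unsigned short boundPort(int fd) {
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

With this, a second call to openSharedUdpSocket() using the port reported by boundPort() for the first socket can succeed, where it would normally fail with "address already in use". Which of the two sockets then receives a given datagram is up to the kernel.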

Maybe what would be happening would be that one of your programs (client, for example) would send a message it thought was to the server, but the kernel would turn around and hand it right back to the client, who would then not process it correctly.

If that is the case, and causes your program to crash, then your program has a remotely exploitable bug and running your game would expose you to possible shenanigans from the greater internet. No matter what kind of data arrives on a socket, and in what order, your program should never mis-process the data or crash. For random noise, you should detect that you don't know what the packet means, and stop processing it early. For packets that seem correctly formed but have the wrong "meaning," you should detect that they don't make sense in the context ("server packet received on client") and log the problem and ignore the packet. Similarly, if packets contain fields that are too short, or too long, for the intended usage, you should detect this, mark the packet as corrupt, and stop processing. If you ever let data under the control of a remote computer point your program at uninitialized data, or past the end of the packet, or into a part of your program that hasn't been initialized yet, your program has a remote code execution vulnerability.
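The checks described above could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the 7-byte little-endian header layout, the protocol ID value, the maximum payload size, and the two packet types are all hypothetical and are not taken from the code discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wire format: 4-byte protocol ID, 2-byte payload size,
// 1-byte packet type, then the payload. All fields little-endian.
constexpr std::uint32_t kProtocolId = 0x47414D45;
constexpr std::size_t kHeaderSize = 7;
constexpr std::size_t kMaxPayload = 512;
constexpr std::uint8_t kClientInput = 1;     // only valid when received by the server
constexpr std::uint8_t kServerSnapshot = 2;  // only valid when received by the client

enum class Role { Client, Server };
enum class Verdict { Ok, TooShort, BadProtocol, BadLength, UnknownType, WrongDirection };

// Validates a received datagram before any field is used for processing.
Verdict validatePacket(const std::uint8_t* data, std::size_t len, Role receiver) {
    // Random noise or a truncated packet: stop before reading any field.
    if (data == nullptr || len < kHeaderSize) return Verdict::TooShort;

    const std::uint32_t protocolId =
        static_cast<std::uint32_t>(data[0]) |
        (static_cast<std::uint32_t>(data[1]) << 8) |
        (static_cast<std::uint32_t>(data[2]) << 16) |
        (static_cast<std::uint32_t>(data[3]) << 24);
    if (protocolId != kProtocolId) return Verdict::BadProtocol;

    // The declared size must fit the limit AND match the bytes that arrived,
    // so later code can never be pointed past the end of the packet.
    const std::size_t payloadSize =
        static_cast<std::size_t>(data[4]) | (static_cast<std::size_t>(data[5]) << 8);
    if (payloadSize > kMaxPayload || payloadSize != len - kHeaderSize) {
        return Verdict::BadLength;
    }

    // Well-formed packets can still be meaningless in context.
    switch (data[6]) {
        case kClientInput:
            return receiver == Role::Server ? Verdict::Ok : Verdict::WrongDirection;
        case kServerSnapshot:
            return receiver == Role::Client ? Verdict::Ok : Verdict::WrongDirection;
        default:
            return Verdict::UnknownType;
    }
}
```

Anything other than Verdict::Ok would be logged and the packet dropped; the payload is only parsed after the header has passed every check.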

 

 

enum Bool { True, False, FileNotFound };

I'm not sure if I am using REUSEADDR or REUSEPORT, I've not heard of those. I should note that while I said the client and server are running on the same machine, it is more accurate to say they are running in the same process. I have two GameStateManager objects and I call update on both inside the main game loop. I'm not sure if that makes a difference though. I do have some packet verification code. I generated a protocol ID that I check for, for example, and there is a packet header which contains the size to verify it is not greater than the max size, but there is no distinction between a packet for a client and for the server. Well that's not true actually, I store a packet type value, which can be used to implicitly determine if it is a type meant for the client or for the server, and it would get ignored in the switch statement in the processPacket() function.

You are saying that the client and server are using the same port. You cannot use the same port for two sockets on the same computer unless you turn on some feature to allow sharing of ports.

Is it perhaps the case that your program is using a single socket, for both client and server, when running inside the same process? If so, there is no way you can reliably pass messages back and forth between client and server.

Is it perhaps the case that the client and server GameStateManager use the same socket object, but run in different threads? If so, they could totally have thread-unsafe code in them, that would cause corruption in your address space, totally separately from what networking you're actually using.

It sounds to me as if you've bitten off a slightly bigger chunk than you can reasonably chew at this point -- you're using advanced, asynchronous libraries, networking, and multiple logical processes in a single physical process, all of which are pretty advanced concepts and require significant experience to get right. You may find that you're better off if you simplify the code, as follows:

  1. Only run one client or one server in one process. It's OK to compile in the code for both, as long as you only activate one, perhaps based on command line options.
  2. For a server, create a socket, and bind it to a known port, on address INADDR_ANY.
  3. For a client, create a socket, and do not bind it; instead use sendto() to send the data to the server.
  4. To poll your network for incoming data, set the socket to non-blocking when you create it, and simply call recvfrom() in a loop until there are no packets to dequeue.
  5. This means your code can use a single thread per client and per server.
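The steps above can be sketched with plain POSIX UDP sockets as follows. This is only an outline under stated assumptions: the helper names openUdpSocket, sendDatagram and drainSocket are invented for the example, the 1500-byte receive buffer is an arbitrary choice, and Windows would need the Winsock equivalents (for instance ioctlsocket() instead of fcntl()).

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <vector>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Creates a non-blocking UDP socket. A server binds it to INADDR_ANY on the
// given port; a client leaves it unbound. Returns -1 on failure.
int openUdpSocket(bool bindAsServer, unsigned short port) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    if (bindAsServer) {
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

// Sends one datagram with sendto(); the kernel picks the client's source port.
bool sendDatagram(int fd, const sockaddr_in& dest, const std::string& msg) {
    ssize_t n = sendto(fd, msg.data(), msg.size(), 0,
                       reinterpret_cast<const sockaddr*>(&dest), sizeof(dest));
    return n == static_cast<ssize_t>(msg.size());
}

// Called once per frame: dequeues every datagram that is already waiting and
// returns immediately when the queue is empty, so no extra thread is needed.
std::vector<std::string> drainSocket(int fd) {
    std::vector<std::string> packets;
    char buf[1500];
    for (;;) {
        sockaddr_in from{};
        socklen_t fromLen = sizeof(from);
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             reinterpret_cast<sockaddr*>(&from), &fromLen);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            break;  // EAGAIN/EWOULDBLOCK means empty; other errors end this poll too
        }
        packets.emplace_back(buf, static_cast<std::size_t>(n));
    }
    return packets;
}
```

In a game loop, the server or client would call drainSocket() once per update and hand each returned packet to its packet-processing function. The sender address filled in by recvfrom() is discarded here for brevity; a real server would keep it to know where to reply.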

With the requirements of ASIO out of the way, and the requirements of multi-threaded code out of the way, and the requirements of shared client/server in the same process out of the way, this should make it possible for you to concentrate on the networking code, and make it easier to reproduce and debug whatever problems you're having.

enum Bool { True, False, FileNotFound };

I did already remove boost::asio from the equation, replacing it with SDL_net (and set a separate port for client and server), and I am no longer having any problems. That's not necessarily conclusive that the problem was because of the boost::asio stuff, but that's what I'm leaning towards. As it stands, everything seems to be working correctly at this point.

This topic is closed to new replies.
