Network Thread & Message Queue?

15 comments, last by Bluntest 3 years, 10 months ago

Hello everyone,

I'm struggling with some infrastructure logic, and hopefully you guys can give me a hand.

Given a single-threaded approach to networking, sending and receiving happen in the same loop (I'm using ENet). The idea is that my game thread pulls from a message queue filled by receive events, and a loop inside my network thread pulls messages out of a queue filled by my game thread.

Pseudocode:

NetworkThread()
{
	Messages[] messagesToSend;
	while(true)
	{
		Poll(event);

		if(event type == Receive)
			Game.Messages.Enqueue(event);

		//this loop is going to block my entire network until its done
		while(m = messagesToSend.dequeue())
		{
			send(m)
		}

	}
}

GameThread()
{
	Messages[] msgs;
	Player[] players;

	while(true)
	{
		while(m = msgs.dequeue())
		{
			//do something with message
			if(m is PlayerMovedPacket)
			{
				players[m.id].Move(m.position);
				players[m.id].isDirty = true;
			}

		}


		//with 200 players connected this is going to add 40,000 packets to the message queue to send out per tick
		if(100ms has passed) // tick, send move packets out to everyone
		{
			foreach(Player sender in players)
			{
				if(sender.isDirty) //if player has moved
				{
					Packet packet = PlayerMovedPacket(sender);

					//send 'sender' movement to all other players
					foreach(Player receiver in players)
					{
						Network.messagesToSend.enqueue(receiver,packet);
					}
					sender.isDirty = false;
				}
				}
			}
		}
	}
}

Given the example above, my send loop has to send every packet out in less than the 100ms tick. Otherwise the message queue will just build up and the send loop will never finish, which blocks the entire network.

One solution I've thought of is not having my game thread add any more 'tick' related messages to the queue while the send loop is running, but I'm not even sure that's a good solution.

Thanks


Have you measured packets taking a long time to send?

In general, “sending” just means “move to a buffer in the network interface in the kernel.” It doesn't actually wait for the bits to be on the wire.

enum Bool { True, False, FileNotFound };

I don't know what you mean by that.

I wrap my send loop in a timer and see how long it takes to execute.

Yeah, the only place I've seen long send times is baremetal embedded (and even then you can generally switch to a DMA mode and be fine). Send calls should be quick, and 100ms is a huuuuge amount of time in context.

If you're doing 40,000 send calls though, maybe you are just hitting overhead. You can do some profiling and see if batching those 40,000 messages into fewer, larger messages is faster.

A few things that came to mind since you are using ENet:

What does “send(m)” in your pseudo code correspond to in your actual code? If it is a call to enet_peer_send(), keep in mind that this function does not actually send anything onto the wire, but rather queues a packet inside ENet to be sent next time enet_host_service() or enet_host_flush() is called. This means that enet_peer_send must allocate memory for your packet which is why calling it 40 000 times might cause a slowdown.

In any case sending that many individual packets is going to be problematic, even if ENet can merge packets together. What you'll want to do is merge data yourself and send it in larger batches.
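To make the "merge data yourself" suggestion concrete, here is a minimal Python sketch that is independent of ENet. The 12-byte record layout, the function names, and the `send` callback are all assumptions made for illustration and are not from the poster's code. The idea is to pack every dirty player's update into one buffer per tick and queue that buffer once per receiver, rather than once per (sender, receiver) pair.

```python
import struct

# Hypothetical wire format for one movement update: player id plus x, y
# (little-endian uint32 + two float32 = 12 bytes). Not taken from the thread.
UPDATE = struct.Struct("<Iff")


def build_tick_payload(dirty_players):
    """Pack every dirty player's (id, x, y) update into one bytes object."""
    count = struct.pack("<H", len(dirty_players))
    body = b"".join(UPDATE.pack(pid, x, y) for pid, x, y in dirty_players)
    return count + body


def parse_tick_payload(payload):
    """Inverse of build_tick_payload, as a receiving client might run it."""
    (count,) = struct.unpack_from("<H", payload, 0)
    return [UPDATE.unpack_from(payload, 2 + i * UPDATE.size) for i in range(count)]


def queue_tick(receivers, dirty_players, send):
    """Call send(receiver, payload) once per receiver, not once per update."""
    payload = build_tick_payload(dirty_players)
    for receiver in receivers:
        send(receiver, payload)
```

With N players this turns N^2 calls into N calls per tick. A payload larger than the MTU would still have to be split into chunks or left to the library to fragment, so this reduces per-message overhead rather than the number of bytes sent.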

@Archduke

The library I'm using aggregates the messages sent ( their send call doesn't actually send straight away ). 200k messages in this example would result in around 8k real send calls. I believe the messages are already batched as much as they can be without passing MTU.

@guywithbeard

The send call in my example is indeed enet_peer_send, and I'm aware it doesn't actually send right away. I just assumed ENet's packet merging would be super efficient. I'll definitely give merging packets myself a try.

If you guys know of any examples that showcase extremely fast sending of tons of small messages that would be huge to look at. ( Like 1000^2 messages )

Presumably you should be able to put your sending on a separate thread, then, which keeps a queue of messages-to-send, and works off of it.

The act of pitching the messages across that barrier to that queue should be fast. (You should be able to have an interface that takes array-of-messages, so you don't need to lock/unlock for each message.)
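A rough sketch of that array-of-messages interface, in Python for brevity. The class and method names are made up for illustration. The game thread pushes a whole tick's messages under one lock acquisition, and the network thread swaps the queue out under the lock and then sends without holding it.

```python
import threading
from collections import deque


class OutboundQueue:
    """Hypothetical hand-off queue between the game and network threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()

    def push_batch(self, messages):
        # One lock acquisition for the whole tick's worth of messages.
        with self._lock:
            self._items.extend(messages)

    def drain(self):
        # Swap the queue out under the lock; the caller sends afterwards
        # without holding it, so the game thread is never blocked on sends.
        with self._lock:
            items, self._items = self._items, deque()
        return list(items)
```

Because `drain` takes everything queued so far in one step, the network loop processes a bounded snapshot each iteration and goes back to polling afterwards, even if the game thread keeps producing.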

Still, 100 milliseconds sounds like a lot. What are you doing with all that time? What does the profiler say?

Assuming that all the work is necessary, and there's no massive inefficiency to just fix to make the problem go away, there's also the case that computers do have finite resources. At some point, you will be offered more load than you can accept. Your choice then is one of three:

  1. Keep trucking, running further and further behind, in a death spiral. This is what most software will do by default, and it's the worst option.
  2. Decide that you're irrevocably far behind, throw up your hands, and exit with an error. The operator will take care of it. (adding more server capacity or whatever)
  3. Shedding load – kicking the players who joined last, temporarily removing the ability to log in for new players, and generally “do less” until the load abates.

Having good load shedding is generally the most robust way to do it, as long as you also have good operational awareness so you know when it happens, because it will feel bad to the people who get kicked/can't log in, so you want to not stay in that mode for a long time.

Implementing 2. is usually quite simple and also very effective. When the server goes down, the “overload” problem takes care of itself, and it's generally a visible enough event that you don't run the risk of missing it. Everybody who was playing will be in the “I got kicked” bucket, though, so it's obviously more disruptive for players, especially if it happens with any kind of frequency.

Don't do 1.
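As a very rough illustration of option 3, a load-shedding policy could be driven by how long each tick takes. Everything here is hypothetical: the class name, the 100 ms budget, the "five slow ticks" threshold, and the "kick the newest 10%" rule are placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class LoadShedder:
    """Hypothetical policy; the budget and thresholds are illustrative only."""
    tick_budget_ms: float = 100.0
    max_slow_ticks: int = 5
    slow_ticks: int = 0
    accepting_logins: bool = True

    def record_tick(self, elapsed_ms, players):
        """players is ordered oldest-first. Returns the players to kick."""
        if elapsed_ms <= self.tick_budget_ms:
            # Back within budget: recover and allow logins again.
            self.slow_ticks = 0
            self.accepting_logins = True
            return []
        self.slow_ticks += 1
        self.accepting_logins = False  # first response: stop new logins
        if self.slow_ticks < self.max_slow_ticks:
            return []
        # Still behind after several ticks: drop the newest 10% (at least one).
        self.slow_ticks = 0
        n = max(1, len(players) // 10)
        return players[-n:]
```

The game loop would call `record_tick` once per tick with the measured duration, and should also log or alert whenever logins are refused or players are kicked, so the operator knows the server is in this mode.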

enum Bool { True, False, FileNotFound };

@hplus0603

Still, 100 milliseconds sounds like a lot. What are you doing with all that time? What does the profiler say?

My profiler says calls to enet_peer_send are taking the most time. I've gathered a real data set from the most basic test I can run, to help better explain my issue.

My test results per tick average:

Send loop time: 16ms
Real send calls: 5200
Messages sent ( calls to enet_peer_send ): 160,000
Peers: 400

The logic I'm using to send is simply sending a 12 byte message 400^2 times: 400 messages to each of 400 peers.

If I were to group my packets myself, I could send out these 160k messages in fewer than 5,200 real send calls. I'm not sure if this is efficient or if ENet is already aggregating in the best way possible. I also don't know if 5,200 send calls should be taking this long, since multiple people have told me network IO itself isn't slow.
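A back-of-the-envelope check on those numbers, in Python. The 1400 usable bytes per datagram is an assumption, and per-message and per-packet header overhead is ignored, so the result is only a lower bound on how few datagrams hand-packed batching could need.

```python
import math

peers = 400                 # from the test results above
messages_per_peer = 400     # one 12-byte update about each peer
message_bytes = 12
observed_send_calls = 5200

# Assumption: roughly 1400 usable bytes per datagram, with no per-message
# or per-packet header overhead. Real protocols add headers, so this is a floor.
usable_bytes_per_datagram = 1400

bytes_per_peer = messages_per_peer * message_bytes
datagrams_per_peer = math.ceil(bytes_per_peer / usable_bytes_per_datagram)
min_send_calls = peers * datagrams_per_peer
observed_per_peer = observed_send_calls / peers

print(bytes_per_peer, datagrams_per_peer, min_send_calls, observed_per_peer)
# prints: 4800 4 1600 13.0
```

Under those assumptions the floor is about 4 datagrams per peer (1,600 in total) against the 13 per peer observed. Some of that gap is header overhead on each small message, which is one reason to pack the updates into a single message per peer.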

Shedding load – kicking the players who joined last, temporarily removing the ability to log in for new players, and generally “do less” until the load abates.

Thanks for this info. Thinking back to playing MMOs, I think this is what usually happens when trying to connect to a busy server. I will keep this in mind.

Thanks

PS: I'm already aware of the N^2 problem and area of interest. I have already implemented an AOI system that drastically reduces send calls and send time. The point of this is trying to get the most basic, raw networking setup properly ( because I don't know if this is proper ) and efficiently before moving on.

I still think something's going wrong here. It shouldn't take 100 milliseconds to mush 400 packets of 12 bytes each into a 4800 byte packet, and send, for 400 remote endpoints. That's a tiny amount of memory, and not that much for networking (unless your network card is old and slow, perhaps?)

ENet is open source; you should build it from source and profile it so you can tell where on the inside of ENet you're spending the time. For example, I'd double-check whether those 400 messages actually do get merged into a single packet.

enum Bool { True, False, FileNotFound };

Thanks for the replies @hplus0603 I really appreciate it.

The test results I posted above show that 400^2 worth of 12 byte packets takes 16ms to send. The test results also show the number of real send calls ENet is making.

I'm no networking expert so I'm sort of looking for some confirmation that these things are working correctly and efficiently or if there's more I can do before moving on with the rest of the project.

Thanks

