Network
From Aleph One Wiki
Networking with Marathon
Return to History
By Woody Zenfell
In network play, one machine (initially the gatherer) is designated the "server". It runs NetServerTask (in a separately-scheduled task/thread) approximately 30 times a second. (Let's call this the "tick rate" for ease.) All other machines instead run NetQueueingTask (in a separately-scheduled task/thread) at the tick rate.
NetServerTask and NetQueueingTask accumulate action_flags (obtained from parse_keymap()) in the so-called "local_queue" - each machine in the game has exactly one, holding action_flags from the local user.
All machines run NetDDPPacketHandler (in a separately-scheduled task/thread) as quickly as possible upon receiving a network packet.
(In the discussion that follows, I will ascribe behaviors of functions called by these three major functions to the major functions themselves. For example, much work I claim is done by NetDDPPacketHandler will actually be accomplished by a sub-function like NetProcessIncomingBuffer.)
At any time, there is one ring packet carrying data around the ring. That is, the server sends a packet to its "upring" neighbor, who (in NetDDPPacketHandler) sends it to its upring neighbor, etc. until it works its way back around to the server. (Upring/downring often seems backwards to me, but if I think of uploading and downloading it helps.)
If the server has not accumulated any action_flags since it sent the ring packet (i.e. the ring took less than a game-tick to go around), NetDDPPacketHandler holds the packet until the next NetServerTask. NetServerTask then collects its one action_flag and sends the packet off. Otherwise, NetDDPPacketHandler functions much as it does on the non-servers - it takes flags from the local_queue and ships them off in the packet. In all cases, the server sets the timing by decreeing a certain number of required_action_flags for the new ring-cycle, based on how many flags have accumulated (and thus how much time has elapsed, assuming scheduling is doing its job).
Each non-server, on receiving a ring packet, moves its own flags from its local_queue into the network packet. If it has more action_flags than are required, it "ditches" the extras to avoid having stale input data taking up queue-space (and introducing lag in the player's movements). If it doesn't have enough action_flags, it "smears" to make up the difference. Smearing is merely calling parse_keymap() as many times as is necessary - in rapid succession - to obtain the needed flags. (IIRC ditching might be slightly more clever than this - it might, e.g., make sure it has extra data for at least two rings before ditching, so that transient lag bubbles don't result in unnecessary ditching and subsequent smearing.)
So ideally, if the ring takes N game-ticks to go around, then by the time it gets back to the server, the NetServerTask has accumulated N action_flags in the local_queue. It passes all N of these along and sets required_action_flags (which is communicated in the packet) equal to N. Assuming everyone's clock is in sync and there aren't any little bubbles in the network or in scheduling, every non-server has also accumulated N action_flags in its local_queue, thanks to NetQueueingTask. So ideally, everyone's local_queue has exactly required_action_flags in it when the packet arrives, and is completely emptied when the packet departs. (Bubbles and clock drift remove the "ideal" quality here, which is why ditching and smearing are needed to compensate.)
Two other things happen each time a ring packet comes in:
action_flags are copied from incoming ring packets to the RealActionQueues (each machine has a RealActionQueue for each player in the game) when a packet arrives, in NetDDPPacketHandler. Effectively, a machine overwrites the action_flags it sent the previous ring-cycle with its new set. Also note (importantly) that a given machine's action_flags don't get into the local player's RealActionQueue until the ring has been around once to collect the input, then around once more to deliver the flags.
An acknowledgement packet is sent downring (to the machine that sent the ring packet we received). Sequence numbers are used to detect duplicated packets. If a machine does not receive an acknowledgement within about 40ms of sending the ring packet, it tries resending in case the packet was lost. So in fact, if an ACK is lost somewhere, there could (briefly) be more than one ring packet on the network. But a machine receiving an outdated ring packet will know to merely ACK it - but not forward it, since it's already passed along one copy of the outgoing packet - so these replicate packets will die out rapidly, and won't confuse anyone. If too much time elapses (too many resends) without receiving an ACK, the machine decides its upring neighbor has died and reconstructs the ring without him. If the upring neighbor was the server, the machine waiting for ACKs names itself the new server, and play continues.
In a single-player game, none of the Big Three (NetServerTask, NetQueueingTask, or NetDDPPacketHandler) run; instead, input_controller runs (separately scheduled, IIRC) at the tick rate to capture input and queue it up.
(Actually hmm it seems that in the SDL version, input_controller is run (perhaps multiple times, if needed to catch up) in the main thread before the updates happen. I guess this is tantamount to smearing each time rendering takes more than one tick? no big deal, but it could perhaps reduce "accuracy", making the player's behavior "jumpy" in low-frame-rate environments.)
Finally, the main thread, which does game updating and rendering, watches the RealActionQueues. When it has one or more ticks' worth (let's say N flags) of data for each active player it runs N game-update operations to move all the players and monsters and platforms etc. around. Then, it renders a frame based on the resulting configuration.
Note that if the frame renders in less than one tick's time, the main thread ends up waiting for NetDDPPacketHandler or input_controller to provide data (which should happen, ideally, once per tick). This is the origin of Marathon's (and A1's) hard limit of approximately 30fps rendering. If a frame takes multiple ticks to render, it has a few ticks' data built up in the RealActionQueues, so update_world runs the various update routines multiple times to catch up before rendering the next frame.
There's one complication to the above. If a ring takes, fairly reliably, N ticks to go around, then each machine will get a delivery of N flags every N ticks. This means nobody will be able to render the game any more smoothly (at a higher frame rate) than one frame per N ticks (even if the machine is fast enough to render one frame every tick). So there's a mechanism in place (via NetGetNetTime) to limit the update loop's ordinarily voracious consumption of action_flags. The idea is that when we get a shipment of N flags, we update and render only the first tick's worth (assuming we can render in less than a tick's time - if not, the scheme still works correctly). The following real-time tick, the next tick's flag is used to update and render another tick, etc.... and when we run out of ticks, we should be due to receive a new shipment from the network. This means that although we have all the information to render N ticks into the future as soon as the packet arrives, we purposely delay that data up to an additional N-1 ticks to keep the rendering smooth - clearly a tradeoff. In Bungie's code, you choose how much to delay the data by choosing an item from a menu in Setup Network Game; I've implemented a mechanism that tries to adapt dynamically (NETWORK_ADAPTIVE_LATENCY_2).
It seems heartbeat_count, in single-player games and in film playback, serves a function similar to NetGetNetTime, but I have not analyzed it, and can't offhand think of any reason it would be needed, when it would seem that filling the RealActionQueue (or possibly queues, in playback) at the appropriate rate would automatically result in proper timing.
