Deterministic Behavior

Sorry, no new binary release. The last one can be found here.Sorry, no new source release. The last one can be found here.

After a little over 80 minutes of running the 'game' on my desktop a lot of bouncing between balls (smileys) occurred. Actually I cheated a little by fast-forwarding it, I don't really like to wait 80 minutes for making a print screen. I also let the game run at a crawl just to make a print screen exactly at render state #30000. On this computer I turned VSync on so it rendered at a steady 60 FPS:

Now on my laptop I've done the same thing, but this time no VSync and just for the fun of it I also disabled partial thread synchronization.

Notice the difference? That's right, there is no difference :P. This is called deterministic behavior and it's a really nice feature to have when you want to implement network-support or replay-files. Determinism is only possible if you have a fixed update FPS, which makes it so complicated in variable FPS scenarios. Most physics engines only support determinism when the time-step is fixed! It is also possible to implement network-support and replay-files without determinism, but it will be a hell lot more difficult and a lot more bloaty.

Networking

Adding networked multiplayer to your game is difficult! One of the biggest problems we need to overcome is synchronization between the client and the server while not letting the client have a constant latency in everything he does. Lets say we have a simple 3D shooter with 2 players, playing it against each other over the internet. The following happens:

  • Player #1 pushes a button to  activate the bridge. The bridge is now extending, or at least on player #1's computer.
  • From the server's point of view player #2 has thrown a grenade towards player #1, which exploded just before player #1 wanted to press the button. Because of latency Player #1 did not yet received the message that he died :(.
  • A fraction of a second after Player #1 thought he pushed the button he explodes into a million pieces.
  • Player 1 respawns and the question we ask ourselves is: "On player #1's computer, is the bridge still extended?"

We have come to a situation where the game state on the client is no longer in sync with the one from the server (and the one from Player #2) which might lead to hilarious bugs :).  So lets look at a couple of solutions to fix this desynchronization.

  • The drawbridge will not extend immediately after pressing the button. The server decides if the drawbridge should extend and sends a message to the client "drawbridge is extending" to enforce synchronization.
    It is possible to let the server verify all actions  that the player makes, but this will result in an annoying perceivable latency. When throwing a grenade, do you we also wait till the server says "Grenade thrown at position X"? That would be so annoying when you're playing the game, we like an immediate response!
  • All objects in the game that can be modified while playing we call the "game state". We send this game state to the clients which will overwrite the game state that the client's think they have.
    That will work. We have an immediate response and possible errors in the game state will automatically be fixed when it is overwritten by a game state received from the server. For an FPS this will probably work, but what should we do for games that have huge game states? It would be unreasonable to send multiple MB's over the network for each update isn't it?
    • For games that have huge game states we could just split the game state in sections. Each area in the game has it's own partial game state so in case of an inconsistency we just need to send the partial game state of that area.
      Okay it's an ad-hoc solution, but it might work. So how does the server know what client-region have become invalidated?
      • Hmm maybe by keeping track of hashes of the game states, the client sends to the server "At frame #4322 my gamestate hashes are ..." and if the server detect that these hashes are wrong he'll send the appropriate partial game state.
        Nice one, but that will only work when the game is deterministic! If the game is non-deterministic you will expect that every state is a little bit different because things like objects in motion might have a wrong position if rendered at a different FPS!
    • We'll just send the game state of the area around the player, not the whole world. That should reduce the size quiet a bit!
      You are right that will probably reduce the state to reasonable sizes. What if the player is able to zoom out as far as he likes? (e.g. Supreme Commander zoom?)

A deterministic game makes networking so much better! It makes it possible to only push changes towards the client, because the client will handle object exactly the same as the server. It is still possible that a latency error occur (which can be easily detected in a deterministic game), but we only need to send a new partial game state to the client! Implementing something like this in a non-deterministic game will most certainly lead to all sort of exotic code and bulky network protocols, while we are able to create a relatively easy generic solution if it was deterministic. Pretty cool right?

Replay Files

This is almost too trivial to talk about. Okay let's keep it short.

For a deterministic game we only need to collect all user performed actions, because when these actions are replayed we are certain that the game looks exactly the same. Replay files from previous versions might not work anymore, but that isn't really a big deal.

For a non-deterministic game we need to store the game state at every step of the game (frequently enough to have smooth transitions). This might seem terrible, but it is not as awful as it seems. Each incremented frame the state has only changed a little bit (probably, depending on the type of game). By using smart compression techniques we are still able to reduce the replay-file to reasonable sizes. Generating these replay files will obviously cost a quiet a bit of computational power(especially for big game states) and to make this more efficient you probably need to write game-specific code to make this as fast as possible.

Some concerns

  • What about randomness? Let's say we have a chicken's AI: 20% of the time picks the ground, 80% of the time walks to a spot nearby. Randomness and determinism don't go well together, right? And randomness could be used in all sorts of procedures!
    True, but a random function on a computer is not actually random! If we set the seed to '0' and we start generating number we may get "3424, 43, 35953, 87, ...". I we reset the seed to 0 again and start generating numbers again it'll be "3424, 43, 35953, 87, ...". The same! So not truly random! Given that you use the same random function these numbers will even be the same across different platforms! If we make sure that the random-state is also synchronized then we can ensure a fixed order of pseudo-random numbers. We can ensure synchronization by letting the client execute the random function in the same order.
    • And multithreading? If we cannot guarantee the order in which the random numbers are collected everything may get messed up!
      For multithreading (or other forms of unordered processing) we need to nest random-states. If we cannot guarantee a certain order for processing the chicken  then the chicken needs to get his own random-object. This random object will be initialized with a certain seed given by its parent random-object.
  • What about floating-point operations? Math with floating points is not consistent across different type of CPU's
    On the same computer a certain floating-point operation will always have the same outcome. This is not the type of determinism we want, because the main advantage for being deterministic is having a very cool low-bandwidth network solution. Floating point operations will be problematic for determinism if we support networking across different platforms. So if we develop this engine for the XBOX 360 (a closed network environment), we don't have to worry about this. If we want to be more extravagant we'll have to make sure that our compiler flags are set up properly (this one), but I'm doubting whether that'll fix the problem. For C/C++ things may get a little more complicated (more architectures to support), as this beautiful written blog posts describes. Usually cross-platform floating-point errors get solved by applying fixed point math, which is certainly not a bad solution. Fixed point math can also be used in tons of optimization algorithms like culling, as this post describes, so it wouldn't be a waste to implement something like this. A reasonable sized RTS (or any other networked game with tons of position updates) without fixed point math is likely to have immense lags when played over the internet.
    • And what about floating point operations on a GPU?
      Ensuring equal results of floating-point operations on different GPU's is pretty much a lost cause. We don't really mind though, because we only use it for rendering. If we were to use GPU-based physics (e.g. nvidia's PhysX) than we do have a problem, because then we are no longer able to ensure determinism! So GPU based physics should only be used to increase eye-candy, not for actual game state updates.
    • What about the order of execution? With floating points A+B+C is not necessarily equal to A+C+B, even when fixed-point math is used.
      As with every calculation that changes the game-state we need to ensure order. So we can't blindly use Collection<T> or Dictionary<TKey, TValue>, instead we need to look at ordered lists or sorted dictionaries. The comparison function used may not act on pointers, but at least we don't have to worry about sorting algorithms being stable or not. For C++ we need to worry about different compilers deciding to change the order of execution and for C# we need to worry about the JIT who at run-time compiles code. The JIT in itself is non-determinsitic, but I don't know if this also influences the order of execution.
Multithreaded Renderloop - 2Multithreaded Renderloop - 3

line