- Floating point determinism is bloody hard, but not impossible, to achieve
- Syncing between x86 and x64 code is a nightmare
- SSE2 is pretty much everywhere, so configure your compiler to use it
- If you have to interact with the FPU, use /fp:precise (in Visual C++) and set the floating-point control bits to force 24-bit mantissas (i.e. IEEE single precision)
- Starting with non-deterministic code is a recipe for immense frustration. Before screwing with floating-point determinism issues, start with all your code running on the same machine with the same instruction set and make sure it's deterministic at the algorithm level before diving into the ugly parts
- Divide-and-conquer is essential. Narrow down on one area that's full of synchronization problems and fix it, then move on to other areas
- In the same vein, having a way to feed the same exact input into multiple runs of the code (across machines etc.) is so useful you don't want to live without it
- Beware compiler optimizations; you may spend a lot of time rooting around in icky FPU or SSE2 code figuring out why exactly things differ across builds, for instance. Know how to suppress (or encourage!) your compiler's optimizations selectively so you can get better consistency
- Work from the bottom layers upwards. Get consistent simulation results for, say, your physics before worrying about higher-level logic bugs, especially latency-related issues. Nothing sucks like banging your head on what seems to be a logic bug when it's just your physics being snarky
Might update with more as time goes by, or again as above, I might bother to actually write up the reasoning behind all this. For now I'll just dump it here and see if anyone finds it interesting :-)