Has anyone ever tried migrating a game server from one machine to another, to get more or fewer resources depending on load? AWS offers live migration for both Xen and KVM VMs. It's sometimes used when a multithreaded web server needs more threads.
Live migration uses a clever trick. First, the hypervisor marks all memory pages as dirty (un-copied). Dirty pages then start to be copied, over the local network, from the source machine to the target machine, and each page is marked clean once it has been copied. It's a lot like writing pages out to swap space on disk. Meanwhile, the program is still running and writing to memory, which makes already-copied pages dirty again. There's thus a race between the copier, which turns dirty pages into clean ones, and the program, which turns clean pages back into dirty ones. The endgame is freezing the program, copying the remaining dirty pages, and starting up the copy on the target machine. All networking and file connections are of course shifted over. The freeze period may be as little as a few hundred milliseconds.
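To get a feel for that race, here's a rough back-of-the-envelope model of the copy rounds. This is only a sketch under my own assumptions, not how any particular hypervisor implements it: the function name `simulate_precopy`, its parameters, and the numbers in the example are all made up for illustration, copy and dirty rates are treated as constant, and every write is assumed to hit a different page (the worst case).

```python
def simulate_precopy(total_pages, copy_rate, dirty_rate, stop_threshold, max_rounds=30):
    """Toy model of iterative copying before the final freeze.

    total_pages     pages of guest memory
    copy_rate       pages per second the copier can send (assumed constant)
    dirty_rate      pages per second the running program re-dirties (assumed constant)
    stop_threshold  freeze the program once this few dirty pages remain
    max_rounds      give up iterating and freeze anyway after this many rounds

    Returns (rounds, pages_left_for_freeze, freeze_seconds).
    """
    dirty = total_pages  # everything starts out dirty (un-copied)
    rounds = 0
    while dirty > stop_threshold and rounds < max_rounds:
        # Copying `dirty` pages takes dirty / copy_rate seconds; in that time
        # the program dirties dirty_rate * (dirty / copy_rate) pages.
        # Integer arithmetic avoids floating-point rounding surprises.
        dirty = min(total_pages, dirty * dirty_rate // copy_rate)
        rounds += 1
    # The program is frozen while the last dirty pages are copied.
    freeze_seconds = dirty / copy_rate
    return rounds, dirty, freeze_seconds


if __name__ == "__main__":
    # Hypothetical numbers: about 4 GB of 4 KB pages, about 1 GB/s of copy
    # bandwidth, and a program that dirties a tenth as fast as the copier copies.
    rounds, left, freeze = simulate_precopy(
        total_pages=1_000_000,
        copy_rate=250_000,
        dirty_rate=25_000,
        stop_threshold=5_000,
    )
    print(f"rounds={rounds} pages_left={left} freeze={freeze * 1000:.1f} ms")
```

With those made-up numbers the dirty set shrinks by 10x per round and the final copy is a few milliseconds, not counting the time to switch over network and file connections. If the program dirties pages as fast as the copier can send them, the loop never converges and the freeze has to cover all of memory, which is the case I'd worry about for a busy game server.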
This is a standard AWS offering, not some experimental thing. It's been in use for years.
The question is, has anyone tried doing this underneath a running game server? And, if they have, how long was the “freeze” period? Did it work out?
I'm thinking about this as a strategy for big-world systems which need to apply more resources to areas with more load.