Hot Runtime Linking

This is an idea I’ve had for a while. Iteration times for native code are not improving much, even though we have plenty of CPU horsepower for compilation. The number one offender is link time, which is as slow and as non-parallelizable as ever.

So imagine that you didn’t have to wait for linking after changing one or more C or C++ files. Just recompile the individual object files and the game (or other application with an update loop) will pick up your changes, possibly without even restarting.

I think this could be achieved by replacing the link step with a system composed of two parts:

  1. A linker daemon (the host)
  2. A target process that loads code+data over the network from the host and then accepts updates from the daemon (the target)

The host process manages the address space of the target process. It keeps the target program’s object code and data in memory and can resolve relocations internally. When all symbol references are satisfied, it pushes the required changes into the target’s address space over a socket and tells it where to jump. The target process periodically checks for updates and “parks” its execution at a safe point where it can receive them.
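
Here is a minimal sketch of what the target side might look like. Everything in it is hypothetical (the function names, the protocol, the single-socket design); the point is just that the stub parks between frames, polls the host for a patch, applies it, and keeps calling whatever entry point the host last handed it.

    /* Hypothetical target stub loop (C). host_poll_patch and
     * host_apply_patch stand in for whatever protocol the host speaks. */
    #include <stdbool.h>

    typedef void (*update_fn)(void);

    extern bool      host_poll_patch(int socket_fd);  /* is an update pending? */
    extern update_fn host_apply_patch(int socket_fd); /* write blocks, return new entry */

    void target_run(int socket_fd, update_fn entry)
    {
        for (;;) {
            if (host_poll_patch(socket_fd)) {
                /* Safe point: no user code is on the stack here, so the host
                 * may rewrite or move code and data freely. */
                entry = host_apply_patch(socket_fd);
            }
            entry(); /* run one frame of the user program */
        }
    }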

Imagine a scenario with a program consisting of 100 object files. Here is a chain of events describing a programmer working on changes to a subsystem in a few of those object files:

  1. The programmer makes a build (but doesn’t link the program).
  2. The programmer starts the host, telling it to load all the objects. The host resolves symbols and prepares a memory image of the target program, much like a regular linker would, with the key difference that everything is kept in memory.
  3. The programmer starts the target stub, telling it to connect to the host.
  4. The target stub downloads a complete memory image and starts running a loop, calling the main entry point.
  5. The programmer changes a few source files and recompiles only those files.
  6. The host picks up on the object file changes (maybe through an explicit signal from the programmer).
  7. The host relinks the target image in memory, making sure symbols are resolved.
  8. The host synchronizes with the target to make sure it is safe to rearrange its memory image.
  9. The host garbage collects stale code and data that is no longer referenced. It may conservatively scan the target process’s memory to make sure there are no dynamically stored pointers into those blocks; if any are found, a warning is printed and the blocks are retained. They can be released on a full restart of the program.
  10. The host transmits the required changes in code and data to the target’s memory (a sketch of a possible wire format for this follows the list).
  11. The target resumes calling the main update function.
  12. The target runs the new code. Repeat from step 5.
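
To make step 10 concrete, the patch message might look something like the structures below. This is purely illustrative; the names and fields are made up, and a real protocol would also need to carry relocation patch lists and memory-protection details.

    /* Hypothetical wire format for pushing changed blocks to the target. */
    #include <stdint.h>

    typedef struct {
        uint32_t block_count;  /* number of (header + payload) pairs that follow */
        uint64_t new_entry;    /* address the target should call next frame */
    } patch_message_header;

    typedef struct {
        uint64_t target_addr;  /* where in the target's address space to write */
        uint32_t size;         /* number of payload bytes that follow */
        uint32_t flags;        /* e.g. code vs. data, executable or not */
    } patch_block_header;
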
This would be a pretty cool workflow, completely free of link-time stalls. When the time comes to prepare a final image, just run the linker as normal.

Here are a few issues that would have to be handled carefully:
  • Function pointers stored in the target’s memory would keep pointing at old versions of functions after an update. While we can scan for them and keep the old code around as long as it is referenced, that would be confusing to the programmer. The target layer could provide an opt-in callback API that lets the user program adjust its stored function pointers after each update (a sketch follows this list).
  • Relocation of code to avoid fragmenting the target’s memory too much. The host could safely compact most of the target’s memory while it is suspended at the safe update point and fix up all relocations with a patch list. It would avoid moving functions and data that still have pointers into them (pinning them), as described above.
  • Certain C++ features like static constructors would not work well without additional support.
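
The opt-in function-pointer fix-up could take roughly the shape below. None of these names exist anywhere; it is just one way the target layer could give the user program a chance to remap stored pointers after a patch is applied.

    /* Hypothetical fix-up callback API exposed by the target layer.
     * The casts assume a platform where function and data pointers
     * are interchangeable, as on typical desktop targets. */
    typedef void *(*remap_fn)(void *old_addr);   /* returns new address, or NULL if unchanged */
    typedef void  (*fixup_cb)(remap_fn remap, void *user_data);

    extern void hotlink_register_fixup(fixup_cb cb, void *user_data);

    /* Example user callback: patch a stored callback in place after an update. */
    static void (*g_on_tick)(float);

    static void fix_my_pointers(remap_fn remap, void *user_data)
    {
        (void)user_data;
        void *moved = remap((void *)g_on_tick);
        if (moved)
            g_on_tick = (void (*)(float))moved;
    }
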
One way to mitigate all these issues would be to restart the target program every time and treat it more as a network loader than a continuing service, but that would require the user program to restart as well. It would be cooler to have it update in place when there is a periodic update function, like all games have.

Let me know what you think. Is it worth building? What are some other issues that would have to be addressed?

6 thoughts on “Hot Runtime Linking”

  1. Funny, I’ve been thinking about deferred linking and similar stuff lately too…

    Seems like one of the bigger issues might be getting the debugger into the loop?

    • Yeah, “perfect” debugging will be a real challenge. It might work to create a custom debugging engine for Visual Studio or to adapt GDB, but I guess debuggers cache symbol information locally to some extent and don’t really handle stuff moving around behind their backs.

  2. That would be an awesome workflow indeed. However, fixing up function pointers, or pointers period, seems like it would be difficult, if even possible (with all the ugly casting that goes on in your typical C/C++ code base).

    What you described does remind me of something I read about Naughty Dog’s GOAL, where a newly compiled function is uploaded to the target on its own, rebound the next frame, and the old unbound functions end up getting GC’d.

    This makes me wonder if our desires aren’t implicitly indicating a need for a different language.

    Having said this, it is possible to split a code base into multiple reloadable modules (DLLs, PRXs, etc.) and formally reload them (although the main app’s code has to explicitly refresh all bindings between the app and the reloaded module), but it isn’t anywhere near the magic of updating a single function and having it automatically uploaded & used on the target, and it’s a big architectural effort. But as far as it is from your vision, it has been used with success and does save iteration time.

    I guess a decent alternative would be to have a code base that builds fast, period! πŸ™‚ I get it, that’s not the same as being able to live iterate and preserve state on code updates. πŸ™‚

    To re-iterate, sometimes I really wonder if we are writing things in the right language for 70% of our game code bases. πŸ™‚

    • Totally agree, it would make a lot of sense to design a language around this development model; in fact, Lisp is already that language. But seeing how hard it is to change people’s habits, and the astronomical cost of throwing away all existing code, that’s a luxury only startups or hobby projects can afford in practice.

      DLLs and PRXes can definitely work, but it’s sooo much boilerplate for a development-only feature, and you have to pick the module granularity on a fairly arbitrary basis. That’s why I think something like this, which does it wholesale, would be easier to actually use for any random game project.
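
      To give a sense of the shape of that boilerplate, the per-module glue on a POSIX-style platform is roughly the snippet below (dlopen/dlsym; the module path and symbol name are placeholders, and every binding into the module has to be refreshed after each reload):

          /* Sketch: reloading one module with dlopen (POSIX). Names are made up. */
          #include <dlfcn.h>
          #include <stdio.h>

          typedef void (*module_update_fn)(void *state);

          static void *g_module;
          static module_update_fn g_update;

          int reload_module(const char *path)
          {
              if (g_module)
                  dlclose(g_module);             /* drop the old code */

              g_module = dlopen(path, RTLD_NOW);
              if (!g_module) {
                  fprintf(stderr, "dlopen failed: %s\n", dlerror());
                  return -1;
              }

              /* Every pointer into the module must be re-fetched here. */
              g_update = (module_update_fn)dlsym(g_module, "module_update");
              return g_update ? 0 : -1;
          }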

      • I don’t think the boilerplate overhead for hot-swappable dlls is a problem. Care has to be taken for pointers across module boundaries, references to static data, etc., but it is definitely maintainable. Debugging also works fine for reloaded dlls.

        The granularity is the big problem. It’s hard to find good, reasonably large candidates with a not too big API that actually can benefit from hot-swapping.

        Once that is in place though, the thing that slows down iterations is not the linker, but the compiler. For the hot-swappable dll I’m currently working with (you know which project 😉 ), a small change in one file requires a 10-second compile and a 1.5-second link.

        Agree that it would be cool to have hot-swapping on an entire game-project though.
