Prior to this change, LLVM would in some cases emit *massive* writeout
functions with many 10s of 1000s of function calls in straight-line
code. This is a very wasteful way to represent what are fundamentally
loops and creates a number of scalability issues. Among other things,
register allocating these calls is extremely expensive. While D46127 makes this
less severe, we'll still run into scaling issues with this eventually. If not
in the compile time, just from the code size.
Now the pass builds up global data structures modeling the inputs to
these functions, and simply loops over the data structures calling the
relevant functions with those values. This ensures that the code size is
a fixed and only data size grows with larger amounts of coverage data.
A trivial change to IRBuilder is included to make it easier to build
the constants that make up the global data.
Maybe worth using 'zip' here to iterate the CUNodes (it does support iterators rather than using getOperand(i))? Might be easier, I guess, if llvm::seq supported more defaults (like defaulting to int, starting at zero, and unbounded (no need to specify an end value if the thing you zip it with is constrained anyway))