The AssemblyWriter constructor is in the critical path of
Value::operator<<, which in turn is in the critical path of most debug
messages. It currently scans the entire module multiple times, even
though most of that work is usually unnecessary. This change makes us
lazier.
With this change, I can compile a big Thrust file with
-debug-only=inline in a few minutes. Before, I gave up after 30m.
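
To make the idea concrete, here is a minimal sketch of deferring an expensive scan from the constructor to first use. It is not the actual AssemblyWriter code: ToyModule, ToyWriter, slotFor and scanCount are hypothetical names invented for illustration, and the "scan" is just a loop over a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module; this is not llvm::Module.
struct ToyModule {
  std::vector<std::string> Names;
};

// Hypothetical writer: the numbering table is built on first use,
// not in the constructor.
class ToyWriter {
  const ToyModule &M;
  std::optional<std::vector<int>> Slots; // empty until first needed
  int ScanCount = 0;                     // number of full scans performed

  const std::vector<int> &slots() {
    if (!Slots) {
      ++ScanCount;
      std::vector<int> S;
      for (int I = 0, E = static_cast<int>(M.Names.size()); I != E; ++I)
        S.push_back(I); // stand-in for the expensive whole-module walk
      Slots = std::move(S);
    }
    return *Slots;
  }

public:
  explicit ToyWriter(const ToyModule &Mod) : M(Mod) {} // no scan here
  int slotFor(std::size_t Index) { return slots().at(Index); }
  int scanCount() const { return ScanCount; }
};
```

Constructing a ToyWriter does no scanning at all; the first slotFor call pays for one scan and later calls reuse the cached table.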
Man, the last time I looked at libstdc++, it was something like 48 bytes on 64-bit machines...
Even if Lazy keeps track of the raw lambda, the lambda may still capture stuff and have a memory footprint.
Now I think it's a bad idea. It's as bad as the fact that a dtor takes no parameters, so each container, which may itself sit inside a parent container, needs to keep its own allocator.
Sorry, I think the Optional version is better. :(
BTW, how hard is it to make the user side lazy, rather than the library side?
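
For reference, a rough sketch of the footprint trade-off discussed above. Both class templates are hypothetical and are not the types from the patch: StoredLazy keeps its initializer in a std::function for the lifetime of the object, while OptionalLazy holds only a std::optional and takes the initializer from the caller at the point of use.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical: stores the initializer, so the callable (and anything it
// captures) stays alive as long as the object does.
template <typename T> class StoredLazy {
  std::function<T()> Init;
  std::optional<T> Value;

public:
  explicit StoredLazy(std::function<T()> F) : Init(std::move(F)) {}
  T &get() {
    if (!Value)
      Value.emplace(Init());
    return *Value;
  }
};

// Hypothetical: holds only the cached value; the caller passes the
// initializer at each use, and it runs only if the value is missing.
template <typename T> class OptionalLazy {
  std::optional<T> Value;

public:
  template <typename Fn> T &getOrCompute(Fn &&F) {
    if (!Value)
      Value.emplace(std::forward<Fn>(F)());
    return *Value;
  }
};

// The stored-initializer variant carries the std::function on top of the
// optional, so it is strictly larger.
static_assert(sizeof(StoredLazy<int>) > sizeof(OptionalLazy<int>),
              "stored initializer adds to the footprint");
```

The exact sizes depend on the standard library implementation; the sketch only shows that the optional-only form avoids paying for a stored callable per object.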