32-bit x86 Windows targets use a linked-list of nodes allocated on the
stack, referenced to via thread-local storage. The personality routine
interprets one of the fields in the node as a 'state number' which
indicates where the personality routine should transfer control.
State transitions are possible only before call-sites which may throw
exceptions. Our previous scheme had us update the state number before
all call-sites which may throw.
Instead, we can try to minimize the number of times we need to store by
reasoning about the nearest store which dominates the current call-site.
If the last store agrees with the current call-site, then we know that
the state-update is redundant and can be elided.
This is largely straightforward: a RPO walk of the blocks allows us to
correctly forward propagate the information when the function is a DAG.
Currently, loops are not handled optimally and may trigger superfluous
state stores.
I'm not really a fan of all these long inline lambdas. I'd rather see this out of line, despite the paramter passing boilerplate:
You can practically guess the algorithm it's implementing from that prototype.