FixIrreducibleControlFlow pass adds dispatch blocks with a br_table
that has multiple predecessors and successors, because it serves as
something like a traffic hub for BBs. As a result of this, there can be
register uses that are not dominated by a def in every path from the
entry block. For example, suppose register %a is defined in BB1 and used
in BB2, and there is a single path from BB1 and BB2:
BB1 -> ... -> BB2
After FixIrreducibleControlFlow runs, there can be a dispatch block
between these two BBs:
BB1 -> ... -> Dispatch -> ... -> BB2
And this dispatch block has multiple predecessors, now
there is a path to BB2 that does not first visit BB1, and in that path
%a is not dominated by a def anymore.
To fix this problem, we have been adding IMPLICIT_DEFs to all
registers in PrepareForLiveInternals pass, and then remove unnecessary
ones in OptimizeLiveIntervals pass after computing LiveIntervals. But
FixIrreducibleControlFlow pass itself ends up violating register use-def
relationship, resulting in invalid code. This was OK so far because
MIR verifier apparently didn't check this in validation. But @arsenm
fixed this and it caught this bug in validation
(https://github.com/llvm/llvm-project/issues/55249).
This CL moves the IMPLICIT_DEF adding routine from
PrepareForLiveInternals to FixIrreducibleControlFlow. We only run it
when FixIrreducibleControlFlow changes the code. And then
PrepareForLiveInternals doesn't do anything other than setting
TracksLiveness property, which is a prerequisite for running
LiveIntervals analysis, which is required by the next pass
OptimizeLiveIntervals.
But in our backend we don't seem to do anything that invalidates this up
until OptimizeLiveIntervals, and I'm not sure why we are calling
invalidateLiveness in ReplacePhysRegs pass, because what that pass
does is to replace physical registers with virtual ones 1-to-1. I
deleted the invalidateLiveness call there and we don't need to set
that flag explicitly, which obviates all the need for
PrepareForLiveInternals.
(By the way, This 'Liveness' here is different from LiveIntervals
analysis. Setting this only means BBs' live-in info is correct, all uses
are dominated by defs, kill flag is conservatively correct, which
means if there is a kill flag set it should be the last use. See
https://github.com/llvm/llvm-project/blob/2a0837aab1489c88efb03784e34c4dc9f2e28302/llvm/include/llvm/CodeGen/MachineFunction.h#L125-L134
for details.)
So this CL removes PrepareForLiveInternals pass altogether. Something
similar to this was attempted by D56091 long ago but that came short of
actually removing the pass, and I couldn't land it because
FixIrreducibleControlFlow violated use-def relationship, which this CL
fixes.
This doesn't change output in any meaningful way. All test changes
except irreducible-cfg.mir are register numbering.
Also this will likely to reduce compilation time, because we have been
adding IMPLICIT_DEF for all registers every time -O2 is given, but
now we do that only when there is irreducible control flow, which is
rare.
I'm not that familiar with LLVM, but I am guessing that this code will add an implicit def for each register? That is, this is a change over all the registers used in the function, as opposed to adding definitions in specific blocks for specific registers, and at the top of the function it writes a 0 to all of them?
If so, then I wonder if this has performance implications? Specifically I worry about a very large function that has a tiny irreducible part nested in it. Would this increase live ranges in the entire function?
Even if it does, maybe other passes are smart enough to not be limited by it? fwiw, after LLVM, in Binaryen this should not be a problem (adding more local.sets at the top of the function has no effect - we already assume all wasm locals have an initial set of 0).
If this is a potential problem in LLVM, then I think the minimal thing to do here is to define all registers that pass through the dispatch block, in the dispatch block.
(Apologies if I've misunderstood the code here...)