When checking the profitability of folding an address computation
into a memory instruction, the compiler tries to determine the liveness
of the values that make up the address at the point of the memory instruction.
This patch improves the live-variable estimates by including
the loop invariants that are referenced in the loop body.
Repository: rG LLVM Github Monorepo
Event Timeline
This patch exposes unstable (non-deterministic) behavior.
For me, MCA/ResourceManager.o sometimes differs between builds.
Reproduced with a stage1 clang.
Something order-sensitive might be in LoopInfo.
Inline comment on llvm/lib/CodeGen/CodeGenPrepare.cpp, line 5106: L is unstable.
Same here, I also bisected to this commit. I found it separately via CodeGen/PrologEpilogInserter.o which differs ~10-20% of the time. The minimal set of flags to reproduce it with (for me at least) is -O3 -fno-exceptions.
@chill, are you able to reproduce this?
I was reducing this last night so that I could give you a .ll test case that exhibits this behavior, but that was derailed by something and I have to start over. If you can already reproduce it yourself, I'll just stop the process.
No, I don't have anything (I've not done anything for this problem either, I'm busy elsewhere).
Any input/data would be greatly appreciated!
I'm not making very good progress with a test case. So far the only interesting insights I may have:
- This only reproduces in optimized builds. Setting -UNDEBUG makes the problem go away. Maybe debug builds are normalizing something?
- The problem obviously goes away with -mllvm -cgp-max-loop-inv-users-to-scan=0. One of my guesses was that V->uses() ordering might be non-deterministic, so you could get different answers in isUsedInLoop if the scan is terminated before you visit the instruction that would cause you to return true. That is not the case: the problem still exists with -mllvm -cgp-max-loop-inv-users-to-scan=10000.
Otherwise, if I do the usual process of getting clang to dump the IR and pipe that into opt, or various things like that, the process is deterministic. If you have any guess for what might get a reproducer, I can give it a shot.
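To make the use-ordering hypothesis from the second bullet concrete, here is a minimal standalone model of it. This is not the code from the patch: `User`, the body of `isUsedInLoop`, and the "give up and answer false" behavior when the budget runs out are all assumptions made for illustration. It only shows why a bounded scan could, in principle, be sensitive to the order in which users are visited.

```cpp
#include <vector>

// Hypothetical stand-in for a user of a value. It only records whether the
// using instruction sits inside the loop of interest.
struct User {
  bool InLoop;
};

// Hypothetical bounded scan: inspect at most MaxUsersToScan users and
// answer "not used in the loop" once that budget is exhausted.
// A budget of 0 means no user is ever inspected.
bool isUsedInLoop(const std::vector<User> &Users, unsigned MaxUsersToScan) {
  unsigned Scanned = 0;
  for (const User &U : Users) {
    if (Scanned++ >= MaxUsersToScan)
      return false; // budget exhausted before an in-loop user was seen
    if (U.InLoop)
      return true;
  }
  return false;
}
```

In this model, the same three users visited as {in-loop, out, out} or as {out, out, in-loop} give different answers with a budget of 2, but the same answer once the budget exceeds the number of users. That is why still seeing the problem with a budget of 10000 argues against use ordering being the cause.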
All the operations involved here should be deterministic. You're probably looking for a bug elsewhere. One possibility is that LoopInfo doesn't contain the correct information... for example, the CFG changes, but that change isn't recorded in LoopInfo.
Ack. To clarify, I'm fairly confident this patch _causes_ non-determinism, as I can bisect to this commit and reproduce it, and not at the commit prior to it, etc. But of course this might be doing a valid, deterministic transformation that triggers an existing non-determinism bug later on, or something like that.
Reproduced with a stage2 compiler when compiling llvm/lib/MCA/HardwareUnits/ResourceManager.cpp.
-> loop invariant