This gets rid of duplicated code and diverging behaviour w.r.t.
constants.
Fixes PR45086.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Note that most test changes are NFC (just interleaving computations). I've highlighted the actual changes.
llvm/test/CodeGen/X86/memcmp.ll | ||
---|---|---|
103 | This is the real change. |
llvm/test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll | ||
---|---|---|
93 | These are also real changes. |
Given memcmp expansion is running late in the pipeline, do you think we're possibly missing other relevant optimizations on the loads? LICM, maybe?
llvm/test/CodeGen/X86/memcmp.ll | ||
---|---|---|
425 | The scheduling here seems to be worse? |
It's definitely missing other optimizations. I tried moving it sooner in the pipeline some time ago for this reason (https://reviews.llvm.org/D60318), but it turned out to be much harder than I anticipated because of interactions with the sanitizers.
llvm/test/CodeGen/X86/memcmp.ll | ||
---|---|---|
425 | I think we could argue either way: the new scheduling interleaves loads and other computations, evening out the port pressure and increased compute parallelism. On the other hand it's true that there is less data parallelism. It should not matter with recent out-of-order cores anyway. |
llvm/test/CodeGen/X86/memcmp.ll | ||
---|---|---|
425 | I'd prefer to avoid unrelated changes.
Scheduling an arithmetic operation that uses a loaded value immediately after the load is never going to work out; loads have latency. Granted, I agree it's unlikely to matter much on a modern x86 core. Really, I'm more surprised we aren't trying to do any scheduling at all after isel. |
For reference, we still have at least a couple of motivating bugs for moving this up in the pipeline:
https://bugs.llvm.org/show_bug.cgi?id=36421
https://bugs.llvm.org/show_bug.cgi?id=34032#c13 - reduced example: https://godbolt.org/g/QjVXvS
I think this patch makes sense, but we might also consider a small add-on that would catch the cases shown here and possibly more. We can basically run -instsimplify within a pass via:
diff --git a/llvm/lib/CodeGen/ExpandMemCmp.cpp b/llvm/lib/CodeGen/ExpandMemCmp.cpp index d0dd538f1f5..bb5e6261593 100644 --- a/llvm/lib/CodeGen/ExpandMemCmp.cpp +++ b/llvm/lib/CodeGen/ExpandMemCmp.cpp @@ -23,6 +23,7 @@ #include "llvm/CodeGen/TargetSubtargetInfo.h" #include "llvm/IR/IRBuilder.h" #include "llvm/InitializePasses.h" +#include "llvm/Transforms/Utils/Local.h" #include "llvm/Transforms/Utils/SizeOpts.h" using namespace llvm; @@ -869,6 +870,11 @@ PreservedAnalyses ExpandMemCmpPass::runImpl( ++BBIt; } } + + if (MadeChanges) + for (BasicBlock &BB : F) + SimplifyInstructionsInBlock(&BB); + return MadeChanges ? PreservedAnalyses::none() : PreservedAnalyses::all(); }
LGTM, but as a matter of risk reduction, please make that an independent commit.
I don't think it would be noticeable, but there is some non-zero compile-time cost in doing that analysis/transform (although it might pay for itself if it makes later codegen faster).
I don't have a preference for order of landing the changes, but if I'm seeing it correctly, if we land the simplify commit later, it would only cause benign (different label names) diffs in the asm tests.
Thanks.
Will do.
I don't think it would be noticeable, but there is some non-zero compile-time cost in doing that analysis/transform (although it might pay for itself if it makes later codegen faster).
I don't have a preference for order of landing the changes, but if I'm seeing it correctly, if we land the simplify commit later, it would only cause benign (different label names) diffs in the asm tests.
There is actually a diff in the powerpc test that compares two constants.
These are also real changes.