ADD is an associative and commutative operation, so we can do reassociation for it.
Details
Diff Detail
Unit Tests
Event Timeline
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
4 | No idea if we intended to not do reassociation in ADD instructions. |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
2 | Drop -mcpu=x86-64 Pre-commit these test with current codegen to trunk and rebase to show the patch diffs. | |
4 | Not that I can think of - EFLAGS will be the same and we already handles the $dst=$src0 constraint | |
llvm/test/CodeGen/X86/xmulo.ll | ||
1871 | Annoying - normally I'd expect reassociation to help us reduce stack use |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
4 | I was working on this a long time ago (~2015), so it's hard to remember exactly, but I don't think there was a fundamental reason to exclude integer ADD. It just seemed like it did not have much potential gain with the limited register set and could interfere with other transforms like LEA formation. If there's evidence that this improves something (and doesn't cause regressions), then it should be ok. |
llvm/test/CodeGen/X86/lea-opt-cse4.ll | ||
---|---|---|
105 | New leas, +1 |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
4 | EFLAGS may be different. Suppose we are adding three bytes, 250 + 10 + 10.
So we need to check if the definition of EFLAGS is dead. Maybe this is the reason @spatel didn't add ADD instructions in 2015. Thank @pengfei for the reminds. |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
4 | Scalar MUL would have the same EFLAGs issue. So hopefully we already solved it? |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
4 | Indeed. So IMULrr should be handled similar to ADD. But I can't find any code for it :( |
llvm/lib/Target/X86/X86InstrInfo.cpp | ||
---|---|---|
8695 | Here is the EFLAGS code |
llvm/lib/Target/X86/X86InstrInfo.cpp | ||
---|---|---|
8695 | Thanks for the pointer. So now EFLAGS is not a problem for ADD. |
llvm/test/CodeGen/X86/reassociate-add.ll | ||
---|---|---|
2 | Test case has been sent out as https://reviews.llvm.org/D136501. -mcpu=x86-64 is required because in MachineCombiner the reassociation is performed either the new sequence reduce latency, or function doSubstitute returns true. In its implementation, if -mcpu is not specified, the target will not have a valid schedule model, and then doSubstitute returns true, and reassociation is performed. bool MachineCombiner::doSubstitute(unsigned NewSize, unsigned OldSize, bool OptForSize) { if (OptForSize && (NewSize < OldSize)) return true; if (!TSchedModel.hasInstrSchedModelOrItineraries()) return true; return false; } | |
llvm/test/CodeGen/X86/xmulo.ll | ||
1871 | Agree it is annoying. In this specific case, although it increases stack use, it reduces register spill/reload instructions. $ git diff origin/main xmulo.ll | grep "+;" | wc 104 518 3715 $ git diff origin/main xmulo.ll | grep "\-;" | wc 108 542 3846 |
Hey, this broke a Dexter test: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/47878/
@jmorse and/or @Orlando can you help make sure this isn't a Dexter issue?
Enjoyably not a Dexter bug -- the output with -v indicates that there's a sequence in the "main" function of that test where the main_result variable becomes unavailable with this patch. That would appear to be because TargetInstrInfo::reassociateOps isn't instrumented for instruction referencing, which I'll file as a fault. This patch just exposes that omission.
Sadly though, this Dexter test isn't supposed to be checking for this sort of error, it's supposed to be making sure the other functions in the file have reasonable debug-info. The only reason we see this is because we didn't mark main with __attribute__((optnone)). I'll add that attribute and un-xfail the test as I'm out of cycles; thanks for pinging!
It seems this patch cause spec2017/548 performance regression, as it increased register pressure after reassociating ADD operation.
Here is the EFLAGS code