Machine combiner supports generic reassociation only of associative and
commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we
can extend this generic support to handle patterns like
(X + A) - Y => (X - Y) + A), where - is the inverse of +.
This patch adds interface functions to process reassociation patterns of
associative/commutative instructions and their inverse variants with minimal
changes in backends.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Some support of add-sub reassociation was added to AArch64 in D124564. This patch generalizes this idea and makes implementation target-independent.
It also introduces one more invariant to machine combiner that is REASSOC_AX/XA_BY/YB patterns must be gathered only with standard mechanism (isAssociativeAndCommutative function, etc. ) if reassociation is performed by default implementation. This sounds reasonable for me: if we have custom pattern matcher, we must be ready to write custom reassociation. All targets except RISCV that use machine-combine do not contradict the new invariant. If this patch gets positive feedback, I'll change RISCV implementation accordingly. It will require a few changes.
There is a big diff for tests, but it is just a change in the order of operands in commutative instructions. This is required to generalize the transformation rules.
Do we have any benefits in doing this? The diff in tests just show a swap of 2 source operands, which I think might have no impact to the performance.
This is a general support in machine-combiner that provides functions to be overridden in backends. So, exactly this patch will not show performance changes. But I agree, I definitely must've demonstrated at least some performance measurements. So, if we implement basic support in RISCV and say that fsub is the inverse of fadd, then for Whetstone we have the following results:
N1 +8%
N2 +14%
Baseline
Loop content Result MFLOPS MOPS Seconds N1 floating point -1.12398255667391900 284.545 0.694 N2 floating point -1.12187079889295083 224.314 6.162 N3 if then else 1.00000000000000000 5715.791 0.186 N4 fixed point 12.00000000000000000 323977497.600 0.000 N5 sin,cos etc. 0.49902937281518078 20.796 41.147 N6 floating point 0.99999987890802811 169.611 32.708 N7 assignments 3.00000000000000000 7136.783 0.266 N8 exp,sqrt etc. 0.75100163018453681 21.396 17.882 MWIPS 1038.403 99.046
This patch + RISCV support
Loop content Result MFLOPS MOPS Seconds N1 floating point -1.12398255667392588 308.002 0.652 N2 floating point -1.12187079889289487 257.436 5.460 N3 if then else 1.00000000000000000 5714.154 0.189 N4 fixed point 12.00000000000000000 299507735.273 0.000 N5 sin,cos etc. 0.49902937281518078 20.861 41.714 N6 floating point 0.99999987890802811 170.068 33.173 N7 assignments 3.00000000000000000 7176.013 0.269 N8 exp,sqrt etc. 0.75100163018453681 21.117 18.425 MWIPS 1047.136 99.882
Compilation flags -O3 -funroll-loops -finline-functions -ffast-math -mtune=sifive-u74
llvm/include/llvm/CodeGen/TargetInstrInfo.h | ||
---|---|---|
1179 | I'm not sure we should mention mul/div here. I don't think you can reassociate them the same way. | |
1181 | Can we make getInverseOpcode return an Optional<unsigned> so we can merge hasInverseOpcode and getInverseOpcode? | |
1193 | Can we not figure this out using getInverseOpcode? | |
llvm/lib/CodeGen/TargetInstrInfo.cpp | ||
835 | Incorrectly* |
llvm/include/llvm/CodeGen/TargetInstrInfo.h | ||
---|---|---|
1179 | Do you mean division by zero cases, e.g. (A / X) * 0 --> A / (X / 0) ? We must not introduce new divisors and then it will be legal. Am I missing something? | |
1181 | Yeah, I'll do that. My original implementation used Optional, but then I decided that two functions were clearer solution for the interface. However, now I see that having Optional is better than these ugly reminders in the comments. | |
1193 | Do you suggest changing isAssociativeAndCommutative to take not only an instruction, but an opcode as well? This seems to spoil the readability of its interface. However, this will simplify the implementation. | |
llvm/lib/CodeGen/TargetInstrInfo.cpp | ||
857 | This is incorrect. Must be Y - (A - X) => (Y + X) - A |
@craig.topper , may I ask you to take a look at the questions from my previous comment (https://reviews.llvm.org/D136754#3898721) ? It'll help me to resolve original comments properly.
Address review comments:
- fix typos
- merge hasInverseOpcode and getInverseOpcode
- drop mentions of mul/div reassociation
llvm/include/llvm/CodeGen/TargetInstrInfo.h | ||
---|---|---|
1179 | Disregard my previous comment. I agree we should not mention mul/div here as we cannot guarantee transformation safety. | |
1181 | Thanks, done. | |
1193 | I left this unchanged for now and uploaded a patch that shows how it looks like in the current design D138660 . If there is still a desire to merge isInverseInstAssociativeAndCommutative with isAssociativeAndCommutative, then we should change the latter to take not only the instruction, but the opcode as well /// Return true when \P Inst with \P Opcode is both associative and commutative. virtual bool isAssociativeAndCommutative(const MachineInstr &Inst, unsinged Opcode) const; which seems to me pretty unclear from the user's point of view of this interface. |
llvm/include/llvm/CodeGen/TargetInstrInfo.h | ||
---|---|---|
1193 | Nevermind. My suggestion. I realize now that it requires the instruction to exist with the inverted opcode. Would it make sense to pass an Invert bool to isAssociativeAndCommutative so we don't need to two interfaces? |
llvm/include/llvm/CodeGen/TargetInstrInfo.h | ||
---|---|---|
1193 | Good suggestion, thanks! I've added Invert bool flag. One thing I'd like to mention explicitly is that I've added the default value to Invert argument. In general, it's dangerous in C++ to combine virtual functions and default argument values. However, I can't imagine the situation when a specific target decides to use another default value for the argument as it'll break machine combiner logic. So, exactly in this case I think it's completely safe. |
I'm not sure we should mention mul/div here. I don't think you can reassociate them the same way.