Page MenuHomePhabricator

[MachineCombiner] Extend reassociation logic to handle inverse instructions

Authored by asi-sc on Oct 26 2022, 3:28 AM.



Machine combiner supports generic reassociation only of associative and
commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we
can extend this generic support to handle patterns like
(X + A) - Y => (X - Y) + A), where - is the inverse of +.
This patch adds interface functions to process reassociation patterns of
associative/commutative instructions and their inverse variants with minimal
changes in backends.

Diff Detail

Event Timeline

asi-sc created this revision.Oct 26 2022, 3:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 3:28 AM
asi-sc requested review of this revision.Oct 26 2022, 3:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 3:28 AM

Some support of add-sub reassociation was added to AArch64 in D124564. This patch generalizes this idea and makes implementation target-independent.

It also introduces one more invariant to machine combiner that is REASSOC_AX/XA_BY/YB patterns must be gathered only with standard mechanism (isAssociativeAndCommutative function, etc. ) if reassociation is performed by default implementation. This sounds reasonable for me: if we have custom pattern matcher, we must be ready to write custom reassociation. All targets except RISCV that use machine-combine do not contradict the new invariant. If this patch gets positive feedback, I'll change RISCV implementation accordingly. It will require a few changes.

There is a big diff for tests, but it is just a change in the order of operands in commutative instructions. This is required to generalize the transformation rules.

Do we have any benefits in doing this? The diff in tests just show a swap of 2 source operands, which I think might have no impact to the performance.

Do we have any benefits in doing this? The diff in tests just show a swap of 2 source operands, which I think might have no impact to the performance.

This is a general support in machine-combiner that provides functions to be overridden in backends. So, exactly this patch will not show performance changes. But I agree, I definitely must've demonstrated at least some performance measurements. So, if we implement basic support in RISCV and say that fsub is the inverse of fadd, then for Whetstone we have the following results:

N1 +8%
N2 +14%


Loop content                  Result              MFLOPS      MOPS   Seconds

N1 floating point     -1.12398255667391900       284.545              0.694
N2 floating point     -1.12187079889295083       224.314              6.162
N3 if then else        1.00000000000000000                5715.791    0.186
N4 fixed point        12.00000000000000000               323977497.600    0.000
N5 sin,cos etc.        0.49902937281518078                  20.796   41.147
N6 floating point      0.99999987890802811       169.611             32.708
N7 assignments         3.00000000000000000                7136.783    0.266
N8 exp,sqrt etc.       0.75100163018453681                  21.396   17.882

MWIPS                                           1038.403             99.046

This patch + RISCV support

Loop content                  Result              MFLOPS      MOPS   Seconds

N1 floating point     -1.12398255667392588       308.002              0.652
N2 floating point     -1.12187079889289487       257.436              5.460
N3 if then else        1.00000000000000000                5714.154    0.189
N4 fixed point        12.00000000000000000               299507735.273    0.000
N5 sin,cos etc.        0.49902937281518078                  20.861   41.714
N6 floating point      0.99999987890802811       170.068             33.173
N7 assignments         3.00000000000000000                7176.013    0.269
N8 exp,sqrt etc.       0.75100163018453681                  21.117   18.425

MWIPS                                           1047.136             99.882

Compilation flags -O3 -funroll-loops -finline-functions -ffast-math -mtune=sifive-u74

craig.topper added inline comments.Oct 31 2022, 10:11 PM

I'm not sure we should mention mul/div here. I don't think you can reassociate them the same way.


Can we make getInverseOpcode return an Optional<unsigned> so we can merge hasInverseOpcode and getInverseOpcode?


Can we not figure this out using getInverseOpcode?



asi-sc added inline comments.Nov 1 2022, 4:06 AM

Do you mean division by zero cases, e.g. (A / X) * 0 --> A / (X / 0) ? We must not introduce new divisors and then it will be legal. Am I missing something?
If we don't have any other problems with mul/div, I'll fix reassociation patterns.


Yeah, I'll do that. My original implementation used Optional, but then I decided that two functions were clearer solution for the interface. However, now I see that having Optional is better than these ugly reminders in the comments.


Do you suggest changing isAssociativeAndCommutative to take not only an instruction, but an opcode as well? This seems to spoil the readability of its interface. However, this will simplify the implementation.


This is incorrect. Must be Y - (A - X) => (Y + X) - A

@craig.topper , may I ask you to take a look at the questions from my previous comment ( ? It'll help me to resolve original comments properly.

asi-sc updated this revision to Diff 477749.Nov 24 2022, 4:44 AM

Address review comments:

  • fix typos
  • merge hasInverseOpcode and getInverseOpcode
  • drop mentions of mul/div reassociation
asi-sc added inline comments.Nov 24 2022, 5:08 AM

Disregard my previous comment. I agree we should not mention mul/div here as we cannot guarantee transformation safety.


Thanks, done.


I left this unchanged for now and uploaded a patch that shows how it looks like in the current design D138660 . If there is still a desire to merge isInverseInstAssociativeAndCommutative with isAssociativeAndCommutative, then we should change the latter to take not only the instruction, but the opcode as well

/// Return true when \P Inst with \P Opcode is both associative and commutative.
virtual bool isAssociativeAndCommutative(const MachineInstr &Inst, unsinged Opcode) const;

which seems to me pretty unclear from the user's point of view of this interface.

craig.topper added inline comments.Nov 24 2022, 10:35 AM

Nevermind. My suggestion. I realize now that it requires the instruction to exist with the inverted opcode.

Would it make sense to pass an Invert bool to isAssociativeAndCommutative so we don't need to two interfaces?

asi-sc updated this revision to Diff 477892.Nov 25 2022, 2:03 AM

Address comments: pass bool flag to isAssociativeAndCommutative

asi-sc added inline comments.Nov 25 2022, 2:13 AM

Good suggestion, thanks! I've added Invert bool flag. One thing I'd like to mention explicitly is that I've added the default value to Invert argument. In general, it's dangerous in C++ to combine virtual functions and default argument values. However, I can't imagine the situation when a specific target decides to use another default value for the argument as it'll break machine combiner logic. So, exactly in this case I think it's completely safe.

asi-sc updated this revision to Diff 479645.Dec 2 2022, 8:02 AM

Change Optional to std::optional, rebase

craig.topper added inline comments.Dec 5 2022, 3:58 PM



Add /*Invert*/ before true


Add /*Invert*/ before true

asi-sc updated this revision to Diff 480377.Dec 6 2022, 1:09 AM

Address review comments

This revision is now accepted and ready to land.Dec 6 2022, 3:07 PM