This is an archive of the discontinued LLVM Phabricator instance.

GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM
ClosedPublic

Authored by cdevadas on Feb 4 2021, 3:42 AM.

Details

Summary

It is good to have a combined divrem instruction when the
div and rem are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.

Diff Detail

Event Timeline

cdevadas created this revision.Feb 4 2021, 3:42 AM
cdevadas requested review of this revision.Feb 4 2021, 3:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2021, 3:42 AM
arsenm added a comment.Feb 4 2021, 5:57 AM

The legalizer should also get the code to split this back into the separate opcodes

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
944

This needs to have some kind of legalization check. We should only do this if either the combined divrem is legal or legalizable, or before legaliation

946–948

This can probably be an assert, the opcode should come from the tablegen opcode list

974

Should not construct single use MIRBuilders. There should be one available in the combiner state

llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
72–76

Should be called by the generic tablegen code in include/llvm/Target/GlobalISel/Combine.td

llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-divrem.mir
4

Should add some vector tests. Also cases with other uses, and negative cases with mismatched signed/unsigned pairs

paquette added inline comments.Feb 8 2021, 11:38 AM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
946

Maybe something like this?

switch(Opcode) {
  default:
     llvm_unreachable("Unexpected opcode!");
  case TargetOpcode::G_SDIV:
  case TargetOpcode::G_UDIV:
      IsSigned = Opcode == TargetOpcode::G_SDIV;
      IsDiv = true;
      break;
  case TargetOpcode::G_SREM:
  case TargetOpcode::G_UREM:
     ...
}
955

Why not use Opcode? If it's a div, then Opcode == DivOpcode. If it's a rem, then Opcode == RemOpcode.

977

This part should go in a MatchInfo parameter like in the other combines. That should be passed along to an apply function which should perform the actual combine.

984

I think this should go in an applyCombineDivRem function like the other combines.

I am making it a generic pattern that comes from the Combine.td. Will update the patch soon.

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
955

Based on the signness of MI, here I am trying to get the other opcode to match.
And the check below matches the (sdiv, srem) or (udiv, urem) pair irrespective of their position.

cdevadas updated this revision to Diff 322343.Feb 9 2021, 4:32 AM

Addressed the review comments:

Made the combiner pattern generic.
Added the necessary legalization check.
Legalized the G_[SU]DIVREM instruction.
Added vector tests and some negative tests.
Added two new tests for post-legalizer and legalizer-helper.
arsenm added inline comments.Feb 9 2021, 8:59 AM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1006–1007

Should just use the regular getVRegDef

1026

I'm worried about what happens if multiple div/rem users exist. Can you add some tests for that?

Can you add an entry for G_SDIVREM and G_UDIVREM to docs/GlobalISel/GenericOpcode.rst?

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
6277 ↗(On Diff #322343)

Can the Legalizer changes come in a follow-up patch?

I will make an entry for the new opcodes in docs/GlobalISel/GenericOpcode.rst

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1026

This patch currently performs a one-to-one match. It doesn't look for multiple div/rem users.
For example, it won't optimally match the following pattern:
G_SDIV x, y, G_SREM x, y and an extra G_SDIV x, y
The first div/rem pair will be combined and the second div will be retained.
Instead, it should have generated d, r = G_SDIVREM x, y ; replace all users of the second G_SDIV with 'd' and then delete the extra G_SDIV as well.
I will rework this part.

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
6277 ↗(On Diff #322343)

I thought about it initially. But it makes more sense to legalize the operations along with the patch that introduces the opcodes.

cdevadas updated this revision to Diff 322909.Feb 10 2021, 11:50 PM

Improved the matcher function to identify a div/rem pair and any number of matching div and/or rem instructions.
Documented the new instructions in docs/GlobalISel/GenericOpcode.rst

arsenm added inline comments.Feb 12 2021, 3:00 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1026

I don't think this should be collecting a vector of other instructions and looking for redundant div/rems. It just needs to not crash if one of these extras exist

cdevadas updated this revision to Diff 323876.Feb 15 2021, 9:07 PM

Reverting the recent changes to combine any matching extra div and rem instructions alongside the divrem instruction. The presence of the extra instruction shouldn't crash. Added tests to ensure that.

cdevadas updated this revision to Diff 324267.Feb 17 2021, 4:11 AM

Used MachineInstr as the MatchInfo instead of the Register.
With this, the other matching MI will be directly available in the apply function.

arsenm added inline comments.Feb 17 2021, 12:59 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1019

I still suspect something is wrong with the insertion. I don't believe it's guaranteed that the first instruction you find will dominate the second, so you don't know where to place the insert point.

llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-divrem.mir
382–383

You can shorten the test by just copying tuple registers, we just lower arguments with the separate copies

cdevadas added inline comments.Feb 17 2021, 5:16 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1019

The instructions are visited in the top-down order and MI will always dominate the otherMI.

cdevadas updated this revision to Diff 326727.Feb 26 2021, 10:05 AM

Addressed the review comments:
Used the dominance check to find the Insertion point.
Added a negative test when the dominance check fails to identify the insertion point.
Simplified the vector tests.

arsenm added inline comments.Feb 26 2021, 4:23 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1019

The combiner visits bottom up.

Actually if you check that these are in the same block, you can just use OtherMI as the insert point

1025

I'd rather keep all legality checks in the match function. But also I realized it should work if you check if the parent blocks are the same and then you can assume otherMI is earlier

The div/rem pair coming from different BB will have many scenarios that can complicate the match/apply functions and it is beyond the scope of the Combiner.
I will limit the pattern to match it for the pair coming from the same BB.

cdevadas updated this revision to Diff 328047.Mar 4 2021, 12:52 AM

Match the div & rem only when they are from the same Basic Block.

arsenm accepted this revision.Mar 8 2021, 11:37 AM
This revision is now accepted and ready to land.Mar 8 2021, 11:37 AM
This revision was automatically updated to reflect the committed changes.