It is good to have a combined divrem instruction when the
div and rem are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The legalizer should also get the code to split this back into the separate opcodes
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
949 | This needs to have some kind of legalization check. We should only do this if either the combined divrem is legal or legalizable, or before legaliation | |
951–953 | This can probably be an assert, the opcode should come from the tablegen opcode list | |
979 | Should not construct single use MIRBuilders. There should be one available in the combiner state | |
llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
72–76 ↗ | (On Diff #321374) | Should be called by the generic tablegen code in include/llvm/Target/GlobalISel/Combine.td |
llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-divrem.mir | ||
5 | Should add some vector tests. Also cases with other uses, and negative cases with mismatched signed/unsigned pairs |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
951 | Maybe something like this? switch(Opcode) { default: llvm_unreachable("Unexpected opcode!"); case TargetOpcode::G_SDIV: case TargetOpcode::G_UDIV: IsSigned = Opcode == TargetOpcode::G_SDIV; IsDiv = true; break; case TargetOpcode::G_SREM: case TargetOpcode::G_UREM: ... } | |
960 | Why not use Opcode? If it's a div, then Opcode == DivOpcode. If it's a rem, then Opcode == RemOpcode. | |
982 | This part should go in a MatchInfo parameter like in the other combines. That should be passed along to an apply function which should perform the actual combine. | |
989 | I think this should go in an applyCombineDivRem function like the other combines. |
I am making it a generic pattern that comes from the Combine.td. Will update the patch soon.
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
960 | Based on the signness of MI, here I am trying to get the other opcode to match. |
Addressed the review comments:
Made the combiner pattern generic. Added the necessary legalization check. Legalized the G_[SU]DIVREM instruction. Added vector tests and some negative tests. Added two new tests for post-legalizer and legalizer-helper.
Can you add an entry for G_SDIVREM and G_UDIVREM to docs/GlobalISel/GenericOpcode.rst?
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp | ||
---|---|---|
6348 | Can the Legalizer changes come in a follow-up patch? |
I will make an entry for the new opcodes in docs/GlobalISel/GenericOpcode.rst
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1031 | This patch currently performs a one-to-one match. It doesn't look for multiple div/rem users. | |
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp | ||
6348 | I thought about it initially. But it makes more sense to legalize the operations along with the patch that introduces the opcodes. |
Improved the matcher function to identify a div/rem pair and any number of matching div and/or rem instructions.
Documented the new instructions in docs/GlobalISel/GenericOpcode.rst
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1031 | I don't think this should be collecting a vector of other instructions and looking for redundant div/rems. It just needs to not crash if one of these extras exist |
Reverting the recent changes to combine any matching extra div and rem instructions alongside the divrem instruction. The presence of the extra instruction shouldn't crash. Added tests to ensure that.
Used MachineInstr as the MatchInfo instead of the Register.
With this, the other matching MI will be directly available in the apply function.
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1024 | I still suspect something is wrong with the insertion. I don't believe it's guaranteed that the first instruction you find will dominate the second, so you don't know where to place the insert point. | |
llvm/test/CodeGen/AMDGPU/GlobalISel/prelegalizer-combiner-divrem.mir | ||
382–383 | You can shorten the test by just copying tuple registers, we just lower arguments with the separate copies |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1024 | The instructions are visited in the top-down order and MI will always dominate the otherMI. |
Addressed the review comments:
Used the dominance check to find the Insertion point.
Added a negative test when the dominance check fails to identify the insertion point.
Simplified the vector tests.
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
1024 | The combiner visits bottom up. Actually if you check that these are in the same block, you can just use OtherMI as the insert point | |
1030 | I'd rather keep all legality checks in the match function. But also I realized it should work if you check if the parent blocks are the same and then you can assume otherMI is earlier |
The div/rem pair coming from different BB will have many scenarios that can complicate the match/apply functions and it is beyond the scope of the Combiner.
I will limit the pattern to match it for the pair coming from the same BB.
This needs to have some kind of legalization check. We should only do this if either the combined divrem is legal or legalizable, or before legaliation