Allows things like (G_PTR_ADD (G_PTR_ADD a, b), c) to be
simplified into a single ADD3 instruction instead of two adds.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3438–3451 | I'd prefer to avoid pre-hacking the MIR or introducing new instructions here and have tablegen handle everything |
- Need to fix the GISelPredicateCode thing. I need to understand why the Cxx predicate is emitted before the feature check with those changes.
- Need to reevaluate the whole AllowPointerOperands approach, it feels awfully hacky but I'm not sure of what's the alternative.
- The whole impl in addTypeCheckPredicate also feels hacky so I should take another look at it and check if it can be simplified.
- Need to determine if G_PTR_ADD (G_ADD) folding is something desirable? It happens now because the parent and child pattern both have 2 opcodes possibles. We could enforce them to match somehow (maybe by adding another predicate of sorts).
Would it make sense to match (ptradd a, (add, b, c)) as well as (ptradd (ptradd a, b), c) ?
llvm/lib/Target/AMDGPU/VOP3Instructions.td | ||
---|---|---|
452 | Did you add trailing whitespace? | |
458 | Should be "FIXME". But it would be good to understand and fix this before committing. | |
460 | Space after if | |
594 | Remove "_i32_" from the name now that it can handle other types too? |
With your patch, tablegen crashes trying to build the <Target>GenGlobalISel.inc file for X86 and PowerPC.
llvm/lib/Target/AMDGPU/VOP3Instructions.td | ||
---|---|---|
461 | Trailing whitespace on this line! |
Apologies for the silly whitespace issue, I was using a new editor while waiting for my workstation to arrive and forgot to enable trailing whitespace trimming.
For the build issue, I was only building the AMDGPU backend so that's a silly mistake too. I'll make sure to check all backends in the future when I make changes like this.
As for
(ptradd a, (add, b, c)) as well as (ptradd (ptradd a, b), c) ?
I think @arsenm had some thoughts on the former (we should canonicalize one way or the other). The latter should fold with those patterns though, no?
def : ThreeOp_Pats <ptradd, ptradd, V_ADD3_U32_e64, [p3, i32, i32]>; def : ThreeOp_Pats <ptradd, ptradd, V_ADD3_U32_e64, [p5, i32, i32]>;
llvm/lib/Target/AMDGPU/VOP3Instructions.td | ||
---|---|---|
458 | Right, it seems like a GlobalISelEmitter bug, like it is hoisting the predicate check call but does not realise that the predicate uses Operands[], so it would also need to hoist the GIM_RecordNamedOperands. There are some comments that might be relevant in GroupMatcher::candidateConditionMatches (about GIM_RecordInsn not GIM_RecordNamedOperand). But I don't really understand GlobalISelEmitter too well. | |
602 | The globalisel pattern importer ignores the address space here. See the comment on class PointerToAnyOperandMatcher. So you only need one of these lines, and it will match any address space not just 3 or 5. And furthermore if you remove one of these lines then it stops GlobalISelEmitter from hoisting the predicate check call, so you don't need the FIXME above. |
Simplifying code.
Now that G_PTR_ADD has a proper pointer type in the Out operands, we don't even need to put p3/p5 in the pattern - i32 is enough.
I'd prefer to avoid pre-hacking the MIR or introducing new instructions here and have tablegen handle everything