This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD
ClosedPublic

Authored by Pierre-vh on Aug 5 2022, 4:46 AM.

Details

Summary

Allows things like (G_PTR_ADD (G_PTR_ADD a, b), c) to be
simplified into a single ADD3 instruction instead of two adds.

Diff Detail

Event Timeline

Pierre-vh created this revision.Aug 5 2022, 4:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2022, 4:46 AM
Pierre-vh requested review of this revision.Aug 5 2022, 4:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2022, 4:46 AM
arsenm added inline comments.Aug 5 2022, 5:59 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3438–3451

I'd prefer to avoid pre-hacking the MIR or introducing new instructions here and have tablegen handle everything

Pierre-vh updated this revision to Diff 451467.Aug 10 2022, 8:00 AM

Attempt different approach through tablegen/GISel internals

Pierre-vh planned changes to this revision.Aug 10 2022, 8:05 AM
  • Need to fix the GISelPredicateCode thing. I need to understand why the Cxx predicate is emitted before the feature check with those changes.
  • Need to reevaluate the whole AllowPointerOperands approach, it feels awfully hacky but I'm not sure of what's the alternative.
  • The whole impl in addTypeCheckPredicate also feels hacky so I should take another look at it and check if it can be simplified.
  • Need to determine if G_PTR_ADD (G_ADD) folding is something desirable? It happens now because the parent and child pattern both have 2 opcodes possibles. We could enforce them to match somehow (maybe by adding another predicate of sorts).
Pierre-vh updated this revision to Diff 452183.Aug 12 2022, 7:57 AM

Refactor to remove added flags

Pierre-vh edited the summary of this revision. (Show Details)Aug 12 2022, 7:57 AM
Pierre-vh updated this revision to Diff 454499.Aug 22 2022, 7:24 AM

Rebase, cleanup a bit

foad added a comment.Aug 23 2022, 6:43 AM

Would it make sense to match (ptradd a, (add, b, c)) as well as (ptradd (ptradd a, b), c) ?

llvm/lib/Target/AMDGPU/VOP3Instructions.td
452

Did you add trailing whitespace?

458

Should be "FIXME". But it would be good to understand and fix this before committing.

460

Space after if

594

Remove "_i32_" from the name now that it can handle other types too?

Pierre-vh updated this revision to Diff 455110.Aug 24 2022, 2:06 AM
Pierre-vh marked 4 inline comments as done.

Comments

llvm/lib/Target/AMDGPU/VOP3Instructions.td
458

I'd also like to understand why too but so far I haven't found an answer. It seems like the GlobalISelEmitter is just emitting the feature check in the operand instruction matcher and not in the "parent" block.

Maybe @arsenm has an idea?

foad requested changes to this revision.Aug 24 2022, 2:54 AM

With your patch, tablegen crashes trying to build the <Target>GenGlobalISel.inc file for X86 and PowerPC.

llvm/lib/Target/AMDGPU/VOP3Instructions.td
461

Trailing whitespace on this line!

This revision now requires changes to proceed.Aug 24 2022, 2:54 AM
Pierre-vh updated this revision to Diff 455131.Aug 24 2022, 3:17 AM
Pierre-vh marked an inline comment as done.

Trailing whitespace, build issue

Apologies for the silly whitespace issue, I was using a new editor while waiting for my workstation to arrive and forgot to enable trailing whitespace trimming.
For the build issue, I was only building the AMDGPU backend so that's a silly mistake too. I'll make sure to check all backends in the future when I make changes like this.

As for

(ptradd a, (add, b, c)) as well as (ptradd (ptradd a, b), c) ?

I think @arsenm had some thoughts on the former (we should canonicalize one way or the other). The latter should fold with those patterns though, no?

def : ThreeOp_Pats <ptradd, ptradd, V_ADD3_U32_e64, [p3, i32, i32]>;
def : ThreeOp_Pats <ptradd, ptradd, V_ADD3_U32_e64, [p5, i32, i32]>;
foad added inline comments.Aug 24 2022, 4:21 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
458

Right, it seems like a GlobalISelEmitter bug, like it is hoisting the predicate check call but does not realise that the predicate uses Operands[], so it would also need to hoist the GIM_RecordNamedOperands. There are some comments that might be relevant in GroupMatcher::candidateConditionMatches (about GIM_RecordInsn not GIM_RecordNamedOperand). But I don't really understand GlobalISelEmitter too well.

602

The globalisel pattern importer ignores the address space here. See the comment on class PointerToAnyOperandMatcher. So you only need one of these lines, and it will match any address space not just 3 or 5. And furthermore if you remove one of these lines then it stops GlobalISelEmitter from hoisting the predicate check call, so you don't need the FIXME above.

Pierre-vh updated this revision to Diff 455169.Aug 24 2022, 5:14 AM
Pierre-vh marked an inline comment as done.

Simplifying code.
Now that G_PTR_ADD has a proper pointer type in the Out operands, we don't even need to put p3/p5 in the pattern - i32 is enough.

foad accepted this revision.Aug 24 2022, 5:42 AM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 24 2022, 5:42 AM
This revision was automatically updated to reflect the committed changes.