Split the feature into separate FeatureAtomicFaddNoRtnInsts and
FeatureAtomicFaddRtnInsts features. Previously isGFX90A was used as a
proxy for the latter.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
I do not think this is as simple as this. The support matrix of atomics would require much more individual bits as gfx940 adds new variants.
It is also synchronized with clang buitin features and then device library feature use.
I do not think this is as simple as this.
Is there something wrong with the patch? Really all it is doing is:
- Setting two predicates on an instruction, like HasAtomicFaddInsts and isGFX940Plus, instead of using compound predicates like HasAtomicFaddInstsGFX940. The hope here is to avoid an N^2 explosion of named predicates.
- Using HasAtomicFaddNoRtnInsts instead of isGFX90APlus in a couple of places where it makes things more symmetrical.
I could even split this into two separate patches.
llvm/lib/Target/AMDGPU/FLATInstructions.td | ||
---|---|---|
812 | Here I switched to using HasAtomicFaddRtnInsts for symmetry with the HasAtomicFaddNoRtnInsts block just above. | |
1633–1634 | Here I'm only setting the SubtargetPredicate, because I can relying on the Real instruction inheriting the Pseudo instruction's OtherPredicates to pick up the HasAtomicFaddInsts part of the condition. |
llvm/lib/Target/AMDGPU/BUFInstructions.td | ||
---|---|---|
2547 | Real instructions copy OtherPredicates from their pseudo, and the pseudos already have the right OtherPredicates = [HasAtomicFaddXXXInsts]. | |
llvm/lib/Target/AMDGPU/FLATInstructions.td | ||
1601–1602 | Yeah, I agree this looks strange. I can clean it up if you like. I think we get away with it because these instructions also pick up OtherPredicates = [HasAtomicFaddXXXInsts] from their pseudos, and those features are never set on GFX8. |
llvm/lib/Target/AMDGPU/FLATInstructions.td | ||
---|---|---|
1601–1602 | I can replace this with isGFX908orGFX90A. I think it makes the patch cleaner, but it causes test/CodeGen/AMDGPU/global-atomics-fp-wrong-subtarget.ll to fail because it does this: ; Make sure we can encode and don't fail on functions which have ; instructions not actually supported by the subtarget. I have no idea why it does that, or whether this is still required to work for some reason, but at least the current version of this patch preserves this behaviour. |
Really all it is doing is:
- Setting two predicates on an instruction, like HasAtomicFaddInsts and isGFX940Plus, instead of using compound predicates like HasAtomicFaddInstsGFX940. The hope here is to avoid an N^2 explosion of named predicates.
- Using HasAtomicFaddNoRtnInsts instead of isGFX90APlus in a couple of places where it makes things more symmetrical.
I could even split this into two separate patches.
D121289 is a simpler patch that just does #1. Hopefully this avoids the concern about changing feature names.
Why is this predicate dropped?