This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Create Subtarget Features for some of 16 bits atomic fadd instructions
ClosedPublic

Authored by mariusz-sikora-at-amd on Mar 23 2023, 1:58 AM.

Details

Summary

Introducing Subtarget Features for instructions:

  • ds_pk_add_bf16
  • ds_pk_add_f16
  • ds_pk_add_rtn_bf16
  • ds_pk_add_rtn_f16
  • flat_atomic_pk_add_f16
  • flat_atomic_pk_add_bf16
  • global_atomic_pk_add_f16
  • global_atomic_pk_add_bf16
  • buffer_atomic_pk_add_f16

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 1:58 AM
mariusz-sikora-at-amd requested review of this revision.Mar 23 2023, 1:58 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 23 2023, 1:58 AM
foad added inline comments.Mar 23 2023, 3:14 AM
clang/include/clang/Basic/BuiltinsAMDGPU.def
233

So __builtin_amdgcn_ds_atomic_fadd_v2f16 is missing here? (Just curious- I am not asking you to add it in this patch.)

llvm/lib/Target/AMDGPU/BUFInstructions.td
2888–2889

Could remove the braces if you prefer - then you don't need the "End" comment either.

llvm/lib/Target/AMDGPU/FLATInstructions.td
1914

Generally Real instructions copy their predicates from the corresponding Pseudo, so this should not be required here. Please check the other places where you have added predicates to Real instructions too.

clang/include/clang/Basic/BuiltinsAMDGPU.def
233

I have a different change which will add only __builtin_amdgcn_ds_atomic_fadd_v2f16 (not pushed yet, because it depends on this change).

llvm/lib/Target/AMDGPU/BUFInstructions.td
2888–2889

So, as I understand from other comment:

Generally Real instructions copy their predicates from the corresponding Pseudo, so this should not be required here. Please check the other places where you have added predicates to Real instructions too.

We do not need this (L2889) Predicate, because it was added to Pseudo Instruction ?

foad added inline comments.Mar 23 2023, 6:13 AM
llvm/lib/Target/AMDGPU/BUFInstructions.td
2888–2889

Correct. See the places commented "copy relevant pseudo op flags" in this file.

Changes after review.

mariusz-sikora-at-amd marked 3 inline comments as done and an inline comment as not done.Mar 23 2023, 6:38 AM
foad added inline comments.Mar 23 2023, 6:52 AM
llvm/lib/Target/AMDGPU/BUFInstructions.td
2891

Is this still required?

llvm/lib/Target/AMDGPU/FLATInstructions.td
1913

Are these changes (from here to the end of the file) still required?

llvm/lib/Target/AMDGPU/FLATInstructions.td
1913

Not sure if I understand what you mean. Could you please elaborate more ? Are you referring to the fact that both flat_atomic and global_atomic have FLAT encoding and could be unified ?
I thought this is required, but now you got me thinking ...

llvm/lib/Target/AMDGPU/BUFInstructions.td
2891

No. We can remove this. But I wanted to limit this change only to atomic f16/bf16 and not going deeper

foad added inline comments.Mar 23 2023, 9:48 AM
llvm/lib/Target/AMDGPU/BUFInstructions.td
2891

OK. I thought this was something you added in this patch, but now I see it is just moved around.

llvm/lib/Target/AMDGPU/FLATInstructions.td
1913

I don't understand why the changes from here to the end of the file are required. It looks like you have just moved some definitions around, so that they no longer have a SubtargetPredicate applied. Is that correct? Why?

llvm/lib/Target/AMDGPU/FLATInstructions.td
1913

Yes, I moved them out from under the SubtargetPredicates. All tests are passing. Even 'negative' for unsupported generations, but after deeper looking into 'gen-instr-info' from TableGen I see that there is a difference for GLOBAL_ATOMIC_PK_ADD_F16_*_vi definitions of Real Instructions.

  1. I'm still convinced that SubtargetPredicate = isGFX940Plus is not needed for definitions which inherit from FLAT_Global_Real_Atomics_gfx940. Multiclass sets AssemblerPredicate = isGFX940Plus and when looking how Predicates list is created we can see that AssemblerPredicate is used in creation of final list of Predicates.

    list<Predicate> Predicates = PredConcat< PredConcat<PredConcat<OtherPredicates, SubtargetPredicate>.ret, AssemblerPredicate>.ret, WaveSizePredicate>.ret;
  1. In case of FLAT_Global_Real_Atomics_vi I see that SubtargetPredicate = isGFX8GFX9NotGFX940 will be required, because AssemblerPredicate is isGFX8GFX9;

I wonder why everything works event when not having these SubtargetPredicates = isGFX8GFX9NotGFX940

I will revert these changes and move definitions back under the SubtargetPredicates.

Put back Real Instructions under SubtargetPredicate associated with gfx generation.

foad accepted this revision.Mar 24 2023, 3:57 AM

LGTM, thanks!

If you want to remove some of the other unnecessary predicates from Real instructions you could do that in a separate NFC patch.

This revision is now accepted and ready to land.Mar 24 2023, 3:57 AM
This revision was landed with ongoing or failed builds.Mar 24 2023, 5:12 AM
This revision was automatically updated to reflect the committed changes.