This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Remove function if FeatureWavefrontSize 32 is not supported on current GPU
ClosedPublic

Authored by skc7 on Apr 21 2023, 1:56 AM.

Details

Summary

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.
For gfx9 and below targets, this change deletes the incompatible function, since FeatureWavefrontSize32 is not supported.

Diff Detail

Event Timeline

skc7 created this revision.Apr 21 2023, 1:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2023, 1:56 AM
skc7 requested review of this revision.Apr 21 2023, 1:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2023, 1:56 AM
foad added reviewers: Pierre-vh, jmnash, Restricted Project.Apr 21 2023, 4:00 AM

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.

Why? That seems wrong.

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.

Why? That seems wrong.

It is wrong but that’s how the library is currently structured. All code coexists in the same IR, and there are some wave32 only functions

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.

Why? That seems wrong.

It is wrong but that’s how the library is currently structured. All code coexists in the same IR, and there are some wave32 only functions

One concern is whether removing +wavefrontsize32 will make them work. They may contain an instruction sequence that only works with wavefront size 32.

If device libs doesn't want to change that, we could provide an attribute or similar that device libs annotates functions with. Wave32-only or similar, with semantics that it gets erased by clang without warning on wave64. Would be as if that function never existed on wave64 targets. Might need it to apply to objects as well, depending on how much of this stuff device-libs contains.

foad added a comment.Apr 21 2023, 7:26 AM

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.

Why? That seems wrong.

It is wrong but that’s how the library is currently structured. All code coexists in the same IR, and there are some wave32 only functions

But if you're compiling for gfx900 then this pass should remove any functions that are marked with +wavefrontsize32. It should not just remove the +wavefrontsize32 from the function.

Currently, wavfrontsize32 is being appended by device-libs to some functions for gfx9 targets.

Why? That seems wrong.

It is wrong but that’s how the library is currently structured. All code coexists in the same IR, and there are some wave32 only functions

But if you're compiling for gfx900 then this pass should remove any functions that are marked with +wavefrontsize32. It should not just remove the +wavefrontsize32 from the function.

I thought we had agreed that we were going to eliminate the body of incompatible functions and that had been implemented. If that is not enough, then maybe we should strip the problematic attributes of such functions as well?

foad added a comment.Apr 21 2023, 8:03 AM

It looks like you just want a one-liner patch that adds FeatureWavefrontSize32 to the FeaturesToCheck table?

If device libs doesn't want to change that, we could provide an attribute or similar that device libs annotates functions with. Wave32-only or similar, with semantics that it gets erased by clang without warning on wave64. Would be as if that function never existed on wave64 targets. Might need it to apply to objects as well, depending on how much of this stuff device-libs contains.

device libs may be linked in by other programs or libraries, e.g. comgr.

device libs may be linked in by other programs or libraries, e.g. comgr.

That used to be broken. Clang mlink-builtin-bitcode applies attributes to functions in device libs that were both required and missing from device libs. Caused a bunch of pain in openmp because we really wanted to link device libs at link time, not into every TU up front in clang. Has someone fixed that requirement?

It looks like you just want a one-liner patch that adds FeatureWavefrontSize32 to the FeaturesToCheck table?

Indeed, if your goal is just to remove the functions that have +wavefrontsize32 on targets that doesn't support it, this is all you need to do.

I'm also not a fan of removing attributes in this pass. The pass is documented as "removes any function with incompatible attributes". If it also starts removing attributes then its description (& the CL option) would need to be updated to reflect the change (but really, I think this should be done somewhere else, perhaps in yet another pass).

device libs may be linked in by other programs or libraries, e.g. comgr.

That used to be broken. Clang mlink-builtin-bitcode applies attributes to functions in device libs that were both required and missing from device libs. Caused a bunch of pain in openmp because we really wanted to link device libs at link time, not into every TU up front in clang. Has someone fixed that requirement?

OpenCL uses AMDGPUSimplifyLibCalls with -amdgpu-prelink, which transforms some calls of device lib functions into calls of new device lib functions, e.g. sincos. Since this happens after clang's -mlink-builtin-bitcode, these newly introduced device lib functions need to be linked in by comgr directly.

But that is not the only situation that device libs are linked by non-clang programs. There are other FE's that generate IR for amdgpu and need link with device libs, e.g. XLA.

That said, I think the question is that whether simply removing +wavefrontsize32 makes a function valid for wave64. There must be a reason that a function in device lib is marked with +wavefrontsize32. I assume that a function marked with +wavefrontsize32 means it can only be used with wave32. If that is true, then such functions should just be treated as incompatible functions and treated the same way.

I think a function with wavefrontsize32 is incompatible with a wavefrontsize64 compilation and similarly for a wavefrontsize64 function with a wavefrontsize32 compilation and we should take the usual steps for incompatible functions.

skc7 updated this revision to Diff 518152.Apr 29 2023, 2:57 AM
skc7 retitled this revision from [AMDGPU] Remove incompatible attributes of function in amdgpu-remove-incompatible-functions pass to [AMDGPU] Remove function if FeatureWavefrontSize32 is not supported on current GPU.
skc7 edited the summary of this revision. (Show Details)

Updated patch to delete the function if FeatureWavefrontSize32 is not supported by current GPU.

I'd need to think harder about whether it's ok to simply remove functions with wavefrontsize32 if you aren't compiling for wave32. It's not quite like the other features, as it behaves more like a selectable mode. I would expect we delete wave32 functions for <gfx10 targets that don't support the mode

llvm/test/CodeGen/AMDGPU/remove-incompatible-wave32-feature.ll
27

-NOT checks are fragile and should be as loose as possible. Drop the "define void" and trailing (

I think we need to similarly remove a wavefrontsize64 function if compiling in wave32 mode.

skc7 updated this revision to Diff 518409.May 1 2023, 1:24 AM
skc7 retitled this revision from [AMDGPU] Remove function if FeatureWavefrontSize32 is not supported on current GPU to [AMDGPU] Remove function if FeatureWavefrontSize 32/64 is not supported on current GPU.
skc7 edited the summary of this revision. (Show Details)
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Remove wavefrontsize64 function if compiling in wave32 mode.

skc7 added inline comments.May 1 2023, 1:35 AM
llvm/lib/Target/AMDGPU/AMDGPURemoveIncompatibleFunctions.cpp
160

@arsenm AFAIU, This condition checks that (for ex.) if wave32 mode is enabled and current GPU doesn't support waveforntsize32 feature, then that function should be deleted.
This inherently fulfills the condition "delete wave32 functions for <gfx10 targets that don't support the mode" ? Please let me know if I'm missing anything here.

skc7 updated this revision to Diff 519730.May 4 2023, 8:10 PM
skc7 retitled this revision from [AMDGPU] Remove function if FeatureWavefrontSize 32/64 is not supported on current GPU to [AMDGPU] Remove function if FeatureWavefrontSize 32 is not supported on current GPU.
skc7 edited the summary of this revision. (Show Details)

expandImpliedFeatures gets all the features implied by current GPU. But wavefrontsize32 and 64 are not part of the feature list for gfx10 and above targets in AMDGPU.td. AFAIU, It is assumed those subtarget supports both the features if they are not part of the list. So GPUFeatureBits cannot be relied on to query FeatureWavefrontSize32/64.

In the latest revision, made changes to delete FeatureWavefrontSize32 functions for gfx9 and below targets that don't support the mode

Pierre-vh added inline comments.May 4 2023, 11:34 PM
llvm/lib/Target/AMDGPU/AMDGPURemoveIncompatibleFunctions.cpp
166–167

Can you explain in the comment why we need to check this separately?
Just saying "GFX10+ is implied to support both wave32 and 64, they are not in the feature set, so we need a separate check" is enough

168

IMO there is no need to make a separate bool, just inline it in the if, but no strong preference here (though make the bool const if you keep it)

168–169

IMO the first variable isn't really needed, just inline the check in the if. The second one could also just be inlined because the feature won't change, but no strong preference here.

If you keep either/both of them, make them const though.

171–177

Moving this into a helper (e.g. reportFunctionRemoved(Fn, Feature)) would be better than duplicating the code

foad added a comment.May 5 2023, 2:33 AM

I think we need to similarly remove a wavefrontsize64 function if compiling in wave32 mode.

I think this is a reasonable request but it's an extension to what this pass currently does.

  • Currently the pass removes functions that rely on features not supported by the -mcpu= GPU.
  • The extension would be to remove functions that rely on features not supported by the -mcpu= GPU as modified by any -mattr= options.

Does that make sense? If so can we please implement that in a generic way instead of having a special hack for wave32/wave4 features?

skc7 updated this revision to Diff 519889.May 5 2023, 9:02 AM
skc7 set the repository for this revision to rG LLVM Github Monorepo.

Updated patch as per feedback from @Pierre-vh

b-sumner added a comment.EditedMay 5 2023, 10:38 AM

I think we need to similarly remove a wavefrontsize64 function if compiling in wave32 mode.

I think this is a reasonable request but it's an extension to what this pass currently does.

  • Currently the pass removes functions that rely on features not supported by the -mcpu= GPU.
  • The extension would be to remove functions that rely on features not supported by the -mcpu= GPU as modified by any -mattr= options.

Does that make sense? If so can we please implement that in a generic way instead of having a special hack for wave32/wave4 features?

It makes sense to me. The library doesn't need it currently, but there has been a use case to detect CUmode in the headers, and that could arrive in the library where it would be useful to rely on either the CUmode or WGPmode version of a function to be removed depending on the mode.

skc7 updated this revision to Diff 521560.May 11 2023, 10:42 PM

Rebase.

skc7 added inline comments.May 19 2023, 9:36 AM
llvm/lib/Target/AMDGPU/AMDGPURemoveIncompatibleFunctions.cpp
168–169

Updated the comments. Thanks for feedback.

171–177

Moved it to reportFunctionRemoved. Thanks for feedback.

llvm/test/CodeGen/AMDGPU/remove-incompatible-wave32-feature.ll
27

Fixed it. Drop the "define void" and trailing (

arsenm accepted this revision.May 19 2023, 10:08 AM
arsenm added inline comments.
llvm/test/CodeGen/AMDGPU/remove-incompatible-wave32-feature.ll
31–33

Note these functions don't really "require" wave 32. They will codegen just fine without it

This revision is now accepted and ready to land.May 19 2023, 10:08 AM