This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Assume all amdhsa kernarg passed implicit arguments by default
ClosedPublic

Authored by arsenm on Oct 25 2021, 2:09 PM.

Details

Summary

Previously we would require adding an attribute to kernels to enable
the inputs passed in the kernarg segment, accessed by
llvm.amdgcn.implicitarg.ptr. This violates the principle of being
correct by default. Some OpenMP testcases were broken recently since
it wasn't correctly setting this attribute, and no known frontends are
setting this to anything other than the maximum.

Most of the test changes are from load widening of argument loads
since there now more implied dereferenceable bytes.

Diff Detail

Event Timeline

arsenm created this revision.Oct 25 2021, 2:09 PM
arsenm requested review of this revision.Oct 25 2021, 2:09 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: sstefan1, wdng. · View Herald Transcript
rampitec added inline comments.Oct 25 2021, 2:17 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
661

It is not immediately obvious we have a HSA kernel at this point.

arsenm added inline comments.Oct 25 2021, 2:23 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
661

This is only meaningfully called with a kernel

rampitec added inline comments.Oct 25 2021, 2:25 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
661

With MESA ruled out does that mean only a HSA kernel remains? Maybe add at least an assert?

arsenm updated this revision to Diff 382161.Oct 25 2021, 6:31 PM

Add assert

rampitec added inline comments.Oct 26 2021, 11:06 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
661

Is this also true for SPIR_KERNEL?

arsenm added inline comments.Oct 26 2021, 11:12 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
661

We never distinguished them (although SPIR_KERNEL should never really be seen, but nobody ever implemented SPIR correctly)

rampitec accepted this revision.Oct 26 2021, 11:14 AM
This revision is now accepted and ready to land.Oct 26 2021, 11:14 AM

Correct by default the optimise to elide is much safer, thank you

Can we land this? Seems likely to step on many latent bugs all at once

Can we land this? Seems likely to step on many latent bugs all at once

The only holdup is surviving openmp precheckin, which is mostly complicated due to a reverted commit

Can we land this? Seems likely to step on many latent bugs all at once

The only holdup is surviving openmp precheckin, which is mostly complicated due to a reverted commit

D113538 should help unblock the unrevert, which will unblock this patch

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-abi-attribute-hints.ll