This one is a bit weird due to the interaction with the implied range
from amdgpu-flat-workgroup-size. At the default group range of 1,1024,
the minimum implied bound is 4, so this ends up introducing the
attribute on undecorated functions.
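
For readers wondering where the 4 comes from, here is a rough sketch of the arithmetic. It assumes a wave size of 64 and four execution units (SIMDs) per compute unit, neither of which is stated above, and the helper names are made up for illustration rather than taken from the backend.

```cpp
// Assumed hardware parameters (not stated in the summary): wave64 and
// four execution units (SIMDs) per compute unit.
constexpr unsigned WavefrontSize = 64;
constexpr unsigned EUsPerCU = 4;

// Integer ceiling division.
constexpr unsigned divideCeil(unsigned Num, unsigned Den) {
  return (Num + Den - 1) / Den;
}

// Hypothetical helper: number of waves needed to hold one workgroup.
constexpr unsigned wavesPerWorkGroup(unsigned FlatWorkGroupSize) {
  return divideCeil(FlatWorkGroupSize, WavefrontSize);
}

// Hypothetical helper: minimum waves per EU implied by the maximum flat
// workgroup size, since all waves of a workgroup share one compute unit.
constexpr unsigned impliedMinWavesPerEU(unsigned MaxFlatWorkGroupSize) {
  return divideCeil(wavesPerWorkGroup(MaxFlatWorkGroupSize), EUsPerCU);
}

// Default range 1,1024: 1024 / 64 = 16 waves, spread over 4 EUs.
static_assert(impliedMinWavesPerEU(1024) == 4,
              "default range implies a minimum of 4 waves per EU");
// A smaller maximum lowers the implied bound.
static_assert(impliedMinWavesPerEU(256) == 1,
              "256 work items fit in one wave per EU");
```

Under these assumptions, an undecorated function with the default 1,1024 range gets an implied lower bound of 4, which is why the attribute shows up even when nothing was requested.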
llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | ||
---|---|---|
651 | Is it on purpose that you use the assumed/initial value of the flat-work-group-size as known range here? |
llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | ||
---|---|---|
651 | Yes, but I wasn't completely sure if this was the correct way to handle this. The amdgpu-flat-workgroup-size implies restrictions on amdgpu-waves-per-eu, but amdgpu-flat-workgroup-size should win if there's a conflict. There should always be an initially known range. |
llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp | ||
---|---|---|
651 | So, the way it is now, you will never go below the initial flat-workgroup-size. From what I read, it seems like you want an upper bound? You might also need to consider doing this in update if you need to be bounded by the final flat-workgroup-size and not only the initial one. |