The backend default maximum should be the hardware maximum, so the
frontend should set the implementation defined default maximum.
Details
Diff Detail
Event Timeline
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
7947 | Theoretically, shouldn't the minimum be 1? |
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
7947 | That's what I thought, but the backend is defaulting to 2 * wave size now |
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
7947 | I don't get it. This attribute indicates the possible workgroup size range this kernel may be run with, right? It only depends on how user execute the kernel. How is it related to backend defaults? |
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
7947 | The backend currently assumes 128,256 by default as the bounds. I want to make this a frontend decision, and make the backend assumption the most conservative default |
My concern is that this essentially forcing user to add amdgpu_flat_work_group_size attribute to all kernels that are executed outside of (128,256). Potentially this can cause lots of regressions for existing OpenCL apps. I am not sure if it is feasible to force all OpenCL apps to make this change. Should we do some tests before making this change?
We need to communicate with anyone generating IR to ensure this is being generated before we change the default. clang is only one of those generators. This change will also need to be documented in the usage document.
This is already the case. This is just moving it to the frontend. There's no user observable change from this patch
The planned change is to make the backend more conservative, so it shouldn't break other frontends
I do want to reduce the minimum bound to 1 at least (which is the default for graphics shaders), but that's a separate change
It may not break other frontends, but could cause substantial performance regressions. At a minimum the summary should clearly mention this possibility.
lib/CodeGen/TargetInfo.cpp | ||
---|---|---|
7947 | I would agree that minimum should be 1. Backend's choice is also unclear, but emitting a false attribute is a separate issue. |
Theoretically, shouldn't the minimum be 1?