Not all address spaces support initializers for globals and we can
therefore not set them without checking if they are allowed. This
patch adds a hook into TTI to check if an AS allows non-undef
initializers. We disable it for all but address space 0 by default,
NVPTX and AMDGPU targets allow all but address space 3.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
For NVPTX, it's the __shared__ AS that's the odd one. Static initializers do work for __device__ and __constant__ globals.
I think this may be something we should plumb via TTI with AS(0) being the only AS allowed by default.
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | While we should not have globals with ADDRESS_SPACE_LOCAL or ADDRESS_SPACE_PARAM, it may be worth to return false for them, too. | |
llvm/test/Transforms/GlobalOpt/address_space_initializer.ll | ||
2 | You may want to add another run for NVPTX with non-default AS and check that we do create the initializer. |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | That sounds to me more like an assertion. I'll add one for now, you let me know if that is ok too. |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | Assertions are for compiler programming errors, not for diagnosing invalid user-provided input. |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | TBH, I think such an input is broken and the verifier should trigger, which it does not: https://godbolt.org/z/xr1v361Wo |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | Right. TTI just does not have enough context to tell whether this particular call indicates a problem. Perhaps the best we can do here is to just provide the answer whether the AS can/can't handle global initializers and leave it up to the callers to deal with overall IR correctness. In this particular patch, returning false for local/param AS would be valid. It will block the optimization and that would the the correct action regardless of soundness of the IR we're optimizing. Improving target-specific IR checks in the verifier would be nice, but it's beyond the scope of this patch and should be done separately. |
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h | ||
---|---|---|
187 | Should also disallow REGION_ADDRESS, and PRIVATE_ADDRESS (we don't even allow private globals) |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | Nit: Making it a switch and enumerating all known AS would make it harder to give a wrong answer if/when a new AS is added. |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | The AMD enum has dozens of members :( and we already do these checks in other places. I'd leave it like this for now if it's ok. |
llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h | ||
---|---|---|
52 | OK. This is fine, too. |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
109 | I'm not sure this should be so strict by default. At the minimum it should probably allow the datalayout's default global address space |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
109 | I have no strong feelings either way. This seems more cautious and maybe we want targets to overwrite it explicitly? That said, feel free to patch it. |
I'm not sure this should be so strict by default. At the minimum it should probably allow the datalayout's default global address space