Some globals lower to literal addresses on AMDGPU.
This may be wrong for non-integral address spaces. I'm wondering if we
should just allow regular G_ADD to use pointer types, and reserve
G_PTR_ADD for non-integral address spaces.
Paths
| Differential D99033
GlobalISel: Constant fold G_PTR_ADD ClosedPublic Authored by arsenm on Mar 21 2021, 7:03 AM.
Details Summary Some globals lower to literal addresses on AMDGPU. This may be wrong for non-integral address spaces. I'm wondering if we
Diff Detail
Unit TestsFailed Event TimelineHerald added subscribers: kerbowa, hiraditya, tpr and 3 others. · View Herald TranscriptMar 21 2021, 7:03 AM Comment Actions LGTM
Maybe? We've had a few cases with AArch64 where we miss optimizations because we forget to handle both G_ADD and G_PTR_ADD. I guess that knowing immediately that we have a pointer add is nice for matching etc, but I guess we also know that from the LLT. This revision is now accepted and ready to land.Mar 23 2021, 9:53 AM Comment Actions Relaxing G_ADD semantics seems fine to me. Maybe we should still canonicalize on G_PTR_ADD during translation, and allow targets to legalize them into plain G_ADDs. Comment Actions
IRTranslator should definitely translate to G_PTR_ADD and G_PTR_ADD should not be considered equivalent to G_ADD. However for targets where pointers are just integers I see no harm in permitting G_ADD of pointers and targets custom-legalizing/lowering to that. It just shouldn't be something that happens without the targets actively choosing to do it.
Comment Actions Filter out non-integral address spaces for now, although the general handling needs more thought Comment Actions Filtering out non-integral spaces for now makes sense to me.
This revision is now accepted and ready to land.Feb 8 2022, 3:30 PM
Revision Contents
Diff 406998 llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
llvm/lib/CodeGen/GlobalISel/Utils.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-function-args.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
llvm/test/CodeGen/AMDGPU/overlapping-tuple-copy-implicit-op-failure.ll
|
probably worth a comment explaining that this needs more thought?