llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir | ||
---|---|---|
15738 | Why does this extend to s64 for the shifting and ORing? |
Use the correct power of 2, not just ceil. Handling whether or not the widened type is the same as the original is more annoying than it should be.
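For readers following along, here is a minimal sketch of the kind of rounding this update refers to, assuming the goal is to widen an odd bit width to the smallest power of two that can hold it. The helper names `powerOf2CeilBits` and `widenedTypeDiffers` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): smallest power of two that is
// greater than or equal to Bits.
inline uint32_t powerOf2CeilBits(uint32_t Bits) {
  // Guard against zero and against overflow of the shift below.
  assert(Bits > 0 && Bits <= (1u << 31));
  uint32_t P = 1;
  while (P < Bits)
    P <<= 1;
  return P;
}

// Hypothetical helper: true when rounding up actually changes the width,
// i.e. the widened type is not the same as the original type.
inline bool widenedTypeDiffers(uint32_t Bits) {
  return powerOf2CeilBits(Bits) != Bits;
}
```

Under these assumptions a 24-bit width rounds up to 32 bits, while a 32-bit width is left alone, which is the "same or not" distinction the comment calls annoying to handle.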
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir | ||
---|---|---|
15995–16020 | Why is this so much more convoluted than the GFX9-HSA case? The loads look the same, it's just all the shifting and ORing afterwards that looks crazy here. |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir | ||
---|---|---|
15995–16020 | This is an artifact of the terrible way we currently handle unaligned accesses. We treat it as a narrowScalar action, which doesn't really make sense. I'm trying to move towards making widenScalar/narrowScalar only touch the register size and leave the memory access alone. Unaligned access decomposition is a kind of lowering, and is only tangentially related to the register types needed after legalization. The HSA case enables unaligned access and the mesa case doesn't, so in the mesa case we start out by reporting that we need to narrow the s32 result to s24. When that load is legalized, it ends up producing this mess. Once lowering handles alignment decomposition, the two cases should look the same. |
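As a conceptual illustration of the shifting and ORing discussed above, the sketch below assembles a value from individual byte loads, assuming little-endian byte order. It is plain C++, not the MIR the legalizer emits, and the function name `loadUnalignedLE` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration (not legalizer code): build a value of up to
// 32 bits from NumBytes separate byte loads, so that no load requires more
// than 1-byte alignment.
inline uint32_t loadUnalignedLE(const uint8_t *Ptr, size_t NumBytes) {
  assert(NumBytes <= 4);
  uint32_t Result = 0;
  for (size_t I = 0; I < NumBytes; ++I) {
    // Zero-extend each byte, shift it into position, and OR it in.
    Result |= static_cast<uint32_t>(Ptr[I]) << (8 * I);
  }
  return Result;
}
```

With `NumBytes` set to 3 this models an s24 result held in a 32-bit register: the loads are the same in every case, and only the combining arithmetic afterwards varies.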