This was ignoring the memory size, and producing multiple loads/stores
if the operand size was different from the memory size.
I assume this is the intent of not having an explicit G_ANYEXTLOAD
(although I think that would probably be better).
Differential D57029
GlobalISel: Fix narrowScalar for load/store with different mem size arsenm on Jan 21 2019, 9:53 AM. Authored by
Details
This was ignoring the memory size, and producing multiple loads/stores I assume this is the intent of not having an explicit G_ANYEXTLOAD
Diff Detail Event TimelineComment Actions In what situations can these kinds of loads be generated? At least for AArch64, the extending loads combiner only runs before legalisation, so it wouldn't have a chance to combine these. In the absence of a G_ANYEXTLOAD opcode, I think having an explicit G_ZEXTLOAD is preferable here even though it's slightly pessimistic. @dsanders thoughts? Comment Actions I'm operating under the assumption that eventually we'll have canonicalizations that look like what SelectionDAG does today, which involves producing extloads like this. On AMDGPU in some cases on some sub targets, there is a codegen difference between zextload and aextload so I think we should have a way to distinguish these. The current apparent representation choice I think is bug prone. I have a few patches I haven't posted yet fixing legalization bugs from inconsistently assuming the result size is the same as the memory size. Comment Actions I haven't had chance to read the code (I'm juggling quite a few tasks right now) but an any-extending G_LOAD is a load extended with undefined bits. It seems reasonable for narrowScalar to just change the result type so long as the result type is at least as wide as the memory access. If it's narrower then it needs to start splitting the memory access into multiple G_LOADs too Regarding the comment about not having a chance to combine them: I don't think it's particularly important how the extending G_LOAD came to exist in the MIR as the legalizers job is to take any/all inputs and constrain them. If the target needs to narrow them, there should be a path to do so. Regarding G_ZEXTLOAD being preferable, I think that's target specific. For example, some targets may have loads that target subregisters, effectively any-extending them. In that case, a zextload requires explicit additional code in the output to zero the remainder of the register. In general, I think it's better that the generic code tries not to constrain the code more than necessary. A given target can always map an any-extending load to a zextload if the any-extending load is pointless for them but they can't drop the zextload once it's there.
FWIW, I have a slight preference for a separate opcode but it's also quite nice to not have to check/modify the opcode when fixing up types. |