This is an archive of the discontinued LLVM Phabricator instance.

GlobalISel: Fix narrowScalar for load/store with different mem size
ClosedPublic

Authored by arsenm on Jan 21 2019, 9:53 AM.

Details

Summary

This was ignoring the memory size, and producing multiple loads/stores
if the operand size was different from the memory size.

I assume this is the intent of not having an explicit G_ANYEXTLOAD
(although I think that would probably be better).

Diff Detail

Event Timeline

arsenm created this revision.Jan 21 2019, 9:53 AM

In what situations can these kinds of loads be generated?

At least for AArch64, the extending loads combiner only runs before legalisation, so it wouldn't have a chance to combine these. In the absence of a G_ANYEXTLOAD opcode, I think having an explicit G_ZEXTLOAD is preferable here even though it's slightly pessimistic.

@dsanders thoughts?

In what situations can these kinds of loads be generated?

At least for AArch64, the extending loads combiner only runs before legalisation, so it wouldn't have a chance to combine these. In the absence of a G_ANYEXTLOAD opcode, I think having an explicit G_ZEXTLOAD is preferable here even though it's slightly pessimistic.

@dsanders thoughts?

I'm operating under the assumption that eventually we'll have canonicalizations that look like what SelectionDAG does today, which involves producing extloads like this.

On AMDGPU in some cases on some sub targets, there is a codegen difference between zextload and aextload so I think we should have a way to distinguish these. The current apparent representation choice I think is bug prone. I have a few patches I haven't posted yet fixing legalization bugs from inconsistently assuming the result size is the same as the memory size.

In what situations can these kinds of loads be generated?

At least for AArch64, the extending loads combiner only runs before legalisation, so it wouldn't have a chance to combine these. In the absence of a G_ANYEXTLOAD opcode, I think having an explicit G_ZEXTLOAD is preferable here even though it's slightly pessimistic.

@dsanders thoughts?

I haven't had chance to read the code (I'm juggling quite a few tasks right now) but an any-extending G_LOAD is a load extended with undefined bits. It seems reasonable for narrowScalar to just change the result type so long as the result type is at least as wide as the memory access. If it's narrower then it needs to start splitting the memory access into multiple G_LOADs too

Regarding the comment about not having a chance to combine them: I don't think it's particularly important how the extending G_LOAD came to exist in the MIR as the legalizers job is to take any/all inputs and constrain them. If the target needs to narrow them, there should be a path to do so. Regarding G_ZEXTLOAD being preferable, I think that's target specific. For example, some targets may have loads that target subregisters, effectively any-extending them. In that case, a zextload requires explicit additional code in the output to zero the remainder of the register. In general, I think it's better that the generic code tries not to constrain the code more than necessary. A given target can always map an any-extending load to a zextload if the any-extending load is pointless for them but they can't drop the zextload once it's there.

I assume this is the intent of not having an explicit G_ANYEXTLOAD
(although I think that would probably be better).

FWIW, I have a slight preference for a separate opcode but it's also quite nice to not have to check/modify the opcode when fixing up types.

aemerson accepted this revision.Jan 28 2019, 11:59 AM

LGTM then.

This revision is now accepted and ready to land.Jan 28 2019, 11:59 AM
arsenm closed this revision.Jan 29 2019, 10:12 AM

r352523