HomePhabricator

GlobalISel: Do not change register types in lowerLoad

Authored by arsenm on May 18 2021, 2:05 PM.

Description

GlobalISel: Do not change register types in lowerLoad

Adjusting the load register type is a widenScalar type action, not a
lowering. lowerLoad should be reserved for operations that change the
memory access size, such as unaligned load decomposition. With this
trying to adjust the register type, it was hard to avoid infinite
loops in the legalizer. Adds a bandaid to avoid regressing a few
AArch64 tests, but I'm not sure what the exact condition is and
there's probably a cleaner way to do this.

For AMDGPU this regresses handling of some cases for unaligned loads,
but the way this is currently working is a pretty ugly hack.

Details

Committed
arsenmMay 27 2021, 8:49 AM
Parents
rG7922ff601094: [AIX] Add -lc++abi and -lunwind for linking
Branches
Unknown
Tags
Unknown

Event Timeline

Herald added a subscriber: Restricted Project. · View Herald TranscriptMay 27 2021, 8:49 AM
dsanders added inline comments.
/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2756

Hmm, this path isn't lowering extending loads AFAICT. It appears to be narrowScalar'ing the MMO type. Lower is supposed to stop using the operation that it's lowering but lowering a non-extending G_LOAD still results in a non-extending G_LOAD (and other operations).

2813–2834

This removal means that lowering no longer does what the comment on line 2749 says the lowering does. The intent was to lower extending loads to non-extending loads to undo the formation of extending loads on targets that don't have them. It's not a widenScalar type operation here as the new result type is exactly the same as the original code. It does use a narrower intermediate which is a side effect of lowering an extending load to a non-extending load so I can kind of see the argument that it's a narrowScalar. The line is a bit blurry as a result but I'd say that if this path isn't a lower() then the other path isn't either as it too narrows scalars (and it's already on shaky ground as it doesn't lower away the input operation.

arsenm added inline comments.Jun 1 2021, 6:15 AM
/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2813–2834

Having anything touching the load result type here in lowerload, which does not have an explicit type hint, is nonsense. Extending or non-extending loads is not a binary choice and the target needs to explicitly select the result type to use. I want to remove touching the memory type from the narrowScalar/widenScalar paths in a future change and leave any memory access size modification to a lowerLoad like function (i.e. the narrowScalar/widenScalars should select the register types to use only, and memory access size is a different problem entirely)

dsanders added inline comments.Jun 1 2021, 1:52 PM
/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2813–2834

Extending or non-extending loads is not a binary choice

Some targets don't have extending loads (except for non-pow-2 types). Removing this code leaves me with no path to legalize G_[SZ]EXTLOAD into G_LOAD + G_[SZ]EXT that I'm aware of. I suppose an alternative path would be:

G_SEXTLOAD -> any-extending G_LOAD + G_SEXT_IN_REG -> non-extending G_LOAD + G_SEXT
G_ZEXTLOAD -> any-extending G_LOAD + G_AND ..., mask -> non-extending G_LOAD + G_ZEXT

but that path doesn't exist yet AFAIK

arsenm added inline comments.Jun 1 2021, 2:36 PM
/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2813–2834

narrowScalar does do this (well, it emits G_SEXT/G_ZEXT which theoretically the artifact combiner should turn into sextinreg/and)

dsanders added inline comments.Jun 1 2021, 2:49 PM
/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
2813–2834

Ok, I think I see the route to getting it working again without reverting this. I'll give it a go.

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

I haven't double checked this but lower*() should get you to a separate s64 G_LOAD and G_ZEXT like this code used to do. Then narrowScalar*() on the G_LOAD should break it into two s32 G_LOAD's.

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

This is not a lowering, this is a narrowScalar of the result type. I'm trying to move towards narrowScalar never touching the memory type

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

This is not a lowering, this is a narrowScalar of the result type. I'm trying to move towards narrowScalar never touching the memory type

I'm confused by this. You can't narrowScalar this below the memory type without splitting into multiple loads with smaller memory types since none of the load instructions are allowed to be truncating.

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

This is not a lowering, this is a narrowScalar of the result type. I'm trying to move towards narrowScalar never touching the memory type

I'm confused by this. You can't narrowScalar this below the memory type without splitting into multiple loads with smaller memory types since none of the load instructions are allowed to be truncating.

You can narrowScalar to a regular s64 load + zext. Whatever happens to that 64-bit load is a separate question

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

This is not a lowering, this is a narrowScalar of the result type. I'm trying to move towards narrowScalar never touching the memory type

I'm confused by this. You can't narrowScalar this below the memory type without splitting into multiple loads with smaller memory types since none of the load instructions are allowed to be truncating.

You can narrowScalar to a regular s64 load + zext. Whatever happens to that 64-bit load is a separate question

Ah I'm recalling the change backwards. The change to s64 load + zext was in lower*() and narrowScalar*() and is now only in narrowScalar*(). Sorry about that, the function name in phabricator was hidden in the fold.

Is anyone already looking into supporting narrowing the result type below the memory size?

With this change, we cannot legalize %1:_(s128) = G_ZEXTLOAD %0:_(p0) :: (load (s64)) anymore, as s32 is the only legal result type for G_ZEXTLOAD on our target.

This is not a lowering, this is a narrowScalar of the result type. I'm trying to move towards narrowScalar never touching the memory type

I'm confused by this. You can't narrowScalar this below the memory type without splitting into multiple loads with smaller memory types since none of the load instructions are allowed to be truncating.

You can narrowScalar to a regular s64 load + zext. Whatever happens to that 64-bit load is a separate question

Yes that’s what I went with in the end, clamping to {s32,s64}, then legalizing the s64 G_LOAD.

Would be nice to support clamping to s32 in one step in the future though.