This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Zero-extend s1 values when returning.
ClosedPublic

Authored by aemerson on May 27 2018, 9:14 AM.

Details

Summary

Before we were relying on the any extend of the s1 to s32, but for AAPCS we need to zero-extend it to at least s8.

Fixes PR36719

Diff Detail

Repository
rL LLVM

Event Timeline

aemerson created this revision.May 27 2018, 9:14 AM

I got my branches mixed up, this patch is missing test updates. Will upload a new one soon.

rtereshin added inline comments.
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

I don't think this is the right way of achieving this and not the way we seem to be currently handling G_CONSTANTs signs across the Global ISel either.

The IR ConstantInt that is a part of G_CONSTANT's operand is basically a bag of bits, if it gets extended, the extend is pretty much an anyext, as different usages of it may need different extensions / may interpret the bits differently.

Therefore every use that is not happy with anyext is explicitly extending the value the right way.

Try ./bin/llc -O0 -mtriple aarch64-- div.ll -o - -stop-after=legalizer -simplify-mir on the following snippets for example:

define i8 @udiv_with_constants(i8 %a) {
entry:
  %res = udiv i8 %a, 250
  ret i8 %res
}
define i8 @sdiv_with_constants(i8 %a) {
entry:
  %res = sdiv i8 %a, 250
  ret i8 %res
}

There are other instructions that are sensitive to the extensions of constants, like all the shifts, for instance. They solve this problem the same way as divisions / remainders.

Technically, the same constant may have more than one usage that would interpret it differently WRT to sign as well.

So this is ABI boundary problem, not legalizing G_CONSTANTs problem, IMO.

I think we need to explicitly extend / mask values before storing them into memory / after loading them from memory, before or after returning them from functions / passing them into functions.

i1s are also a little special. The comment to TargetLowering::getBooleanContents suggests that it's only meaningful for i1 produced by a limited set of instructions and "not to be confused with general values promoted from i1", that's probably why some targets prefer to call such instructions producing i1 legal as soon as those values are immediately used by another instructions from the same subset of special instructions, and then just select the whole thing.

In other words, the flow of i1s is limited.

I think for the test case included this needs to be a change to AArch64CallLowering::lowerReturn method instead.

+ @aditya_nandakumar
+ @dsanders

rtereshin added inline comments.May 29 2018, 1:59 PM
test/CodeGen/AArch64/GlobalISel/legalize-bool.mir
7

Defining this function with zeroext return value attribute, like test/CodeGen/AArch64/arm64-xaluo.ll does for instance, seems to fix the problem w/ no changes to the compiler.

29

The input explicitly says anyext s1 to s32, it doesn't looks like it's legalizer's problem to fix this.

dsanders added inline comments.May 29 2018, 2:35 PM
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

There's actually two issues here:

  1. How do we widen scalars?
  2. How do we match the ABI?

For widening scalars, the answer is that we any-extend. However, we can't actually extend with undefined bits so we've been using sext as a placeholder. G_TRUNC/G_SEXT/G_ZEXT can then update the constant as they get optimized. Some targets find it easier to materialize signed immediates and some find unsigned immediates easier so I can see some value in asking the target which it wants to do. That's more of an optimization thing rather than an ABI correctness thing though and it's not really connected with the representation of s1.

For matching the ABI, targets that implement the signext and zeroext attributes can use them to ensure that LLVM inserts a G_SEXT/G_ZEXT as appropriate to extend the returned result. In principle, I think we should be using that. In practice however, most targets have been getting away with not using those attributes for a very long time (instead, they're just consistent about the extend they use) and we're not going to be a drop-in replacement if we don't behave the same way as before without them. I think the right place to drive this in AArch64CallLowering::lowerReturn(). If that code injects an appropriate extend then the ABI will be correct.

rtereshin added inline comments.
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

Agreed. I would like to add also, that judging from this https://reviews.llvm.org/D44410#1036451 comment TargetLowering::getBooleanContents might not be the right way to ask a target about its preferences even optimization-wise.

Now I'm worried about the usage of that method in our G_STORE legalization code, looks like we should be always zero-extending when storing to memory. +@efriedma

aemerson added inline comments.May 29 2018, 3:00 PM
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

We can't look at more than one instruction when we legalise, so won't we need to always zero extend i1s?

dsanders added inline comments.May 29 2018, 3:04 PM
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

Now I'm worried about the usage of that method in our G_STORE legalization code, looks like we should be always zero-extending when storing to memory

That one is directly dealing with ABI issues so it seems correct to me that it's asking the target which extend it wants to use for the store (that doesn't mean that getBooleanContents() is the right function for the query though). clang should be inserting code to convert it to 0 or 1 when it needs the value to be correct w.r.t C/C++.

We can't look at more than one instruction when we legalise, so won't we need to always zero extend i1s?

Could you elaborate on this? I believe we're only looking at one instruction at a time.

aemerson added inline comments.May 29 2018, 3:20 PM
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
756

Yes never mind I was confused with what we were currently doing with our store legalisation.

I can't find any place where the targets have specified before which kind of extension they want for stores.

If possible, GlobalISel's handling of booleans should match SelectionDAG's handling:

  1. Legalization for ISD::STORE always zero-extends integers which are not byte-sized (see SelectionDAGLegalize::LegalizeStoreOps).
  2. getBooleanContents() specifies the extension of the result of various arithmetic operations which produce and consume "boolean" values; on the producing side, SETCC/SADDO/etc., and on the consuming side, SELECT/BRCOND/ADDCARRY/etc. This is documented in ISDOpcodes.
  3. i1 values produced by operations which are not boolean operations don't need to have any particular kind of extension.

The third rule is arguably weird, but it avoids scattering special cases for i1 across legalization. I don't see any compelling reason for GlobalISel to diverge.

The ABI rule for booleans on AArch64 is a little strange. The value must be passed/returned as if it were converted to unsigned char, but it doesn't need to be zero-extended beyond 8 bits. (See examples at https://bugs.llvm.org/show_bug.cgi?id=36719 .) This should be handled somewhere in the ABI code, not legalization.

I'll re-do this patch to unconditionally zero-extend for returns and also fix up the G_STORE legalisation to always zero-extend too. Given we don't have any target hooks that give us the information we need, let's go with just matching SelectionDAG behaviour?

As for the zeroext attributes, from what I can tell we've had the distinction between Darwin and non-Darwin with this since the beginning of the AArch64 target, which was contributed by @t.p.northover. I'd rather not be messing with that unless I have to.

aemerson updated this revision to Diff 149282.May 31 2018, 6:50 AM
aemerson retitled this revision from [GlobalISel][Legalizer] Fix i1s being sign extended instead of zero-extended to [AArch64][GlobalISel] Zero-extend s1 values when returning..
aemerson edited the summary of this revision. (Show Details)

New patch now only zero-extends for stores, removing the use of getBooleanContents.

We already have test coverage for the store zero-extension.

The return issue is fixed in AArch64CallLowering instead.

rtereshin accepted this revision.May 31 2018, 8:36 AM

Please take a look at the small nitpick, otherwise LGTM. Thanks!

lib/Target/AArch64/AArch64CallLowering.cpp
237 ↗(On Diff #149282)

I think the preferred way of doing this would be VReg = MIRBuilder.buildZExt(LLT::scalar(8), VReg)->getOperand(0).getReg(); just to avoid the explicit MachineRegisterInfo::createGenericVirtualRegister call. +@aditya_nandakumar

This revision is now accepted and ready to land.May 31 2018, 8:36 AM
efriedma added inline comments.May 31 2018, 11:24 AM
lib/Target/AArch64/AArch64CallLowering.cpp
224 ↗(On Diff #149282)

Do you also need to modify argument passing, to handle something like void a(void x(_Bool)) { x(1); }?

lib/Target/AArch64/AArch64CallLowering.cpp
237 ↗(On Diff #149282)

That would be my preference as well.

aemerson added inline comments.Jun 1 2018, 6:24 AM
lib/Target/AArch64/AArch64CallLowering.cpp
224 ↗(On Diff #149282)

That looks to work already, we zero extend for that case:

%0:_(p0) = COPY $x0
%3:_(s1) = G_CONSTANT i1 true
%1:_(p0) = G_FRAME_INDEX %stack.0.x.addr
G_STORE %0(p0), %1(p0) :: (store 8 into %ir.x.addr)
%2:gpr64(p0) = G_LOAD %1(p0) :: (load 8 from %ir.x.addr)
ADJCALLSTACKDOWN 0, 0, implicit-def $sp, implicit $sp
%4:_(s32) = G_ZEXT %3(s1)
$w0 = COPY %4(s32)
BLR %2(p0), csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $w0
ADJCALLSTACKUP 0, 0, implicit-def $sp, implicit $sp
RET_ReallyLR
237 ↗(On Diff #149282)

Will do, thanks.

This revision was automatically updated to reflect the committed changes.