This is an archive of the discontinued LLVM Phabricator instance.

[VE] Change the behaviour of truncate
ClosedPublic

Authored by kaz7 on Nov 27 2020, 6:10 AM.

Details

Summary

Change how i64 is truncated to i32 in I64 registers. VE previously
assumed sign-extended (sext) values. Change it to zero-extended (zext)
values to match LLVM's behaviour.
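
To make the two conventions concrete, here is a minimal sketch that models a 64-bit register as a Python integer. The helper names `trunc_sext` and `trunc_zext` are hypothetical and only illustrate the old and new behaviour described above; they are not taken from the patch.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def trunc_sext(reg64: int) -> int:
    """Old convention: keep the low 32 bits and replicate bit 31 upward."""
    low = reg64 & MASK32
    if low & 0x8000_0000:
        return low | (MASK64 ^ MASK32)  # set the upper 32 bits
    return low


def trunc_zext(reg64: int) -> int:
    """New convention: keep the low 32 bits and clear the upper 32 bits."""
    return reg64 & MASK32


value = 0x1234_5678_8000_0001
print(hex(trunc_sext(value)))  # 0xffffffff80000001
print(hex(trunc_zext(value)))  # 0x80000001
```

The two results differ only when bit 31 of the truncated value is set; for values below 2^31 both conventions leave the same register contents.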

Event Timeline

kaz7 created this revision. Nov 27 2020, 6:10 AM
kaz7 requested review of this revision. Nov 27 2020, 6:10 AM

I understand this is necessary for VE ABI compliance. Is there anything in LLVM that says a truncate has to zero-out all leading bits?

kaz7 added a comment. Edited Nov 27 2020, 2:54 PM

I understand this is necessary for VE ABI compliance. Is there anything in LLVM that says a truncate has to zero-out all leading bits?

Very good question, I think. I have not found any documentation that says so. However, LLVM appears to assume that all leading bits are zeroed out after a truncate. For example, DAGCombiner has the following optimization, and it requires zero-out when logic_op is XOR.

// logic_op (truncate x), (truncate y) --> truncate (logic_op x, y)
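
As a rough illustration of that combine, the sketch below models truncate as masking to the low 32 bits (an assumption that matches the zero-out convention discussed here) and checks that a bitwise operation gives the same result whether it is applied before or after the truncate. The function names are hypothetical and this is not DAGCombiner code.

```python
import operator

MASK32 = 0xFFFF_FFFF


def trunc32(x: int) -> int:
    """Model truncate under the zero-out convention: keep only the low 32 bits."""
    return x & MASK32


def commutes_with_trunc(op, x: int, y: int) -> bool:
    """Check logic_op(trunc x, trunc y) == trunc(logic_op(x, y))."""
    return op(trunc32(x), trunc32(y)) == trunc32(op(x, y))


x, y = 0xDEAD_BEEF_8000_00FF, 0x0123_4567_7FFF_FF0F
for op in (operator.and_, operator.or_, operator.xor):
    assert commutes_with_trunc(op, x, y)

# The result of combining two zero-extended values is itself zero-extended.
assert (trunc32(x) ^ trunc32(y)) >> 32 == 0
```

In this model, AND, OR and XOR of two zero-extended operands never set any of the upper 32 bits, so the combined form needs a single truncate instead of two.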

I also see that Mips64InstrInfo.td tries to one-out (sign-extend) all leading bits after truncate (EDIT: this may not fit as an example of zero-out):

// truncate
def : MipsPat<(trunc (assertsext GPR64:$src)),
              (EXTRACT_SUBREG GPR64:$src, sub_32)>, ISA_MIPS3, GPR_64;
// The forward compatibility strategy employed by MIPS requires us to treat
// values as being sign extended to an infinite number of bits. This allows
// existing software to run without modification on any future MIPS
// implementation (e.g. 128-bit, or 1024-bit). Being compatible with this
// strategy requires that truncation acts as a sign-extension for values being
// fed into instructions operating on 32-bit values. Such instructions have
// undefined results if this is not true.
// For our case, this means that we can't issue an extract_subreg for nodes
// such as (trunc:i32 (assertzext:i64 X, i32)), because the sign-bit of the
// lower subreg would not be replicated into the upper half.
def : MipsPat<(trunc (assertzext_lt_i32 GPR64:$src)),
              (EXTRACT_SUBREG GPR64:$src, sub_32)>, ISA_MIPS3, GPR_64;
def : MipsPat<(i32 (trunc GPR64:$src)),
              (SLL (EXTRACT_SUBREG GPR64:$src, sub_32), 0)>, ISA_MIPS3, GPR_64;
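
The effect of these three patterns can be sketched as follows. On MIPS64, `SLL` with a shift amount of 0 sign-extends the low 32 bits of its operand into the full 64-bit register, which is what `sll0` models. The run-time check `is_sign_extended` is a hypothetical stand-in for the compile-time knowledge that `assertsext` and `assertzext_lt_i32` provide.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def sll0(reg64: int) -> int:
    """Model `SLL rd, rs, 0` on MIPS64: sign-extend the low 32 bits to 64 bits."""
    low = reg64 & MASK32
    if low & 0x8000_0000:
        return low | (MASK64 ^ MASK32)
    return low


def is_sign_extended(reg64: int) -> bool:
    """True if the register already holds a sign-extended 32-bit value."""
    return reg64 == sll0(reg64)


# A value zero-extended from fewer than 32 bits has bit 31 clear, so it is
# already in sign-extended form and EXTRACT_SUBREG alone is enough.
assert is_sign_extended(0xFFFF)

# A value zero-extended from exactly 32 bits may have bit 31 set while the
# upper half is zero; it needs the SLL to replicate the sign bit.
assert not is_sign_extended(0x8000_0000)
print(hex(sll0(0x8000_0000)))  # 0xffffffff80000000
```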

Regarding the VE ABI: it is complicated. It requires one-out for signed values and zero-out for unsigned values. Therefore, I previously made LLVM for VE use one-out (I was thinking only about signed values at that time). Recently, I modified clang for VE to pass all arguments in 64 bits, and this fulfills VE's ABI. Therefore, I decided to change LLVM for VE to use zero-out by default.
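
The rule described in this comment can be sketched as a single helper that widens a narrow argument to 64 bits according to its signedness. The name `extend_to_64` and its parameters are hypothetical; this models only the rule as stated here, not the VE ABI document or the clang implementation.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def extend_to_64(value: int, bits: int, signed: bool) -> int:
    """Widen a `bits`-wide argument to a 64-bit register image.

    Signed values replicate their top bit upward; unsigned values are
    padded with zeros.
    """
    mask = (1 << bits) - 1
    low = value & mask
    if signed and low >> (bits - 1):
        return low | (MASK64 ^ mask)
    return low


print(hex(extend_to_64(-1, 32, signed=True)))            # 0xffffffffffffffff
print(hex(extend_to_64(0xFFFF_FFFF, 32, signed=False)))  # 0xffffffff
```

With the caller already widening every argument this way, the truncate inside the backend no longer has to pick the extension on the argument's behalf.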

simoll accepted this revision. Nov 30 2020, 2:28 AM

Regarding the VE ABI: it is complicated. It requires one-out for signed values and zero-out for unsigned values. Therefore, I previously made LLVM for VE use one-out (I was thinking only about signed values at that time). Recently, I modified clang for VE to pass all arguments in 64 bits, and this fulfills VE's ABI. Therefore, I decided to change LLVM for VE to use zero-out by default.

Fair enough, then.

This revision is now accepted and ready to land. Nov 30 2020, 2:28 AM
This revision was automatically updated to reflect the committed changes.