This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support.
ClosedPublic

Authored by aemerson on Jun 16 2020, 11:54 PM.

Details

Summary

We were defaulting to the lower action for this, resulting in SHL+ASHR sequences. On AArch64 we can do this in one instruction for an arbitrary extension using SBFM as we do for G_SEXT.

Diff Detail

Event Timeline

aemerson created this revision.Jun 16 2020, 11:54 PM
This revision is now accepted and ready to land.Jun 18 2020, 4:56 PM
This revision was automatically updated to reflect the committed changes.

It looks like this patch is causing incorrect code generation for

signed int x = -0x8000;
int fn(int *ptr) {
    return ptr[(signed short)(unsigned short)(x - 0x8000)];
}

we have initially

%6:_(s32) = nsw G_SUB %3:_, %5:_
%7:_(s16) = G_TRUNC %6:_(s32)
%8:_(s64) = G_SEXT %7:_(s16)

which then gets converted to

%6:_(s32) = nsw G_SUB %3:_, %5:_
%16:_(s64) = G_ANYEXT %6:_(s32)
%8:_(s64) = G_SEXT_INREG %16:_, 16

The truncation has been lost, and we eventually end up with

subs    w9, w9, #8, lsl #12
mov     w0, w9
ldr     w0, [x8, w0, uxtw #2]
%6:_(s32) = nsw G_SUB %3:_, %5:_
%7:_(s16) = G_TRUNC %6:_(s32)
%8:_(s64) = G_SEXT %7:_(s16)

which then gets converted to

%6:_(s32) = nsw G_SUB %3:_, %5:_
%16:_(s64) = G_ANYEXT %6:_(s32)
%8:_(s64) = G_SEXT_INREG %16:_, 16

The truncation has been lost, and we eventually end up with

The anyext is fine there, the SEXT_INREG should do the right thing and sign extend from bit 16, however it looks like we have a bug in the selector in this case. Fix incoming.

%6:_(s32) = nsw G_SUB %3:_, %5:_
%7:_(s16) = G_TRUNC %6:_(s32)
%8:_(s64) = G_SEXT %7:_(s16)

which then gets converted to

%6:_(s32) = nsw G_SUB %3:_, %5:_
%16:_(s64) = G_ANYEXT %6:_(s32)
%8:_(s64) = G_SEXT_INREG %16:_, 16

The truncation has been lost, and we eventually end up with

The anyext is fine there, the SEXT_INREG should do the right thing and sign extend from bit 16, however it looks like we have a bug in the selector in this case. Fix incoming.

I've pushed a fix in 97a34b5f8d2e, thanks for the reproducer.