This is an archive of the discontinued LLVM Phabricator instance.

[X86][BMI][TBM] Only demand bottom 16-bits of the BEXTR control op (PR34042)
ClosedPublic

Authored by RKSimon on Jun 3 2018, 12:12 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Jun 3 2018, 12:12 PM
craig.topper added inline comments.Jun 3 2018, 12:24 PM
lib/Target/X86/X86InstrCompiler.td
2047 ↗(On Diff #149654)

Why aren't these load patterns redundant now?

lebedev.ri added inline comments.Jun 3 2018, 1:24 PM
lib/Target/X86/X86InstrCompiler.td
2051 ↗(On Diff #149654)

Why SExt; ahouldn't ZExt be fine, especially since only low 16 bits will be used?

craig.topper added inline comments.Jun 3 2018, 2:47 PM
lib/Target/X86/X86InstrCompiler.td
2051 ↗(On Diff #149654)

Its consistent with the behavior of the gnu assembler and basically any other 64 bit x86 instruction that only encodes a 32 bit immediate. Doing something else would require a new assembly parser method specifically for this instruction.

Hilariously, the tbm-builtins.c test in clang fails -O0 isel on one of the TBM bextr tests because the immediate is too large.

RKSimon updated this revision to Diff 149704.Jun 4 2018, 3:41 AM

Remove remaining (unnecessary) TBM X86bextr patterns

RKSimon marked an inline comment as done.Jun 4 2018, 3:42 AM

Hilariously, the tbm-builtins.c test in clang fails -O0 isel on one of the TBM bextr tests because the immediate is too large.

I'm not seeing this with the latest version of my patch - do you still have a repro?

It fails on trunk. Your patch probably fixes it by shrinking the immediate. Is there anything that can prevent the DAG combine from running that we should be concerned about? If there is we might not need to clamp the immediate in lowering or isel.

It fails on trunk. Your patch probably fixes it by shrinking the immediate. Is there anything that can prevent the DAG combine from running that we should be concerned about? If there is we might not need to clamp the immediate in lowering or isel.

I don't think TargetLowering::SimplifyDemandedBits alters constants so I might not be repro'ing it properly - my X86ISelDAGToDAG.cpp change might be causing it to fallback to the reg-reg version?

What was the full repro command line please?

Ok its not the exact tbm-builtins.c test after all I think I had modified it when I was testing. This should fail

#include <x86intrin.h>

unsigned long long test__bextri_u64_bigint(unsigned long long a) {
  return __bextri_u64(a, 0x7ffffff00LL);
}

Well 0x7ffffffffLL fails with -O0. https://godbolt.org/g/AC16ui I had to pick 6.0.0 because trunk segfaulted on anything, but I think that must be a problem with godbolt itself.

RKSimon updated this revision to Diff 149956.Jun 5 2018, 5:38 AM

Perform a demanded bits style mask on BEXTR constant masks to fix the existing i32 sign-extend isel issue

craig.topper accepted this revision.Jun 5 2018, 5:09 PM

LGTM. I'll let you decide what to do with the APInts.

lib/Target/X86/X86ISelLowering.cpp
36835 ↗(On Diff #149956)

nit. this can be const APInt &Val.

36836 ↗(On Diff #149956)

You can use Val1 & 0xffff if you want. There's an operator& that takes an APInt and uint64_t.

Or you could do something like.

if (Val1.ugt(0xFFFF)
  return DAG.getNode(X86ISD::BEXTR, SDLoc(N), VT, Op0,
                         DAG.getConstant(Val1 & 0xffff, SDLoc(N), VT));
This revision is now accepted and ready to land.Jun 5 2018, 5:09 PM
This revision was automatically updated to reflect the committed changes.