This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes.
ClosedPublic

Authored by craig.topper on Apr 5 2019, 9:51 PM.

Details

Summary

If the upper bits of the SHL result aren't used, we might be able to use a narrower shift. For example, on X86 this can turn a 64-bit shift into a 32-bit shift, enabling a smaller encoding.
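To illustrate why the narrowing is safe, here is a minimal Python sketch of the underlying arithmetic. It is not the SimplifyDemandedBits code; the helper names are hypothetical, and it assumes a shift amount below 32 so that the narrow shift is well defined.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def shl64(x, amt):
    """64-bit shift left, wrapping modulo 2**64."""
    return (x << amt) & MASK64

def shl32(x, amt):
    """32-bit shift left, wrapping modulo 2**32."""
    return (x << amt) & MASK32

def low32_via_wide_shift(x, amt):
    """Shift in 64 bits, then keep only the demanded low 32 bits."""
    return shl64(x, amt) & MASK32

def low32_via_narrow_shift(x, amt):
    """Truncate the operand first, then shift in 32 bits."""
    return shl32(x & MASK32, amt)
```

Both functions agree for every input when `amt < 32`, because bits shifted left can only move toward higher positions and never influence the low half.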

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Apr 5 2019, 9:51 PM
RKSimon added inline comments. Apr 9 2019, 1:19 AM
llvm/test/CodeGen/X86/zext-logicop-shift-load.ll
26 ↗(On Diff #194208)

Are you looking at this as a followup?

spatel added inline comments. Apr 9 2019, 6:15 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11041–11045 ↗(On Diff #194208)

It would be better to split this change off on its own while adding a test specifically for this pattern.

IIUC, we can modify the existing test slightly and show the missing fold:

declare void @t()
define void @tbz_zext(i32 %in) {
  %shl = shl i32 %in, 3
  %t = zext i32 %shl to i64
  %and = and i64 %t, 32
  %cond = icmp eq i64 %and, 0
  br i1 %cond, label %then, label %end

then:
  call void @t()
  br label %end

end:
  ret void
}
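As a rough check of what the fold should achieve for this test, the following Python sketch models the IR above. The function names are hypothetical, and the folded form is an assumption about the intended result (a single-bit test on the input), not something stated in the review.

```python
MASK32 = (1 << 32) - 1

def cond_as_written(x):
    """Mirror the IR: shl i32 by 3, zext to i64, and with 32, compare with 0."""
    shl = (x << 3) & MASK32   # shl i32 %in, 3
    t = shl                   # zext i32 -> i64 keeps the value unchanged
    return (t & 32) == 0      # icmp eq (and %t, 32), 0

def cond_folded(x):
    """Bit 5 of (x << 3) is bit 2 of x, so test bit 2 of the input directly."""
    return (x & 4) == 0
```

The two conditions agree for every 32-bit input, since the zext adds only zero bits above bit 31 and the mask selects bit 5.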

Remove the AArch64 code change. Show the regression instead. I'll work on the separate patch and rebase accordingly, depending on the order in which they get committed.

I think D60482 should go in first, so we avoid that known regression. There's still an open question about the x86 LEA matching. I've seen that or similar matching failures in other tests, so it would be nice to catch it first too.

I wonder if losing the wrapping flags is hurting, although we should be able to use known bits to restore that knowledge. Something like this?

define i64 @lea(i64 %t0, i32 %t1) {
  %t4 = add nuw nsw i32 %t1, 8
  %sh = shl nsw i32 %t4, 2
  %t5 = zext i32 %sh to i64
  %t6 = add i64 %t5, %t0
  ret i64 %t6
}

Produces:
leal 32(,%rsi,4), %eax
addq %rdi, %rax

Instead of:
leaq 32(%rdi,%rsi,4), %rax
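The role of the wrapping flags can be sketched in Python as below. This is only an arithmetic model with hypothetical function names. It assumes that %rsi holds %t1 zero-extended to 64 bits, which is not stated in the thread.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def two_instruction_form(t0, t1):
    """leal 32(,%rsi,4), %eax followed by addq %rdi, %rax."""
    eax = (32 + 4 * (t1 & MASK32)) & MASK32  # leal wraps at 32 bits, zero-extends into rax
    return (t0 + eax) & MASK64

def single_lea_form(t0, t1):
    """leaq 32(%rdi,%rsi,4), %rax, assuming %rsi is t1 zero-extended."""
    return (32 + t0 + 4 * (t1 & MASK32)) & MASK64

def flags_hold(t1):
    """True when add nuw nsw by 8 and shl nsw by 2 do not wrap for this t1."""
    t4 = t1 + 8
    return 0 <= t1 and t4 < 2**31 and (t4 << 2) < 2**31
```

When `flags_hold(t1)` is true the two forms produce the same 64-bit value. Without that guarantee they can differ, for example at `t1 = 1 << 30`, where the 32-bit computation wraps.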

For the LEA regression here, we need to teach foldMaskedShiftToScaledMask to look through the any_extend to find the shift and reinsert the any_extend in the new ordering.

craig.topper retitled this revision from [TargetLowering][X86][AArch64] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes. to [TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes..
craig.topper removed a reviewer: t.p.northover.
craig.topper edited the summary of this revision.
spatel accepted this revision. Apr 11 2019, 3:55 PM

LGTM

This revision is now accepted and ready to land. Apr 11 2019, 3:55 PM
This revision was automatically updated to reflect the committed changes.

I just pushed rGd8b9ed72ee83, a testcase affected by this patch.