Page MenuHomePhabricator

[InstCombine] fold a shifted zext to a select
Needs ReviewPublic

Authored by zvi on Jun 15 2019, 4:47 PM.

Details

Reviewers
spatel
nikic
Summary

For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

Fixes pr42257

Diff Detail

Event Timeline

zvi created this revision.Jun 15 2019, 4:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2019, 4:47 PM
Herald added a subscriber: hiraditya. · View Herald Transcript

Don't know if this is the canonicalization we want, but some notes

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
600

vectors of i1 too (isIntOrIntVectorTy(1))

601

this will overflow, you want

ConstantInt::get(Ty, APInt(Ty->getScalarType()->getBitWidth(), 1) << ShAmt)

or something

zvi added a subscriber: zvi.Jun 16 2019, 1:30 AM
zvi updated this revision to Diff 204943.Jun 16 2019, 2:25 AM

Addressed comments by @lebedev.ri
Thanks!

zvi marked 2 inline comments as done.Jun 16 2019, 2:27 AM
lebedev.ri added inline comments.Jun 16 2019, 2:37 AM
llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
661–662

I think you want to add the fold here, no reason to restrict it to identical vectors

zvi marked an inline comment as done.Jun 16 2019, 3:51 AM
zvi added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
661–662

What is the policy for handling non-splat cases, should they always be handled when possible or are they limited only to certain cases to reduce compile-time?

zvi updated this revision to Diff 204944.Jun 16 2019, 4:19 AM

Following suggestion by @lebedev.ri to support non-splat vector shift amount

xbolva00 added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
675

LHS/RHS dont match the comment

zvi updated this revision to Diff 204955.Jun 16 2019, 9:53 AM

Fix typo in comment pointed out by @xbolva00

zvi edited the summary of this revision. (Show Details)Jun 16 2019, 9:53 AM
zvi marked 2 inline comments as done.

IMO, this is the right canonicalization for IR because it's the smallest form. Also if the code was already in this form, it might have profile data on the select condition that would benefit codegen (for example, we might want to compare and branch on this instead of turning it into bit-logic). I've made several related changes to prefer 'select' in IR over bithacks.

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

define i32 @shl_zext_bool(i1 %t) {
  %ext = zext i1 %t to i32
  %shl = shl i32 %ext, 7
  ret i32 %shl
}

define i32 @sel_zext_bool(i1 %t) {
  %shl= select i1 %t, i32 128, i32 0
  ret i32 %shl
}
$ ./llc -o - selsh.ll -mtriple=aarch64--
shl_zext_bool:                          // @shl_zext_bool
	and	w8, w0, #0x1
	lsl	w0, w8, #7
	ret

sel_zext_bool:                          // @sel_zext_bool
	tst	w0, #0x1
	mov	w8, #128
	csel	w0, w8, wzr, ne
	ret
llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
661–662

There's no official policy AFAIK. If it's easy enough to apply the transform to an arbitrary vector constant, then I'd prefer we do it.

There should not be much compile-time overhead because (1) either way, we're checking each element of the vector constant to determine if it's a splat and (2) there probably aren't that many non-splat vector constants to begin with.

zvi marked an inline comment as done.Jun 17 2019, 5:07 PM

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

combineSelectOfTwoConstants seems to be the handler for X86 you mentioned . I like your idea to migrate it to a target-independent combine.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
661–662

This makes sense, thanks.