This is an archive of the discontinued LLVM Phabricator instance.

[DAG] Simplify OR(X,SHL(Y,BW/2)) eq/ne 0/-1 'all/any-of' style patterns
ClosedPublic

Authored by RKSimon on Dec 20 2020, 8:58 AM.

Details

Summary

Attempt to simplify all/any-of style patterns that concatenate 2 smaller integers together into an and(x,y)/or(x,y) + icmp 0/-1 instead.

This is mainly to help some bool predicate reduction patterns where we end up concatenating bool vectors that have been bitcasted to integers.

Diff Detail

Event Timeline

RKSimon created this revision.Dec 20 2020, 8:58 AM
RKSimon requested review of this revision.Dec 20 2020, 8:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2020, 8:58 AM
yubing added a subscriber: yubing.Dec 21 2020, 12:30 AM
RKSimon retitled this revision from [DAG] Simplify icmp(EQ/NE, OR(X,SHL(Y,BW/2), 0/-1) patterns to [DAG] Simplify OR(X,SHL(Y,BW/2)) eq/ne 0/-1 'all/any-of' style patterns.Dec 22 2020, 6:25 AM
RKSimon edited the summary of this revision. (Show Details)
yubing added inline comments.Dec 26 2020, 5:11 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3980–3981

Hi, Simon. It seems that following combine is incorrect?
t16: i1 = setcc t34, Constant:i64<-1>, seteq:ch
t34: i64 = or t30, t33
t33: i64 = shl t31, Constant:i8<32>
into:
t35: i64 = Constant<4294967295>
t36: i64 = and t31, Constant:i64<4294967295>
t37: i64 = and t30, t36
t38: i1 = setcc t37, Constant:i64<4294967295>, seteq:ch

Before combine, t16 is equal to or(x,shl(y,32)) == -1(i64)
But aftercombine, t38 is equal to and(x_low, y_low) ==-1(i32)

This happens in allones_v64i8_and4 with core-avx2

RKSimon added inline comments.Dec 26 2020, 7:22 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3980–3981

I'm sorry but I'm not clear what issue you're saying I've missed in allones_v64i8_and4? Being able to reduce the allbits comparison from i64 to i32 is what we're trying to accomplish.

We check that 'x' (LHS) is zero in the upper bits, so we know the upper half of the compare bits come from 'y_low' and the lower half from 'x_low'.

yubing added inline comments.Dec 27 2020, 7:04 PM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3980–3981

Ah, sorry, I should have seen we check if upper half of x is zero.

any more comments?

llvm/test/CodeGen/X86/movmsk-cmp.ll
2668–2669

BTW - we should be able to catch these concats as well by extending this patch's approach in a future commit

RKSimon updated this revision to Diff 314580.Jan 5 2021, 5:42 AM

Improve handling of AND/OR(OR(X2,SHL(Y2,BW/2)) , OR(X1,SHL(Y1,BW/2)) ) style patterns to fix remaining movmsk patterns

spatel added a comment.Jan 5 2021, 9:07 AM

Does this work with scalars too? Let me know if I missed the test(s).

I'm imagining something like this:
https://alive2.llvm.org/ce/z/JzcBoD

(Seems like we'd want to do that in IR too?)

RKSimon updated this revision to Diff 314853.Jan 6 2021, 4:08 AM

Improve scalar test coverage

spatel added inline comments.Jan 6 2021, 9:03 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3960

Add an example to this comment to make it clearer which patterns we are handling:

// When high 32-bits of i64 X are known clear:
// all bits clear: (X | (Y<<32)) ==  0 --> (X | Y) ==  0
// all bits set:   (X | (Y<<32)) == -1 --> (X & Y) == -1
3970–3971

I think we should check for the opcode + shift amount match in one shot. Otherwise, we can miss patterns that are identical other than the commuted or operands:

define i1 @shl_shl(i16 %x, i16 %y) {
  %zx = zext i16 %x to i64
  %zy = zext i16 %y to i64
  %sx = shl i64 %zx, 32
  %sy = shl i64 %zy, 8
  %or = or i64 %sx, %sy
  %r = icmp eq i64 %or, 0
  ret i1 %r
}

define i1 @shl_shl_commute(i16 %x, i16 %y) {
  %zx = zext i16 %x to i64
  %zy = zext i16 %y to i64
  %sx = shl i64 %zx, 32
  %sy = shl i64 %zy, 8
  %or = or i64 %sy, %sx
  %r = icmp eq i64 %or, 0
  ret i1 %r
}
RKSimon updated this revision to Diff 314932.Jan 6 2021, 10:10 AM

Add commuted pattern handling

RKSimon updated this revision to Diff 314936.Jan 6 2021, 10:35 AM

Add better comment suggested by @spatel

spatel accepted this revision.Jan 6 2021, 10:37 AM

LGTM - see inline for possible missed comment.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3960

I still think it would be nice to add a code comment like this. This code block is big enough that it's hard to recognize the whole patterns that we are transforming.

This revision is now accepted and ready to land.Jan 6 2021, 10:37 AM