⚙ D149001 [InstSimplify] sdiv a (1 srem b) --> a

floatshadow created this revision. Apr 22 2023, 11:14 AM

Herald added a project: Restricted Project. Apr 22 2023, 11:14 AM

Herald added subscribers: hoy, StephenFan, hiraditya.

floatshadow requested review of this revision. Apr 22 2023, 11:14 AM

Herald added a subscriber: llvm-commits.

floatshadow added reviewers: spatel, xbolva00. Apr 22 2023, 11:20 AM

This is missing a lot of conjugate patterns. You can do the same for udiv and there are related patterns for urem and srem as well. You want to extend this code: https://github.com/llvm/llvm-project/blob/1eb74f7e83ffb3f9d00e5987cead3b12e00bbe82/llvm/lib/Analysis/InstructionSimplify.cpp#L1053-L1061

This revision now requires changes to proceed. Apr 22 2023, 11:40 AM

Harbormaster completed remote builds in B227473: Diff 516089. Apr 22 2023, 12:01 PM

This patch now also handles the other related patterns: sdiv/udiv X, (1 srem/urem Y) --> X and srem/urem X, (1 srem/urem Y) --> 0
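The arithmetic behind these folds can be checked exhaustively on a small bit width. The following is only a sketch in plain C++, not LLVM code: the function names are hypothetical, and it assumes 8-bit values are representative. It relies on the fact that 1 rem Y is always 0 or 1, and that a divisor of 0 is undefined behavior, so the divisor may be treated as 1.

```cpp
// Checks, over all 8-bit signed values, that D = 1 srem Y is always 0 or 1,
// and that whenever D is nonzero: X sdiv D == X and X srem D == 0.
// (Hypothetical helper for illustration only.)
bool signedFoldsHoldForI8() {
  for (int Y = -128; Y <= 127; ++Y) {
    if (Y == 0)
      continue; // 1 srem 0 is undefined, nothing to check
    int D = 1 % Y; // C++ '%' truncates toward zero, like srem
    if (D != 0 && D != 1)
      return false;
    if (D == 0)
      continue; // dividing by 0 is undefined, so any fold is allowed
    for (int X = -128; X <= 127; ++X)
      if (X / D != X || X % D != 0)
        return false;
  }
  return true;
}

// Same check for the unsigned variants (udiv / urem).
bool unsignedFoldsHoldForI8() {
  for (unsigned Y = 1; Y <= 255; ++Y) {
    unsigned D = 1u % Y;
    if (D > 1)
      return false;
    if (D == 0)
      continue; // dividing by 0 is undefined, so any fold is allowed
    for (unsigned X = 0; X <= 255; ++X)
      if (X / D != X || X % D != 0)
        return false;
  }
  return true;
}
```

Both functions are expected to return true, which matches the claim that the division folds to X and the remainder folds to 0.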

junaire added a subscriber: junaire. Apr 22 2023, 7:39 PM

junaire added inline comments.

llvm/test/Transforms/InstSimplify/div.ll
Line 437 (On Diff #516108): Split the tests into a separate revision so we can see the diff more clearly. (You can add a parent revision

Harbormaster completed remote builds in B227484: Diff 516108. Apr 22 2023, 7:43 PM

floatshadow edited the summary of this revision. Apr 22 2023, 8:53 PM

floatshadow mentioned this in D149008: [InstSimplify] Test for D149001. Apr 22 2023, 9:13 PM

floatshadow added a child revision: D149008: [InstSimplify] Test for D149001.

floatshadow removed a child revision: D149008: [InstSimplify] Test for D149001.

floatshadow added a parent revision: D149008: [InstSimplify] Test for D149001.

clean diff following what junaire proposes

Depends on D149012

floatshadow edited parent revisions, added: D149012: [InstSimplify] Test case for D149001; removed: D149008: [InstSimplify] Test for D149001. Apr 23 2023, 12:35 AM

Thanks! This will still miss other patterns that can only be zero or one. For example, a mask with 1: https://alive2.llvm.org/ce/z/A_ffYe

You can handle these by calling computeKnownBits() and checking the number of leading zeros.
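To illustrate the idea of checking leading zeros, here is a toy model in plain C++. It is not LLVM's KnownBits class or the real computeKnownBits(); the struct and every function name below are hypothetical, and it assumes a fixed 32-bit width. A value can only be zero or one when all bits except the lowest are known to be zero, i.e. when at least 31 leading bits are known zero.

```cpp
#include <cstdint>

// Toy stand-in for known-bits information on a 32-bit value.
struct ToyKnownBits {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1
};

// A constant has every bit known.
ToyKnownBits knownConstant(uint32_t C) { return {~C, C}; }

// An arbitrary value has no bits known.
ToyKnownBits knownNothing() { return {0, 0}; }

// Known bits of (A & B): a result bit is 0 if it is 0 in either operand,
// and 1 only if it is 1 in both.
ToyKnownBits knownAnd(ToyKnownBits A, ToyKnownBits B) {
  return {A.Zero | B.Zero, A.One & B.One};
}

// Number of high bits that are known to be zero.
unsigned minLeadingZeros(ToyKnownBits K) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (K.Zero & Bit); Bit >>= 1)
    ++N;
  return N;
}

// True if the value can only be 0 or 1 (bit width 32 is assumed here).
bool canOnlyBeZeroOrOne(ToyKnownBits K) { return minLeadingZeros(K) >= 31; }
```

In this model, `canOnlyBeZeroOrOne(knownAnd(knownNothing(), knownConstant(1)))` holds for the "mask with 1" case, while a mask with 3 does not qualify.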

This revision now requires changes to proceed. Apr 23 2023, 12:52 AM

Harbormaster completed remote builds in B227500: Diff 516129. Apr 23 2023, 1:29 AM

use computeKnownBits() to check if the divisor can only be zero or one.

Another question: can this method replace the original pattern matches such as match(Op1, m_One())? I am worried about the performance of computeKnownBits().

Harbormaster completed remote builds in B227516: Diff 516148. Apr 23 2023, 4:13 AM

@nikic I checked the PR51762 test in Transforms/InstCombine/zext-or-icmp.ll

%lor.ext = zext i1 %spec.select57 to i32
%t2 = load i32, ptr %d, align 4
%conv15 = sext i16 %t1 to i32
%cmp16 = icmp sge i32 %t2, %conv15
%conv17 = zext i1 %cmp16 to i32
%t3 = load i32, ptr %f, align 4
%add = add nsw i32 %t3, %conv17
store i32 %add, ptr %f, align 4
%rem18 = srem i32 %lor.ext, %add
%conv19 = zext i32 %rem18 to i64

%div = udiv i64 %insert.insert41, %conv19
%trunc33 = trunc i64 %div to i32
store i32 %trunc33, ptr %d, align 8
%r = icmp ult i64 %insert.insert41, %conv19
call void @llvm.assume(i1 %r)

Since %conv19 = zext (srem (zext i1), %add), %conv19 can only be zero or one.
On the main branch, the llvm.assume is used to deduce %insert.insert41 = 0, so store i32 %trunc33, ptr %d, align 8 becomes store i32 0, ptr %d, align 8.
Tracing shows the call stack: simplifyUdiv --> isDivZero --> simplifyICmpWithDominatingAssume

With the current patch, the udiv seems to enter simplifyDivRem first, which does the fold and replaces %div with %insert.insert41 (via computeKnownBits).
Later, the trunc's operand %insert.insert41 is replaced with %insert.ext39, as the opt debug log shows:

ADD DEFERRED:   %insert.insert41 = or i64 %insert.shift52, %insert.ext39
IC: Mod =   %trunc33 = trunc i64 %insert.insert41 to i32
    New =   %trunc33 = trunc i64 %insert.ext39 to i32

Thus store i32 %trunc33, ptr %d, align 8 becomes store i32 %sroa38, ptr %d, align 8.

I am not sure what I should do now; alive2 tells me that the transform seems correct: https://alive2.llvm.org/ce/z/NfVsmT
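One way to see why both outputs can be acceptable is to enumerate the cases that do not trigger undefined behavior. The sketch below is a simplified model in plain C++, not the actual test: the function name is hypothetical, X stands in for %insert.insert41 (narrowed to 8 bits as an assumption), and Y stands in for %conv19, which can only be 0 or 1.

```cpp
// Enumerates X in [0, 255] and Y in {0, 1}, skipping cases that would be
// undefined (Y == 0) or that would violate the assume (X <u Y).
// Returns the number of remaining cases, or -1 if "fold to 0" and
// "fold to X" ever disagree with the real quotient.
int countAgreeingCases() {
  int Reachable = 0;
  for (unsigned Y = 0; Y <= 1; ++Y) {
    if (Y == 0)
      continue; // udiv by zero is undefined
    for (unsigned X = 0; X <= 255; ++X) {
      if (!(X < Y))
        continue; // llvm.assume(X <u Y) would be violated
      unsigned Div = X / Y;
      if (Div != 0 || Div != X)
        return -1; // the two folds would give different results
      ++Reachable;
    }
  }
  return Reachable;
}
```

Under these assumptions the only remaining case is X = 0, Y = 1, where folding the udiv to 0 and folding it to X give the same value.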

In D149001#4290441, @floatshadow wrote:

Another question: can this method replace the original pattern matches such as match(Op1, m_One())? I am worried about the performance of computeKnownBits().

It's okay to replace in this case. Division instructions are very rare, so this is not particularly performance sensitive.

I am not sure what I should do now; alive2 tells me that the transform seems correct: https://alive2.llvm.org/ce/z/NfVsmT

You can just regenerate the test. It's a fuzzer generated test case, so we don't really care as long as it doesn't infinite loop / miscompile.

address comments

remove the old pattern matches for m_One() and m_ZExt()
regenerate the PR51762 test in Transforms/InstCombine/zext-or-icmp.ll

Harbormaster completed remote builds in B227579: Diff 516223. Apr 23 2023, 8:49 PM

This looks good to me, but it looks like you need to regenerate some clang OpenMP tests as well. Not sure why/where those run InstSimplify, but the failures look legitimate.