This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] sdiv a (1 srem b) --> a
ClosedPublic

Authored by floatshadow on Apr 22 2023, 11:14 AM.

Details

Summary

Try to fix issue https://github.com/llvm/llvm-project/issues/62163

as (1 srem X) is not equivalent with zext (X != 1), pattern sdiv a (1 srem b) miss the optimization.

Diff Detail

Event Timeline

floatshadow created this revision.Apr 22 2023, 11:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2023, 11:14 AM
floatshadow requested review of this revision.Apr 22 2023, 11:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2023, 11:14 AM
nikic requested changes to this revision.Apr 22 2023, 11:40 AM

This is missing a lot of conjugate patterns. You can do the same for udiv and there are related patterns for urem and srem as well. You want to extend this code: https://github.com/llvm/llvm-project/blob/1eb74f7e83ffb3f9d00e5987cead3b12e00bbe82/llvm/lib/Analysis/InstructionSimplify.cpp#L1053-L1061

This revision now requires changes to proceed.Apr 22 2023, 11:40 AM
floatshadow updated this revision to Diff 516108.EditedApr 22 2023, 6:58 PM
floatshadow edited the summary of this revision. (Show Details)
floatshadow removed rG LLVM Github Monorepo as the repository for this revision.

now this patch will figure out other conjugate patterns like [u]sdiv X (1 [u]srem Y) --> X and [u]srem X (1 [u]srem Y) --> 0

junaire added inline comments.
llvm/test/Transforms/InstSimplify/div.ll
437

split the tests into a separate revision so we can see the diff more clearly. (You can add a parent revision

floatshadow edited the summary of this revision. (Show Details)Apr 22 2023, 8:53 PM
floatshadow edited the summary of this revision. (Show Details)

clean diff following what junaire proposes

Depends on D149012

nikic requested changes to this revision.Apr 23 2023, 12:52 AM

Thanks! This will still a miss other patterns that can only be zero or one. For example a mask with 1: https://alive2.llvm.org/ce/z/A_ffYe

You can handle these by calling computeKnownBits() and checking the number of leading zeros.

This revision now requires changes to proceed.Apr 23 2023, 12:52 AM

use computeKnownBits() to check if the divisor can only be zero or one.

another question is can this method replace the original pattern match like match(Op1, m_One())? I worry about the performance of computeKnownBits().

floatshadow added a comment.EditedApr 23 2023, 7:56 AM

@nikic I checked PR51762 test in Transforms/InstCombine/zext-or-icmp.ll

%lor.ext = zext i1 %spec.select57 to i32
%t2 = load i32, ptr %d, align 4
%conv15 = sext i16 %t1 to i32
%cmp16 = icmp sge i32 %t2, %conv15
%conv17 = zext i1 %cmp16 to i32
%t3 = load i32, ptr %f, align 4
%add = add nsw i32 %t3, %conv17
store i32 %add, ptr %f, align 4
%rem18 = srem i32 %lor.ext, %add
%conv19 = zext i32 %rem18 to i64

%div = udiv i64 %insert.insert41, %conv19
%trunc33 = trunc i64 %div to i32
store i32 %trunc33, ptr %d, align 8
%r = icmp ult i64 %insert.insert41, %conv19
call void @llvm.assume(i1 %r)

as %cond19 = zext (srem (zext i1) %add), %cond can only be zero or one
the main branch looks at llvm.assume, and deduce %insert.insert41 = 0, thus store i32 %trunc33, ptr %d, align 8 becomes store i32 0, ptr %d, align 8
I do some trace which shows the call stack: simplifyUdiv --> isDivZero -> simplifyICmpWithDominatingAssume

as for current patch, it seems udiv will first enter simplifyDivRem which do the fold, replace %udiv with %insert.insert41 (done by computeKnownBits).
later trunc replace %insert.insert41 with %insert.insert39: opt debug logs

ADD DEFERRED:   %insert.insert41 = or i64 %insert.shift52, %insert.ext39
IC: Mod =   %trunc33 = trunc i64 %insert.insert41 to i32
    New =   %trunc33 = trunc i64 %insert.ext39 to i32

thus store i32 %trunc33, ptr %d, align 8 becomes store i32 %sroa38, ptr %d, align 8

I am not sure what I should do now, alive2 tells me that the transforms seems correct https://alive2.llvm.org/ce/z/NfVsmT

nikic added a comment.Apr 23 2023, 8:43 AM

another question is can this method replace the original pattern match like match(Op1, m_One())? I worry about the performance of computeKnownBits().

It's okay to replace in this case. Division instructions are very rare, so this is not particularly performance sensitive.

I am not sure what I should do now, alive2 tells me that the transforms seems correct https://alive2.llvm.org/ce/z/NfVsmT

You can just regenerate the test. It's a fuzzer generated test case, so we don't really care as long as it doesn't infinite loop / miscompile.

address comments

  1. remove old pattern matching m_One() and m_ZExt()
  2. regenerate test PR51762 test in Transforms/InstCombine/zext-or-icmp.ll
nikic requested changes to this revision.Apr 24 2023, 2:20 AM

This looks good to me, but it looks like you need to regenerate some clang OpenMP tests as well. Not sure why/where those run InstSimplify, but the failures look legitimate.

llvm/lib/Analysis/InstructionSimplify.cpp
1059

I'd feel better about a == here.

This revision now requires changes to proceed.Apr 24 2023, 2:20 AM

regenerate failed clang OpenMP tests.

nikic accepted this revision.Apr 24 2023, 5:16 AM

LGTM. If you don't have commit access, can you please share the Name <email> to use for attribution?

This revision is now accepted and ready to land.Apr 24 2023, 5:16 AM
floatshadow added inline comments.Apr 24 2023, 5:17 AM
llvm/lib/Analysis/InstructionSimplify.cpp
1059

== is better. Thanks for the suggestion.

LGTM. If you don't have commit access, can you please share the Name <email> to use for attribution?

This my first revision LOL.
Siyuan Zhu, timeorange7071@outlook.com

This revision was landed with ongoing or failed builds.Apr 24 2023, 5:37 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2023, 5:37 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript