This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative
ClosedPublic

Authored by etherzhhb on Mar 29 2017, 8:38 PM.

Details

Summary

Since there is no sdiv in SCEV, an 'udiv' is a better canonical form than an 'sdiv' as the user of induction variable

Diff Detail

Repository
rL LLVM

Event Timeline

etherzhhb created this revision.Mar 29 2017, 8:38 PM
etherzhhb added inline comments.
lib/Transforms/Utils/SimplifyIndVar.cpp
291–293

maybe I should also copy the 'exact' flag?

grosser edited edge metadata.Mar 29 2017, 11:12 PM

Hi Hongbin,

I am not sure what you are trying to achieve here. This kernel is already optimized by -instcombine to

define void @test(i32* %a) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %div = shl nsw i32 %i.01, 5
  %idxprom = sext i32 %div to i64
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
  store i32 %i.01, i32* %arrayidx, align 4
  %inc = add nsw i32 %i.01, 1
  %cmp = icmp slt i32 %i.01, 63
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

So the sdiv is replaced by a lsh, taking into account the information that sdiv is working on unsigned inputs only.

If you divide by a non-power-of-two, instcombine does not change the code and leaves the sdiv in. However, if you would like to move to an sdiv even in this case this should probably be added to instcombine.

I would also be interested to learn why an 'udiv' is a better canonical form than an 'sdiv'. Is it faster to execute (non-power-of-two) udivs on FPGAs than sdivs?

Hi Tobias

Hi Hongbin,

I am not sure what you are trying to achieve here. This kernel is already optimized by -instcombine to

This is based on the discussion in https://groups.google.com/d/topic/llvm-dev/HKK9ffG-dIU/discussion

define void @test(i32* %a) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %div = shl nsw i32 %i.01, 5
  %idxprom = sext i32 %div to i64
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
  store i32 %i.01, i32* %arrayidx, align 4
  %inc = add nsw i32 %i.01, 1
  %cmp = icmp slt i32 %i.01, 63
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

So the sdiv is replaced by a lsh, taking into account the information that sdiv is working on unsigned inputs only.

If you divide by a non-power-of-two, instcombine does not change the code and leaves the sdiv in. However, if you would like to move to an sdiv even in this case this should probably be added to instcombine.

Yes, see below.

I would also be interested to learn why an 'udiv' is a better canonical form than an 'sdiv'. Is it faster to execute (non-power-of-two) udivs on FPGAs than sdivs?

I am focusing on the induction variable and its users here. I am trying to get an AddRec from arrayidx.
However, there is no sdiv in SCEV, and SCEV will give up when it see a sdiv. From this perspective udiv is better then sdiv. I think I should explicitly explain this in the summary

Let me also look at instcombine, the original motivation to put it here is mainly because SimplifyIndVar use SCEV, which provide better range knowledge about induction variables.

etherzhhb edited the summary of this revision. (Show Details)Mar 29 2017, 11:35 PM

The relevant function is visitSDiv in InstCombineMulDivRem.cpp. Maybe we have indeed insufficient information in instcombine.

etherzhhb updated this revision to Diff 93451.Mar 30 2017, 12:51 AM

Add more testcases

etherzhhb marked an inline comment as done.Mar 30 2017, 12:56 AM
etherzhhb added inline comments.
test/Transforms/IndVarSimplify/replace-sdiv-by-udiv.ll
11

instcombine (which use computeKnownBits) is not able to prove the status of the signbit of %i.01 here

sanjoy requested changes to this revision.Mar 30 2017, 12:52 PM
sanjoy added a reviewer: sanjoy.

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal. However, given that udiv is more canonical, this patch is good too if it solves your problem.

lib/Transforms/Utils/SimplifyIndVar.cpp
38

s/eliminated/converted to unsigned division/

277

This seems unnecessary - why do you not care about optimizing

%val = sdiv <non-iv-known-positive>, <iv>
281

Use auto *.

286

I'd use N and D instead of S and X.

291

Use auto *.

299

Why not push_back?

This revision now requires changes to proceed.Mar 30 2017, 12:52 PM
etherzhhb updated this revision to Diff 93547.Mar 30 2017, 2:51 PM
etherzhhb edited edge metadata.

address sanjoy 's comment

etherzhhb marked 6 inline comments as done.Mar 30 2017, 2:51 PM
sanjoy accepted this revision.Mar 30 2017, 2:56 PM

lgtm with minor comments

lib/Transforms/Utils/SimplifyIndVar.cpp
39

s/NumElimSDiv/NumSimplifiedSDiv/

275

Comment is wrong.

I'd actually just drop the comment. It is obvious enough what the code is doing here.

This revision is now accepted and ready to land.Mar 30 2017, 2:56 PM

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal.

This also sounds reasonable and I can also look at that after this.
Just a quick question: are we suppose to use getSCEVAtScope in getSCEV?

However, given that udiv is more canonical, this patch is good too if it solves your problem.

etherzhhb updated this revision to Diff 93550.Mar 30 2017, 3:05 PM

fix the last few things before pushing the commit

etherzhhb marked 2 inline comments as done.Mar 30 2017, 3:06 PM
This revision was automatically updated to reflect the committed changes.

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal.

This also sounds reasonable and I can also look at that after this.
Just a quick question: are we suppose to use getSCEVAtScope in getSCEV?

Not in general. The SCEV expressions you return from getSCEV need to be context insensitive.

However, given that udiv is more canonical, this patch is good too if it solves your problem.