Page MenuHomePhabricator

[SCCP] Propagate integer range info for parameters in IPSCCP.
ClosedPublic

Authored by fhahn on Aug 13 2017, 3:01 PM.

Details

Summary

This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

source_filename = "sccp.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: norecurse nounwind readnone uwtable
define i32 @main() local_unnamed_addr #0 {
entry:
  %call = tail call fastcc i32 @f(i32 1)
  %call1 = tail call fastcc i32 @f(i32 47)
  %add3 = add nsw i32 %call, %call1
  ret i32 %add3
}

; Function Attrs: noinline norecurse nounwind readnone uwtable
define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
entry:
  %c1 = icmp sle i32 %x, 100

  %cmp = icmp sgt i32 %x, 300
  %. = select i1 %cmp, i32 1, i32 2
  ret i32 %.
}

attributes #1 = { noinline }

Diff Detail

Event Timeline

fhahn created this revision.Aug 13 2017, 3:01 PM
sanjoy added inline comments.Aug 13 2017, 3:40 PM
lib/Transforms/Scalar/SCCP.cpp
105

Use llvm::ConstantRange here.

1867

Design-wise I personally think it would be better if you instead had a second tryToReplaceWithConstantRange helper that propagated the range information and folded ICmps etc. wherever possible; instead of creating assumes.

But let's see what other folks think.

mcrosier edited the summary of this revision. (Show Details)Aug 14 2017, 9:57 AM

Thanks for working on this!

I think, in practice, what we should do is some combination of assumes and IPSCCP doing what it can.
I don't think we should throw away IPA info because the more local optimizations may still be able to do something better with it.
(IE i don't think IPSCCP will get everything, and so i feel we should expose what it discovers some way that is not ephemeral)

efriedma edited edge metadata.Aug 14 2017, 12:32 PM

Should we use range/nonnull metadata or assumptions to generate ranges?

Should IPSCCP itself try to fold operations based on ranges?

Inserting a bunch of "llvm.assume" calls is probably not a good idea; it's likely to be a big performance sink, and block other optimizations. (Not sure what the right answer looks like here, though; it's a hard problem.)

lib/Transforms/Scalar/SCCP.cpp
143

cast<>, not dyn_cast<>

1643

?

Should we use range/nonnull metadata or assumptions to generate ranges?

If we have such metadata, yes

Should IPSCCP itself try to fold operations based on ranges?

This is a definite yes for me, regardless of anything else. Even if it only does it to cut down on metadata generated.
I don't think it should go crazy trying to do so (IE Put another way, this pass is currently linear. We should not make it N^2 by trying to fold range operations)

Inserting a bunch of "llvm.assume" calls is probably not a good idea; it's likely to be a big performance sink, and block other optimizations. (Not sure what the right answer looks like here, though; it's a hard problem.)

FWIW: I'm curious what performance problems you see. It would seem bad to have an intrinsic that we can't use a lot :)
I also thought Hal had taken care of most performance issues with assume?

There have been improvements, but we still have a lot of transforms that aren't assume-aware; essentially every call to hasOneUse() does the wrong thing, etc.

And this transform in particular is creating up to four new IR instructions per existing instruction in the IR, which is going to be a general drain on performance for anything examining the IR; we need some sort of rule for figuring out what's worth annotating.

Should we use range/nonnull metadata or assumptions to generate ranges?

Should IPSCCP itself try to fold operations based on ranges?

Inserting a bunch of "llvm.assume" calls is probably not a good idea; it's likely to be a big performance sink, and block other optimizations. (Not sure what the right answer looks like here, though; it's a hard problem.)

An alternative would be to just attach range metadata to arguments/calls, and then rely on LVI for the intraprocedural work.

davide edited edge metadata.Aug 16 2017, 12:36 AM

Thanks for your work!
FWIW, I agree with Eli/Sanjoy that we shouldn't rely on assumptions.
Another comment that I have is: instead of using a single lattice for constants and ranges [and *], have you considered looking at something akin to what GCC does? https://github.com/gcc-mirror/gcc/blob/master/gcc/ipa-cp.c#L307
They have different lattices for constants/aggregates/bits/ranges. I find their code easier to read/reason about than LLVM's.

fhahn added a comment.Aug 16 2017, 9:01 AM

Thanks for all the feedback! I'll update the patch soon.

Thanks for your work!
FWIW, I agree with Eli/Sanjoy that we shouldn't rely on assumptions.
Another comment that I have is: instead of using a single lattice for constants and ranges [and *], have you considered looking at something akin to what GCC does? https://github.com/gcc-mirror/gcc/blob/master/gcc/ipa-cp.c#L307
They have different lattices for constants/aggregates/bits/ranges. I find their code easier to read/reason about than LLVM's.

Thanks for pointing me to that. I'll look into adding a separate range lattice.

re using assume: It shouldn't be to hard to use the range lattice to eliminate compare instructions in the IPSCCP pass.

gberry added a subscriber: gberry.Aug 18 2017, 8:48 AM
fhahn updated this revision to Diff 111895.Aug 20 2017, 11:16 AM

Updated the patch to use LVILatticeValue (which I moved to a separate header file for now, I'd move it out in a separate patch if we decide to use it) for tracking function parameter values and updated tryToReplaceWithConstant to use range information to removed ICmp instructions.

I think it makes sense to use LVILatticeValue for parameters, as it already contains most of the functionality needed

This patch only uses constant ranges for function parameters. This is good enough to fix PR33253, but I think using constant range for all values in the solver would make it more useful for real programs.

fhahn updated this revision to Diff 112577.Aug 24 2017, 10:32 AM

Update the patch to support all predicates through ConstantRange functions and use a SmallPtrSet to keep track of ICmp instructions to delete (deleting them straight away would invalidate iterators). It would be great if you could have another look now. If that approach is sensible I'll add another review that moves LVILatticeVal to a separate module properly.

fhahn added a comment.Sep 4 2017, 10:42 AM

ping, any thoughts?

I think this is on the right direction (for me), but let's see what other think.
If we agree, it will probably make sense revamping the general semilattice engine to plug arbitrary lattices in (and move the constant semilattice out).
This will enable future improvements (i.e. support for IPA bitwise CP, see, e.g. https://patchwork.ozlabs.org/patch/657644/)

include/llvm/Analysis/ValueLattice.h
123–125 ↗(On Diff #112577)

Are you expecting the resolver to plug the correct value here?

lib/Transforms/Scalar/SCCP.cpp
1636

I think it's better to split this in two APIs (tryToReplaceWithConstant & tryToReplaceWithConstantRange)

fhahn updated this revision to Diff 114252.Sep 7 2017, 1:48 PM
fhahn marked an inline comment as done.

Added D37591 to move LVILatticeVal to separate file and moved code to replace ICmp instructions to separate function

fhahn added inline comments.
include/llvm/Analysis/ValueLattice.h
123–125 ↗(On Diff #112577)

I am not sure what you mean here. Are you referring to the assert? The caller is responsible to make sure that markConstant is not called with different constant values (e.g. in mergeIn).

lib/Transforms/Scalar/SCCP.cpp
1636

Yes and it makes the code quite a bit simpler too

fhahn added a comment.Sep 26 2017, 6:55 AM

Ping. The commit moving the LVI lattice out to a separate file is ready to go in ( D37591 ). I would go ahead and commit the change soon, if you think it's a good idea to use it in SCCP for range tracking.

dberlin edited edge metadata.Sep 26 2017, 3:35 PM

FWIW: I think this is a good approach to start with (and is small enough that if we decide to go another way, it doesn't hurt anything).

I think the same.

fhahn updated this revision to Diff 116985.Sep 28 2017, 7:21 AM
fhahn retitled this revision from [SCCP] Propagate integer range information in IPSCCP (WIP). to [SCCP] Propagate integer range information in IPSCCP..
fhahn edited the summary of this revision. (Show Details)

Rebased after D37591 was committed

fhahn added a comment.Oct 7 2017, 2:04 PM

ping. Seems like there is agreement that this is a good start. If that is still the case, it would be great if someone could formally approve the patch :)

dberlin added inline comments.Oct 9 2017, 7:25 AM
test/Transforms/SCCP/ip-constan-ranges.ll
90

Is there a reason not to track ranges for variables who uses appear as arguments?
IE start with each function call, and for the arguments, mark them so they get a bigger lattice.
It can be done as a followup, i'm just trying to understand.

fhahn added inline comments.Oct 9 2017, 10:23 AM
test/Transforms/SCCP/ip-constan-ranges.ll
90

I thought it would be better to be conservative to start with and using ValueLatticeElement as general lattice value type seemed to slightly too heavy handed. I was thinking looking into some of the suggestions made in https://bugs.llvm.org/show_bug.cgi?id=26921 to reduce the memory consumption of ValueLatticeElement.

What do you think?

dberlin accepted this revision.Oct 9 2017, 11:10 AM

Sounds reasonable to start with.

This revision is now accepted and ready to land.Oct 9 2017, 11:10 AM
davide accepted this revision.Oct 9 2017, 11:24 AM

Minors, otherwise, looks good. Feel free to submit once you addressed them without another round of review.

lib/Transforms/Scalar/SCCP.cpp
31

Unsorted.

1611

If would be nicer if you broke earlier to reduced the indentation, but, I mean, I/you can probably clean this up as a follow up.

1617

unformatted.

1658

can you please clang format?

fhahn updated this revision to Diff 118268.Oct 9 2017, 2:03 PM

Fixed formatting and renamed toLVI to toValueLattice. Thanks for the reviews, I will commit it tomorrow morning, unless there are any further objections.

fhahn updated this revision to Diff 118272.Oct 9 2017, 2:22 PM

another small update. Use early exit and continue in tryToReplaceWithConstantRange as @davide suggested

fhahn retitled this revision from [SCCP] Propagate integer range information in IPSCCP. to [SCCP] Propagate integer range info for parameters in IPSCCP..Oct 10 2017, 2:32 AM
fhahn closed this revision.Oct 10 2017, 2:32 AM

There was a problem with this patch in combination with LTO and it has been reverted. I will investigate soon.

I suspect at least one of your bugs is that the param state lattice doesn't get updated when the value state lattice changes.

lib/Transforms/Scalar/SCCP.cpp
329

Use find so you aren't doing this lookup twice.
Or even better, please use insert to make space if it doesn't exist, it won't screw up your assert.

333

This seems wrong.
What happens if the value state lattice value changes?
This will never update it.

336

Then you can just return the result of the insert here.

471

Ditto above on how to do this without repeated lookups

472

Ditto above, if the value changes you will never update it from the value state.

1609

You already assert it's an integer type above?
Choose a single place to do it ;)

Thanks for having another look and for your comments. I'll address them soon. I am not sure about ValueState and ParamState entries diverging (I responded with details inline), but I am probably missing something.

lib/Transforms/Scalar/SCCP.cpp
333

My understanding was that this function only gets called after the solver finished, meaning the value in ValueState should not change.

472

Hm, this function is only used to get the state for function parameters. I thought the only place where we change the state for function parameters is at line 1194 and there is no other place where the ValueState entries of function parameters could be changed. At least that was what I intended :)

fhahn updated this revision to Diff 120592.Oct 27 2017, 7:25 AM
fhahn added a subscriber: bruno.

I finally managed to track down the crash with the clang-stage2-configure-Rlto bot, thanks to @bruno. The problem was that tryToReplaceWithConstantRange iterates over all the users of a value, and calls Solver.getLatticeValueFor for the operands of ICmp instructions (which asserts the operand has been added to the lattice). For Icmp instructions in unreachable blocks, this is not the case.

I've updated tryToReplaceWithConstantRange to only consider Icmp instructions in executable blocks. This fixes the crash. I also updated the code to use insert, as suggested by @dberlin. I think with the crash addressed, the patch should be ready for re-committing, unless there are any remaining objections.