Page MenuHomePhabricator

[InstCombine] Allow to limit the max number of iterations
ClosedPublic

Authored by kuhar on Dec 6 2019, 12:41 PM.

Details

Summary

This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions.

InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites.

Bounding the number of iterations can have 2 benefits:

  • In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything.
  • When the wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can be also useful for implementing a lightweight pass pipeline (think -O1).

This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in.

Diff Detail

Event Timeline

kuhar created this revision.Dec 6 2019, 12:41 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: hiraditya. · View Herald Transcript

I like this idea. In addition to the stated uses, it can be used to improve debugging of what used to be infinite loops (see D71164 as a current example).
I think you're right that this doesn't change current behavior because:

++Iteration;
if (Iteration > MaxIterations) {

...will silently overflow because we initialized the max to UINT_MAX. So an infinite loop will still be an infinite loop. But what if we instead initialize that to some lower constant (1000?) and then assert that we haven't hit LimitMaxIterations?

llvm/test/Transforms/InstCombine/limit-max-iterations.ll
1

This looks like you used utils/update_test_checks.py, but then made some manual edits? Was the script output not good in some way? If not, I prefer to leave the auto-generated output as-is for easier updating in the future.

xbolva00 added a comment.EditedDec 9 2019, 7:17 AM

I like this technical solution more than "AggresiveInstCombine" concept (-O3 only).

Like you suggested, with -O2/-O1 we should use smaller number of iterations.

I just want to note that in general this doesn't limit the amount of folding,
we still process the entire worklist; we have to re-process entire worklist
because we are clearly inconsistent with how we fill said worklist.

kuhar updated this revision to Diff 232885.Dec 9 2019, 9:52 AM

Thanks for the suggestions.

I changed the default bound to 1000 and regenerated the filecheck directives with the update_test_checks script.

kuhar marked an inline comment as done.Dec 9 2019, 9:53 AM
lebedev.ri added inline comments.Dec 9 2019, 10:30 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

I think it should be

static constexpr unsigned InstCombineDefaultMaxIterations = std::numeric_limits<unsigned>::max - 1;

if the goal is to catch infinite loops while avoiding unintentional capping of iteration limit.

kuhar marked an inline comment as done.Dec 9 2019, 10:38 AM
kuhar added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

UINT_MAX is about 4*10^9. I don't think we will ever reach that number of iterations on any reasonable-sized input bitcode.

lebedev.ri added inline comments.Dec 9 2019, 10:46 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

Yes, that is kind-of the point, i'm not confident we won't reach 1000 in normal situations.

kuhar marked 2 inline comments as done.EditedDec 10 2019, 6:23 AM

I ran some experiments to see how many iterations of InstCombine are necessary on real-world C++ applications. I paste the results below.
The numbers were collected by running InstCombine in a Full-LTO-like scenario: first I compiled all the code with -O3 with *optimizations disabled*, then I linked it together (using gllvm), and gave it to opt with -O3.

NameBitcode size [MB]Max IterationsTotal InstCombine ExecutionsTotal IterationsAverage Iterations Per Execution
sqlite3.bc3512685179071.41
wasm-dis.bc2542950023613021.22
wasm-metadce.bc2542997573671171.22
wasm-reduce.bc2542986673658021.22
wasm-as.bc2542960613625981.22
wasm-emscripten-finalize.bc2542966703633281.22
wasm-ctor-eval.bc2542980523651951.23
wasm-shell.bc2542997013673291.23
wasm-opt.bc2643109913815251.23
wasm2js.bc2643104833804721.23
asm2wasm.bc2643101533801591.23
llvm-dis.bc2062259572906321.29
llvm-as.bc4484787346199401.29
rippled.bc42720113024214441801.28
opt.bc1848164066421473811.31
lld.bc2108190223624793781.30
clang-9.bc3978350530845030681.28

The maximum number of iterations required was 20, while check-llvm seems to require 4. This is in-line with what I see in the GPU code I'm working on.
Based on this data, I think that 1000 should be a safe limit that can actually detect infinite loops in finite time.
@lebedev.ri

That being said, it'd be interesting to hand-craft an IR generator that forces instcombine to run arbitrary many iteration before reaching a fixpoint. Do you have an idea for a code pattern that will force such a behavior?

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

See my full answer below.

spatel added inline comments.Dec 10 2019, 7:52 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

If we want to be safer, we could set the limit to UINT_MAX in this patch, then make a follow-up that just changes that single constant value.

Based on the data so far, either way seems ok to me.

kuhar updated this revision to Diff 233105.Dec 10 2019, 7:56 AM
kuhar marked an inline comment as done.

Change the default iteration limit to UINT_MAX - 1.

kuhar marked 3 inline comments as done.Dec 10 2019, 7:57 AM
kuhar added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
125

Sounds good to me. Making the bound tighter can be a follow-up patch.

kuhar marked an inline comment as done.Dec 10 2019, 8:40 AM
lebedev.ri added inline comments.Dec 10 2019, 8:46 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3577–3578

If i'm reading this right this will not allow passing instcombine-max-iterations bigger than the default
(i.e. it will assert when the default default is reached)

kuhar marked an inline comment as done.Dec 10 2019, 9:39 AM
kuhar added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3577–3578

Yes, that's the intention. Do you suggest an alternative behavior here?

kuhar marked an inline comment as done.Dec 10 2019, 2:01 PM
kuhar added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3674

Forgot to update the constant here. Will fix in the next revision.

LGTM. The table you provided was very interesting to see, thanks!

In my experience extra iterations are often caused by bad worklist management when updating nodes sometimes. For example https://reviews.llvm.org/D50990 was supposed to fix one case of that, but I never got around to finding the original test where I observed the issue. If that patch still applies does it have any effect on any of the extra iterations in the table?

kuhar added a comment.EditedDec 11 2019, 1:17 PM

In my experience extra iterations are often caused by bad worklist management when updating nodes sometimes. For example https://reviews.llvm.org/D50990 was supposed to fix one case of that, but I never got around to finding the original test where I observed the issue. If that patch still applies does it have any effect on any of the extra iterations in the table?

Sure, below is a new table generated with your patch applied. I can only see a very small difference.

NameBitcode size [MB]Max IterationsTotal InstCombine ExecutionsTotal IterationsAverage Iterations Per Execution
sqlite3.bc3512685178891.41
wasm-dis.bc2542950023612941.22
wasm-shell.bc2542997013673211.23
wasm-emscripten-finalize.bc2542966703633201.22
wasm-as.bc2542960613625901.22
wasm-reduce.bc2542986673657941.22
wasm-ctor-eval.bc2542980523651871.23
wasm-metadce.bc2542997573671091.22
wasm-opt.bc2643109913815171.23
wasm2js.bc2643104833804641.23
llvm-dis.bc2062259572905501.29
asm2wasm.bc2643101533801511.23
llvm-as.bc4484787346197851.29
rippled.bc42720113024214439421.28
opt.bc1848164066421468321.31
lld.bc2108190223624787721.3
clang-9.bc3978350530845021591.28
kuhar added a comment.Dec 17 2019, 8:04 AM

@lebedev.ri @spatel @craig.topper
Do you have any other feedback or questions/concerns about the patch?

lebedev.ri resigned from this revision.Dec 17 2019, 10:02 AM

@lebedev.ri @spatel @craig.topper
Do you have any other feedback or questions/concerns about the patch?

I'm not sure what problem this solves, and how successfully, so i can't review this.

spatel added inline comments.Dec 17 2019, 10:26 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3674

This is marked as 'Done' but UINT_MAX should be changed to InstCombineDefaultMaxIterations?

kuhar marked an inline comment as done and an inline comment as not done.Dec 17 2019, 10:35 AM
kuhar added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
3674

Yes. I have it fixed locally like you mentioned.

spatel accepted this revision.Dec 17 2019, 10:40 AM

LGTM

This revision is now accepted and ready to land.Dec 17 2019, 10:40 AM

I think it may be worthwhile to separate the limits for "stop after N iterations" and "assert after N iterations". In particular what I have in mind is that we can use -instcombine-max-iterations=2 in tests to ensure that correct worklist management allows everything to be folded in a single iteration (with the second establishing the fixed point). However, for this purpose -instcombine-max-iterations would actually need to assert. Possibly there's room for two separate options?

I think it may be worthwhile to separate the limits for "stop after N iterations" and "assert after N iterations". In particular what I have in mind is that we can use -instcombine-max-iterations=2 in tests to ensure that correct worklist management allows everything to be folded in a single iteration (with the second establishing the fixed point). However, for this purpose -instcombine-max-iterations would actually need to assert. Possibly there's room for two separate options?

This sounds reasonable, but I think it would be better to consider this, together with making the default max lower, as a separate patch. What do you think?

I think it may be worthwhile to separate the limits for "stop after N iterations" and "assert after N iterations". In particular what I have in mind is that we can use -instcombine-max-iterations=2 in tests to ensure that correct worklist management allows everything to be folded in a single iteration (with the second establishing the fixed point). However, for this purpose -instcombine-max-iterations would actually need to assert. Possibly there's room for two separate options?

This sounds reasonable, but I think it would be better to consider this, together with making the default max lower, as a separate patch. What do you think?

Yeah, sounds good.

kuhar updated this revision to Diff 234580.Dec 18 2019, 10:45 AM

Replace UINT_MAX with InstCombineDefaultMaxIterations

kuhar marked 2 inline comments as done.Dec 18 2019, 10:46 AM
This revision was automatically updated to reflect the committed changes.