This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Lower infinite combine loop detection thresholds
ClosedPublic

Authored by lebedev.ri on Jul 4 2020, 9:32 AM.

Details

Summary

1000 iteratons is still kinda a lot.
Would it make sense to iteratively lower it, until it becomes 2,
with some delay inbetween in order to let users actually potentially encounter it?

Diff Detail

Event Timeline

lebedev.ri created this revision.Jul 4 2020, 9:32 AM
kuhar added a comment.Jul 4 2020, 10:46 AM

Overall, I think this is a good direction, but I'd like to understand better why InstCombine needs more than a few iterations and how 'fix' these cases without bailing out after a fixed number of iterations.
I'd be interested to find out if we can make an IR pattern generator that forces InstCombine to run N iterations. Are there any algorithmic guarantees of the current implementation which we could use to show that InstCombine doesn't go into an infinite loop?

Overall, I think this is a good direction, but I'd like to understand better why InstCombine needs more than a few iterations and how 'fix' these cases without bailing out after a fixed number of iterations.
I'd be interested to find out if we can make an IR pattern generator that forces InstCombine to run N iterations. Are there any algorithmic guarantees of the current implementation which we could use to show that InstCombine doesn't go into an infinite loop?

@nikic can say more, but the general idea is that instcombine traditionally didn't pay much attention
to adding instructions-to-be-revisited back into worklist after changing something,
so we wouldn't revisit some instructions until we do the next iteration with worklist containing all function's instructions.

The main fix is to ensure that we consistently replenish worklist.
The fix is NOT to bailout after a number of iterations.

nikic added a comment.Jul 5 2020, 10:29 AM

Overall, I think this is a good direction, but I'd like to understand better why InstCombine needs more than a few iterations and how 'fix' these cases without bailing out after a fixed number of iterations.
I'd be interested to find out if we can make an IR pattern generator that forces InstCombine to run N iterations. Are there any algorithmic guarantees of the current implementation which we could use to show that InstCombine doesn't go into an infinite loop?

@nikic can say more, but the general idea is that instcombine traditionally didn't pay much attention
to adding instructions-to-be-revisited back into worklist after changing something,
so we wouldn't revisit some instructions until we do the next iteration with worklist containing all function's instructions.

The main fix is to ensure that we consistently replenish worklist.
The fix is NOT to bailout after a number of iterations.

Right. I think the three most common sources of additional InstCombine iterations are:

  • Dead instructions are not added to the worklist.
  • Changed/dependent instructions are not added to the worklist.
  • Instruction scans are performed forward instead of backward.

I spent some time a few months ago eliminating such issues and IIRC got most/all InstCombine tests to run in 3 iterations. I haven't found any cases where InstCombine genuinely needed many iterations and it was not just an issue of worklist management.

Incrementally dropping the limit sounds like a good idea to me and makes it more likely that we'll become aware of issues (and probably also more likely that fuzzing can find them). As suggested inline, I would limit this to assertion-enabled builds, to avoid impacting end users hitting pathological cases.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
133

Not really important for 100, which is still a very conservative value, but going forward I would split this up according to NDEBUG:

static constexpr unsigned InstCombineDefaultMaxIterations = 1000;
#ifndef NDEBUG
static constexpr unsigned InstCombineDefaultInfiniteLoopThreshold = 100;
#else
static constexpr unsigned InstCombineDefaultInfiniteLoopThreshold = 1000;
#endif

In particular, I think we want to avoid making the end user compiler crash if they hit cases where instcombine is very slow, but still converges.

lebedev.ri updated this revision to Diff 275580.Jul 5 2020, 3:27 PM
lebedev.ri marked an inline comment as done.

Addressing review notes.

nikic accepted this revision.Jul 6 2020, 12:22 AM

LGTM

This revision is now accepted and ready to land.Jul 6 2020, 12:22 AM
This revision was automatically updated to reflect the committed changes.