This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Process blocks in RPO
ClosedPublic

Authored by nikic on Feb 28 2020, 9:04 AM.

Details

Summary

InstComine currently processes blocks in some ill-defined depth-first order. This can break the usual invariant that the operands of an instruction should be simplified before the instruction itself, if uses across basic blocks (particularly inside phi nodes) are involved.

This patch switches the initial worklist population to use RPO instead, which will ensure that predecessors are visited before successors (back-edges notwithstanding).

This allows us to fold more cases within a single InstCombine iteration. The broader context here is that I want to limit InstCombine to a single iteration in the future (removing the current fix-point iteration), which will give a large (5% end-to-end) compile-time improvement without substantial optimization impact. However, this requires eliminating as many cases where we fail to reach the fix point in one iteration as possible, and the current worklist population order is one of the main issues.

In the meantime, this does cause a small compile-time regression of about 0.1% (http://llvm-compile-time-tracker.com/compare.php?from=725fcf40c3e55b2c03a1ed2326375984c0a8560f&to=8d0280338cd9409ec6fa4dbf86f4c2a9dfa57c60&stat=instructions:u), because calculating RPO is more expensive than one would think.

Diff Detail

Event Timeline

nikic created this revision.Feb 28 2020, 9:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2020, 9:04 AM

Thanks, useful!

Did you observe any (positive) compile time changes?

test/Transforms/InstCombine/store.ll
2

since -infinite-loop-threshold now serves as a max treshold value, should we rename it?

-instcombine-iterations-threshold?

nikic marked an inline comment as done.Mar 1 2020, 7:08 AM

Did you observe any (positive) compile time changes?

I don't expect visible improvement from (just) this change, because it affects few cases (less than 1% of our InstCombine tests). My hope is that we can remove fixpoint iteration from InstCombine entirely in the future, once all the known issues are resolved. I'm still not sure if that's realistic, but I'm getting pretty close now...

test/Transforms/InstCombine/store.ll
2

Yes, the "infinite loop" terminology doesn't make a lot of sense for this usage. We also have a -instcombine-max-iterations setting, so it's a bit tricky to distinguish these two limits (one just stops, the other reports a fatal error).

nikic updated this revision to Diff 253422.Mar 29 2020, 7:58 AM

Reduce cost by computing RPOT only once.

nikic added a comment.Mar 29 2020, 1:35 PM

Did you observe any (positive) compile time changes?

It turned out that computing the RPO order is more expensive than I expected... After optimizing a bit, this change still clocks in as a 0.15% regression.

This looks good to me.
@spatel?

This looks good to me.
@spatel?

I don't have a sense of the value/cost trade-off other than what is noted in this review, so I don't have anything useful to say here.
Let's add some other reviewers and see if anyone else has experience/ideas.

A depth-first search is enough to ensure some predecessor of every block is visited before that block. So the benefit of RPO is to change that to all (non-loop) predecessors, which I guess helps optimizations involving PHI nodes?

I'm surprised this shows up in the compile-time statistics that way. Given the cost of everything else instcombine does, an RPO traversal shouldn't rank very high. Maybe worth adding timers to check whether the time is actually in this function, or we end up doing more work due to the order of instructions in the worklist. (Maybe we should be using SmallVector in ReversePostOrderTraversal?)

lebedev.ri requested changes to this revision.Jul 1 2020, 7:03 AM
This revision now requires changes to proceed.Jul 1 2020, 7:03 AM
lebedev.ri resigned from this revision.Jan 12 2023, 4:46 PM

This review seems to be stuck/dead, consider abandoning if no longer relevant.

This revision now requires review to proceed.Jan 12 2023, 4:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 4:46 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
nikic planned changes to this revision.Jan 13 2023, 2:39 AM

This still seems like the right thing to do, but not pursing it right now.

nikic updated this revision to Diff 525628.May 25 2023, 8:20 AM
nikic edited the summary of this revision. (Show Details)
nikic edited reviewers, added: RKSimon, goldstein.w.n; removed: lebedev.ri, majnemer.

Rebase and put up for review again.

nikic updated this revision to Diff 533610.Jun 22 2023, 8:00 AM

Rebase due to test changes

aeubanks accepted this revision.Jul 28 2023, 10:44 AM

lgtm, assuming it's still necessary for D154579

This revision is now accepted and ready to land.Jul 28 2023, 10:44 AM
This revision was automatically updated to reflect the committed changes.