This is an archive of the discontinued LLVM Phabricator instance.

Throttling LICM to reduce compile time of Value Propagation and Reg Splitting
Needs Review · Public

Authored by wmi on Mar 10 2016, 3:23 PM.

Details

Summary

This is part 2 of the work to fix https://llvm.org/bugs/show_bug.cgi?id=10584

For the testcases in the bug, LICM creates many variables with long live ranges spanning many BBs. When CVP queries Lazy Value Information (LVI) for such a variable, LVI walks all the BBs the variable lives across; with many such long-lived variables, CVP becomes very slow. Register splitting has a similar problem: for a vreg with a very long live range, computing spill placement when splitting the vreg is expensive.
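As an illustration of the problem (a hypothetical source-level example, not taken from the bug's testcases), hoisting loop-invariant expressions turns per-iteration temporaries into values that stay live across the entire loop body, which is what makes the later per-variable backward walks costly:

```cpp
// Hypothetical example: after LICM, t0 and t1 are computed once in the
// preheader and are therefore live across every block of the loop, so a
// per-variable analysis walk (as LVI performs) touches the whole loop
// once per hoisted value.
int sum_scaled(const int *a, int n, int x, int y) {
    // Before LICM, x*y and x+y would be recomputed each iteration.
    // After LICM they become preheader temporaries with long live ranges:
    int t0 = x * y;   // hoisted; live across the whole loop
    int t1 = x + y;   // hoisted; live across the whole loop
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i] * t0 + t1;
    return s;
}
```

With a handful of such temporaries the transformation is clearly profitable; the pathological case in the bug is when there are very many of them in a very large loop.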

The patch throttles LICM to reduce the number of non-expensive instructions hoisted out of very large loops. For a small loop, hoisting values out of the loop does not increase cost much, so we keep the current logic and hoist as much as possible. For a large loop, expensive instructions like mul/div/rem/... are still hoisted unconditionally. For non-expensive instructions in a large loop, we use the register number to limit how many instructions are hoisted.
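A rough sketch of that policy (names, types, and the loop-size cutoff below are illustrative stand-ins, not the patch's actual code):

```cpp
#include <cstddef>

// Stand-in for llvm::Instruction; "expensive" models mul/div/rem etc.
struct Inst { bool expensive; };

// Hypothetical throttle: always hoist expensive instructions; in large
// loops, stop hoisting cheap ones once an assumed register budget is spent.
struct HoistThrottle {
    std::size_t loopBlocks;     // number of basic blocks in the loop
    std::size_t regBudget;      // assumed register budget for cheap hoists
    std::size_t cheapHoisted;   // cheap instructions hoisted so far

    static constexpr std::size_t LargeLoopBlocks = 32;  // illustrative cutoff

    HoistThrottle(std::size_t blocks, std::size_t budget)
        : loopBlocks(blocks), regBudget(budget), cheapHoisted(0) {}

    bool shouldHoist(const Inst &I) {
        if (I.expensive)
            return true;                        // always hoist expensive ops
        if (loopBlocks < LargeLoopBlocks)
            return true;                        // small loop: current logic
        if (cheapHoisted >= regBudget)
            return false;                       // budget exhausted: stop
        ++cheapHoisted;                         // charge one register
        return true;
    }
};
```

The key design point is the asymmetry: the cap applies only to cheap instructions in large loops, so redundant expensive computation is never reintroduced into the loop body.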

No unit test is added because I could not find a small testcase for it.

Tested the LLVM test-suite and Google internal benchmarks on x86_64-linux-gnu and found no performance regression.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi updated this revision to Diff 50363.Mar 10 2016, 3:23 PM
wmi retitled this revision from to Throttling LICM to reduce compile time of Value Propagation and Reg Splitting.
wmi updated this object.
wmi added reviewers: atrick, reames, hfinkel.
wmi set the repository for this revision to rL LLVM.
wmi added subscribers: llvm-commits, davidxl.
reames edited edge metadata.Mar 11 2016, 4:34 PM

I understand your thought process, but this approach is just not going to work. As a matter of policy, we canonicalize at the IR level and LICM is pretty much the classic definition of canonicalization. Restricting the aggressiveness of LICM to resolve a compile time problem elsewhere in the optimizer is fundamentally unacceptable.

I'm with Philip on this one.
If LVI is being asked about every variable, either:
A. We should make something that produces the same info but computes it
non-lazily (the evaluation order that LVI uses is non-optimal), or
B. We should stop asking about every variable.

LazyValueInfo is supposed to be for lazy queries. It is a backwards
solver. This is going to be the worst possible order to ask about things
in :)
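As a toy illustration of the ordering argument (just counting work, no actual LVI code): on a def-use chain v0 -> v1 -> ... -> v[n-1], answering each query by lazily walking back to the root does quadratic total work, while one eager forward pass over the same chain is linear.

```cpp
#include <cstddef>

// Lazy-style: each of the n queries re-walks its backward chain to the
// root. Returns the total number of visit steps taken.
std::size_t lazyQueryAll(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i)       // query v[i]...
        for (std::size_t j = i + 1; j > 0; --j)
            ++steps;                          // ...walks i+1 values back
    return steps;                             // n*(n+1)/2
}

// Eager-style: one forward pass computes every fact exactly once.
std::size_t eagerSolveAll(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i)
        ++steps;
    return steps;                             // n
}
```

Real LVI caches intermediate results, so its behavior is better than the worst case sketched here, but the point about query order stands: asking backward about everything is the unfriendly direction for a backwards solver.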

There are better orderings and better solving strategies that will produce
identical info, far faster.
However, most of these can't be stopped "in the middle" like LVI can.
If we are asking for every variable, we should do that.

Given that meet/join/etc is already abstracted out pretty well, writing
such a solver using SparseSolver or something should be pretty trivial.
My guess is a couple hundred lines of code at most.
I would do that instead, and use it in passes that are asking about every
variable.
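To make the suggestion concrete, here is a heavily simplified sketch of the kind of eager sparse solver meant above (this is not llvm::SparseSolver's API; it is a toy forward worklist solver over a three-level constant lattice, for a toy IR of constants and adds):

```cpp
#include <deque>
#include <map>
#include <vector>

enum class Kind { Undef, Constant, Overdef };
struct LatticeVal { Kind kind; int value; };   // zero-init => Undef

struct Node { std::vector<int> ops; bool isConst; int constVal; };

class SparseConstSolver {
    std::map<int, Node> nodes;
    std::map<int, std::vector<int>> users;     // def-use edges
    std::map<int, LatticeVal> state;
    std::deque<int> work;

public:
    void addConst(int id, int v) {
        nodes[id] = {{}, true, v};
        work.push_back(id);
    }
    void addAdd(int id, int a, int b) {        // id = a + b
        nodes[id] = {{a, b}, false, 0};
        users[a].push_back(id);
        users[b].push_back(id);
        work.push_back(id);
    }

    // Visit each value only when one of its operands changes: sparse
    // propagation, every fact computed eagerly and at most a few times.
    void solve() {
        while (!work.empty()) {
            int id = work.front(); work.pop_front();
            LatticeVal nv = evaluate(id);
            LatticeVal &old = state[id];
            if (nv.kind != old.kind || nv.value != old.value) {
                old = nv;
                for (int u : users[id]) work.push_back(u);
            }
        }
    }

    LatticeVal get(int id) { return state[id]; }

private:
    LatticeVal evaluate(int id) {
        const Node &n = nodes[id];
        if (n.isConst) return {Kind::Constant, n.constVal};
        LatticeVal a = state[n.ops[0]], b = state[n.ops[1]];
        if (a.kind == Kind::Overdef || b.kind == Kind::Overdef)
            return {Kind::Overdef, 0};
        if (a.kind == Kind::Undef || b.kind == Kind::Undef)
            return {Kind::Undef, 0};
        return {Kind::Constant, a.value + b.value};
    }
};
```

A real replacement would of course need the full meet/join over constant ranges and CFG-edge handling, but the worklist skeleton is the whole trick, which is why the "couple hundred lines" estimate is plausible.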

reames added a subscriber: reames.Mar 11 2016, 5:01 PM

On the topic of LVI vs. eager value info, I'm much less convinced than
Danny that jumping straight to that is a good idea. There's lots of
room to improve the implementation of LVI (e.g.
https://llvm.org/bugs/show_bug.cgi?id=26921). I'm not opposed to the
idea of an eager algorithm, but getting the eager code to handle all of
the cases LVI does is not as obvious as it might first seem. In
particular, LVI's handling of loops is surprisingly sophisticated and
non-obvious: it can do fairly involved inductive proofs of constant
ranges.

Philip

I have not read the patch in detail, but it seems the main purpose of the patch is to throttle LICM to reduce register pressure, and the compile-time benefit from reduced LVI queries is just a good side effect? Do we have any runtime performance data showing the usefulness of the throttling? (Whether this is the right approach to the spill problem is also open to discussion.)

David,

I very deeply believe that limiting hoisting in the IR to reduce
register pressure is fundamentally the wrong approach. The backend is
responsible for sinking if desired to reduce register pressure. We do
not model register pressure in the IR. At all. Period.
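(For contrast with the hoisted form discussed earlier, "sinking" is the inverse transformation: recomputing the value inside the loop shortens its live range at the cost of redundant work per iteration. A hypothetical source-level illustration, not code from any pass:)

```cpp
// Sunk form: x*y is recomputed every iteration, so no extra value is
// live across the loop. The backend can choose this form when register
// pressure makes the long live range more costly than the recomputation.
int sum_sunk(const int *a, int n, int x, int y) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i] * (x * y);   // recomputed per iteration
    return s;
}
```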

Philip

reames resigned from this revision.Feb 25 2020, 9:01 AM

Resigning from a stale review (2016). Feel free to re-add me if the thread is ever revived.

Herald added a project: Restricted Project. Feb 25 2020, 9:01 AM
Herald added a subscriber: asbirlea.