This is an archive of the discontinued LLVM Phabricator instance.

Remove 4,096 loop scale limitation.
ClosedPublic

Authored by dnovillo on Mar 31 2015, 5:07 AM.

Details

Summary

This is part 1 of fixes to address the problems described in
https://llvm.org/bugs/show_bug.cgi?id=22719.

The restriction to limit loop scales to 4,096 does not really prevent
overflows anymore, as the underlying algorithm has changed and does
not seem to suffer from this problem.

Additionally, artificially restricting loop scales to such a low number
skews frequency information, making loops of equal hotness appear to
have very different hotness properties.

The only loops that are artificially restricted to a scale of 4096 are
infinite loops (those loops with an exit mass of 0). This prevents
infinite loops from skewing the frequencies of other regions in the CFG.

At the end of propagation, frequencies are scaled to values that take no
more than 32 bits to represent. This is to prevent issues with
transformations that cannot deal with large numbers.

Tested on x86_64.

Diff Detail

Repository
rL LLVM

Event Timeline

dnovillo updated this revision to Diff 22939.Mar 31 2015, 5:07 AM
dnovillo retitled this revision from to Remove 4,096 loop scale limitation..
dnovillo updated this object.
dnovillo edited the test plan for this revision. (Show Details)
dnovillo added a reviewer: dexonsmith.
dnovillo added a subscriber: Unknown Object (MLST).

Generally looks good, seems like a nice practical step. Should let Duncan check the math before committing.

lib/Analysis/BlockFrequencyInfoImpl.cpp
339–348 ↗(On Diff #22939)

Mostly to check that I'm understanding this correctly, this will still result in really bizarre behavior where in some cases a non-infinite outer loop will be scaled higher than a finite inner loop.

Anyways, I would make this a FIXME as this may end up mattering more than I would like in practice.

430–431 ↗(On Diff #22939)

I would probably keep a FIXME here that we should be able to use the full 64-bits... Limiting this to 32-bits seems harmless but wasteful.

test/Analysis/BlockFrequencyInfo/bad_input.ll
35 ↗(On Diff #22939)

"scaled to max out at" -- missing 't'.

This revision was automatically updated to reflect the committed changes.