This is an archive of the discontinued LLVM Phabricator instance.

[Runtime Unrolling] use a loop to simplify the runtime unrolling prologue.
ClosedPublic

Authored by kevin.qin on Sep 2 2014, 2:55 AM.

Details

Reviewers
hfinkel
Summary

Runtime unrolling creates a prologue to execute the extra iterations that remain when the trip count is not divisible by the unroll factor. It currently generates an if-then-else sequence that jumps into a loop body unrolled (factor - 1) times, like

extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop:
if (extraiters == loopfactor) jump L1
if (extraiters == loopfactor-1) jump L2
...
L1:  LoopBody;
L2:  LoopBody;
...
if tripcount < loopfactor jump End
Loop:
...
End:

This means that with an unroll factor of 4, the loop body is copied 7 times: 3 copies in the loop prologue and 4 in the unrolled loop.
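
For illustration, the shape described above can be pictured at source level roughly as follows. This is only a sketch for an unroll factor of 4: the summing body, `loop_body`, and `sum_chain_prologue` are hypothetical names, and the `switch` with fall-through stands in for the if-then-else jump chain that the pass emits in IR.

```cpp
#include <cstddef>

// Hypothetical loop body used for illustration: accumulate one element.
static inline void loop_body(const int *a, std::size_t &i, long &sum) {
  sum += a[i];
  ++i;
}

// Sketch of the pre-patch shape for an unroll factor of 4 (an assumption
// made for this example; the real transformation works on LLVM IR).
long sum_chain_prologue(const int *a, std::size_t tripcount) {
  const std::size_t loopfactor = 4;
  long sum = 0;
  std::size_t i = 0;
  std::size_t extraiters = tripcount % loopfactor;

  // Prologue: loopfactor - 1 = 3 straight-line copies of the body.
  // Control enters at the copy that leaves exactly `extraiters` to run.
  switch (extraiters) {
  case 3:
    loop_body(a, i, sum);
    // fall through
  case 2:
    loop_body(a, i, sum);
    // fall through
  case 1:
    loop_body(a, i, sum);
    break;
  default:
    break;
  }

  // Unrolled loop: 4 more copies of the body per iteration.
  // The remaining trip count is a multiple of 4 at this point.
  while (i < tripcount) {
    loop_body(a, i, sum);
    loop_body(a, i, sum);
    loop_body(a, i, sum);
    loop_body(a, i, sum);
  }
  return sum;
}
```

Counting the calls to `loop_body` gives the 3 + 4 = 7 copies mentioned above.
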
This patch instead uses a loop to execute the extra iterations in the prologue, like

extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop:
else jump Prol
Prol:  LoopBody;
extraiters -= 1                 // Omitted if unroll factor is 2.
if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
if (tripcount < loopfactor) jump End
Loop:
...
End:

With an unroll factor of 4, the loop body is then copied only 5 times: once in the prologue loop and 4 times in the original loop. If the unroll factor is 2, no new loop is created, so the result is the same as with the original scheme.
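
As a rough source-level sketch of the new shape, again assuming an unroll factor of 4 and a hypothetical summing body (`sum_loop_prologue` and the two counting helpers are illustrative names, not part of the patch):

```cpp
#include <cstddef>

// Body copies under the original scheme: (factor - 1) in the prologue
// plus `factor` in the unrolled loop.
constexpr unsigned body_copies_before(unsigned factor) {
  return (factor - 1) + factor;
}

// Body copies with a prologue loop: one in the prologue plus `factor`
// in the unrolled loop.
constexpr unsigned body_copies_after(unsigned factor) {
  return 1 + factor;
}

// Sketch of the post-patch shape for an unroll factor of 4 (an assumption
// made for this example; the real transformation works on LLVM IR).
long sum_loop_prologue(const int *a, std::size_t tripcount) {
  const std::size_t loopfactor = 4;
  long sum = 0;
  std::size_t i = 0;
  std::size_t extraiters = tripcount % loopfactor;

  // Prol: a single copy of the body, repeated `extraiters` times.
  if (extraiters != 0) {
    do {
      sum += a[i];
      ++i;
      --extraiters;
    } while (extraiters != 0);
  }

  // Loop: 4 copies of the body per iteration. The remaining trip count
  // is a multiple of 4, so no bounds check is needed between copies.
  while (i < tripcount) {
    sum += a[i];
    sum += a[i + 1];
    sum += a[i + 2];
    sum += a[i + 3];
    i += 4;
  }
  return sum;
}
```

For a factor of 4 the helpers give 7 copies before and 5 after; for a factor of 2 both give 3, which matches the observation that nothing changes in that case.
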

On AArch64, with runtime unrolling enabled, code size drops by 10% after applying this patch.

The if-then-else sequence is also eliminated, which brings a very slight performance benefit of less than 0.1% on X86 and AArch64.

So overall, this patch substantially improves code size and does no harm to performance.

Is it OK to commit?

Thanks,
Kevin

Event Timeline

kevin.qin updated this revision to Diff 13156. Sep 2 2014, 2:55 AM
kevin.qin retitled this revision to [Runtime Unrolling] use a loop to simplify the runtime unrolling prologue.
kevin.qin updated this object.
kevin.qin edited the test plan for this revision.
kevin.qin added a subscriber: Unknown Object (MLST).

Do we need to add unrolling metadata to the prologue so that we don't re-unroll the prologue if we run the unrolling pass twice? Even though this does not happen in the current pipeline setup, it could potentially happen with a different setup (or with LTO, etc.).

kevin.qin updated this revision to Diff 13203. Sep 3 2014, 6:06 AM

Hi Hal,

This updated version adds unroll-disable metadata to the prologue loop. Thanks for your advice.

Cheers,
Kevin

kevin.qin updated this object. Sep 24 2014, 8:45 AM
hfinkel accepted this revision. Sep 26 2014, 11:29 AM
hfinkel added a reviewer: hfinkel.

LGTM, thanks!

This revision is now accepted and ready to land. Sep 26 2014, 11:29 AM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL218604.