This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Unroll and separate the remaining parts of isolation
ClosedPublic

Authored by gareevroman on Sep 11 2017, 8:04 AM.

Details

Summary

The remaining parts produced by full/partial tile isolation can contain hot spots that are worth optimizing. Currently, we rely on the simple loop unrolling pass, LICM, and the SLP vectorizer to optimize such parts. However, this approach can suffer from a lack of the aliasing information that Polly provides through additional alias metadata, and/or a lack of the information required by the simple loop unrolling pass.

This patch is the first step toward optimizing the remaining parts: we unroll and separate them. On Intel Kaby Lake, for instance, this increases the performance of the generated code from 39.87 GFlop/s to 49.23 GFlop/s.

A possible next step is to avoid the unrolling performed by Polly for both the isolated and the remaining parts, and to rely only on the simple loop unrolling pass and the loop vectorizer.

Diff Detail

Repository
rL LLVM

Event Timeline

gareevroman created this revision. Sep 11 2017, 8:04 AM
grosser accepted this revision. Sep 11 2017, 8:47 AM

LGTM.

This revision is now accepted and ready to land. Sep 11 2017, 8:47 AM
This revision was automatically updated to reflect the committed changes.