This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Enable the localizer for optimized builds
Closed, Public

Authored by aemerson on Sep 6 2019, 2:27 PM.

Details

Summary

Despite the fact that the localizer's original motivation was to fix horrendous constant spilling at -O0, shortening live ranges still has net benefits even with optimizations enabled.

On an -Os build of CTMark, doing this improves code size by 0.5% geomean.

There are a few regressions, with bullet increasing in size by 0.5%. One example from bullet where code size increased slightly was due to GlobalISel now generating the same code as SelectionDAG. So we have an opportunity in future to implement better heuristics for localization and therefore be *better* than SDAG in some cases. Relative to other optimizations, though, that one is minor.

Diff Detail

Repository
rL LLVM

Event Timeline

aemerson created this revision. Sep 6 2019, 2:27 PM
paquette accepted this revision. Sep 6 2019, 2:48 PM

LGTM

This revision is now accepted and ready to land. Sep 6 2019, 2:48 PM
qcolombet accepted this revision. Sep 6 2019, 2:51 PM
qcolombet added a subscriber: qcolombet.

Hi Amara,

Where are the benefits coming from for optimized builds?
I am guessing fewer copies/spills, and in that case I think we should try to fix the allocator, but that's more of a long-term plan.
File a PR though with an example so that we don't forget.

LGTM.

Cheers,
-Quentin

> Hi Amara,
>
> Where are the benefits coming from for optimized builds?
> I am guessing fewer copies/spills, and in that case I think we should try to fix the allocator, but that's more of a long-term plan.
> File a PR though with an example so that we don't forget.
>
> LGTM.
>
> Cheers,
> -Quentin

The swifterror test change in this patch is one example. Live ranges over function calls can be problematic if under register pressure, which can cause additional moves, if not spills. There are others of course. The greedy allocator just doesn't seem to expect long live ranges across basic-block boundaries.
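
The call-crossing effect described above can be illustrated with a toy model. The sketch below is not the LLVM Localizer pass: the `Inst` class, `live_across_calls`, and `localize` are hypothetical names invented for illustration, values are assumed to be defined exactly once, and linear instruction order stands in for real control flow. It only shows that re-emitting a constant next to its first use in each block removes the live range that would otherwise span intervening calls.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Inst:
    """Toy instruction (hypothetical, not an LLVM data structure)."""
    op: str                       # "const", "call", "add", ...
    dest: Optional[str] = None    # value defined by this instruction
    uses: List[str] = field(default_factory=list)
    imm: Optional[int] = None     # immediate, only meaningful for "const"


def live_across_calls(blocks):
    """Count (value, call) pairs where the value is defined before the call
    and last used after it, taking blocks in linear order."""
    insts = [i for b in blocks for i in b]
    def_pos = {i.dest: n for n, i in enumerate(insts) if i.dest}
    last_use = {}
    for n, i in enumerate(insts):
        for u in i.uses:
            last_use[u] = n
    total = 0
    for n, i in enumerate(insts):
        if i.op == "call":
            total += sum(1 for v, d in def_pos.items()
                         if d < n < last_use.get(v, -1))
    return total


def localize(blocks):
    """Drop each original constant and re-create it immediately before its
    first use in every block that uses it."""
    imms = {i.dest: i.imm for b in blocks for i in b if i.op == "const"}
    out = []
    for bi, block in enumerate(blocks):
        new_block, local = [], {}
        for inst in block:
            if inst.op == "const":
                continue  # re-created on demand at the first use below
            uses = []
            for u in inst.uses:
                if u in imms:
                    if u not in local:
                        local[u] = f"{u}.b{bi}"  # per-block copy of the constant
                        new_block.append(Inst("const", local[u], imm=imms[u]))
                    uses.append(local[u])
                else:
                    uses.append(u)
            new_block.append(Inst(inst.op, inst.dest, uses))
        out.append(new_block)
    return out


# Two constants defined in the entry block, used only after two calls.
blocks = [
    [Inst("const", "c1", imm=42), Inst("const", "c2", imm=7), Inst("call", "r1")],
    [Inst("call", "r2", ["r1"])],
    [Inst("add", "s", ["r2", "c1"]), Inst("add", "t", ["s", "c2"])],
]

print(live_across_calls(blocks))            # 4: c1 and c2 each span both calls
print(live_across_calls(localize(blocks)))  # 0: constants now sit next to their uses
```

In this model the cost of localization is a duplicated constant per using block, which is one reason a real heuristic has to weigh code size against register pressure.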

This revision was automatically updated to reflect the committed changes.

> The swifterror test change in this patch is one example. Live ranges over function calls can be problematic if under register pressure, which can cause additional moves, if not spills. There are others of course

Is it happening in this specific test (more moves or spills)?
I'd like to have something to reproduce the issue.