This is an archive of the discontinued LLVM Phabricator instance.

Increase tail dup threshold for -O3 from 3 to 4
Closed Public

Authored by rsmith on Aug 15 2017, 4:47 PM.

Details

Summary

We see a modest performance improvement from this slightly higher tail dup threshold (~0.1% geomean across the suite).
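For readers unfamiliar with the knob, here is a toy sketch of what a tail duplication threshold controls. It is not LLVM's implementation: the `Block` type, `should_tail_duplicate`, and `code_size_after` are hypothetical names invented for illustration, and the real pass weighs more than instruction count. The only point is that raising the cutoff from 3 to 4 lets one-instruction-longer tails be copied into their predecessors, trading code size for fewer branches.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical basic block: a name, its instructions, and its predecessors."""
    name: str
    instrs: list
    preds: list = field(default_factory=list)


def should_tail_duplicate(block, threshold):
    """Toy rule (an assumption, not LLVM's logic): duplicate a block only if it
    has more than one predecessor and at most `threshold` instructions."""
    return len(block.preds) > 1 and len(block.instrs) <= threshold


def code_size_after(block, threshold):
    """Instructions contributed by `block` once the toy decision is applied:
    one copy per predecessor if duplicated, otherwise a single copy."""
    if should_tail_duplicate(block, threshold):
        return len(block.instrs) * len(block.preds)
    return len(block.instrs)


# A four-instruction tail shared by two predecessors.
tail = Block("tail", ["load", "add", "store", "ret"], preds=["a", "b"])

for threshold in (3, 4):
    print(threshold, should_tail_duplicate(tail, threshold),
          code_size_after(tail, threshold))
```

In this toy model the four-instruction tail is left alone at a threshold of 3 and duplicated at a threshold of 4, which is why a higher threshold raises the icache-pressure question discussed in the review.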

Diff Detail

Repository
rL LLVM

Event Timeline

rsmith created this revision. Aug 15 2017, 4:47 PM

Will test on some very large programs that are sensitive to icache pressure and get back.

davidxl edited edge metadata. Aug 16 2017, 1:21 PM

Performance tests with large benchmarks showed no regression.

davidxl accepted this revision. Aug 16 2017, 1:21 PM

lgtm

This revision is now accepted and ready to land. Aug 16 2017, 1:21 PM
This revision was automatically updated to reflect the committed changes.
MatzeB added a subscriber: MatzeB. Aug 21 2017, 6:32 PM

For the record: This broke the greendragon "Project Clang Stage 1: cmake, RA, with expensive checks enabled" build (possibly hidden by other errors).

The output looks like something breaks instruction bundles into individual unbundled instructions while tail duplicating. This affects predicated ARMv7 code after if-conversion.

Given that it seems to be a pre-existing bug and only affects that one build, I'd suggest keeping this commit in for now; I'll hopefully be able to fix the bug soon.

Pushed r311511 to stop taildup from unbundling instructions by accident. Post-commit reviews appreciated!