This is an archive of the discontinued LLVM Phabricator instance.

[LSR] Combine unfolded offset into invariant register
ClosedPublic

Authored by gilr on Sep 10 2018, 8:57 AM.

Details

Summary

LSR reassociates constants as unfolded offsets when the constants fit as immediate add operands, which currently prevents such constants from being combined later with loop invariant registers.

This patch modifies GenerateCombinations() to generate a second formula which includes the unfolded offset in the combined loop-invariant register.

Diff Detail

Repository
rL LLVM

Event Timeline

gilr created this revision.Sep 10 2018, 8:57 AM
qcolombet accepted this revision.Oct 8 2018, 3:26 PM

LGTM.

One question, what is the impact on compile time?
Put differently are the heuristics to prune the search space able to still reduce the space to something manageable?

This revision is now accepted and ready to land.Oct 8 2018, 3:26 PM
gilr added a comment.Oct 23 2018, 7:45 AM

Thanks Quentin!
Regarding effect on compile time please see attached CTMark results on x86.

Thanks for the detailed numbers!

This revision was automatically updated to reflect the committed changes.
gilr reopened this revision.Oct 28 2018, 5:55 AM
This revision is now accepted and ready to land.Oct 28 2018, 5:55 AM
gilr updated this revision to Diff 171424.Oct 28 2018, 6:03 AM

Bug fix: ScalarEvolution::getAddExpr() may modify the vector of SCEVs it is given as argument, causing 2nd formula to be created with only part of the SCEVs in Ops.
This patch fixes this by using a temporary copy of Ops for the 1st formula.

gilr added a reviewer: delena.Oct 28 2018, 6:04 AM
gilr requested review of this revision.Oct 28 2018, 1:01 PM

Bug fix: ScalarEvolution::getAddExpr() may modify the vector of SCEVs it is given as argument, causing 2nd formula to be created with only part of the SCEVs in Ops.
This patch fixes this by using a temporary copy of Ops for the 1st formula.

Is this covered by an existing test case?
If not, could you add one?

gilr added a comment.Nov 1 2018, 9:03 AM

Is this covered by an existing test case?
If not, could you add one?

The original patch failed several runtime tests of the LLVM test-suite: n-body, oggenc, HPCCG and PENNANT, where LSR produced an incorrect yet legal formula based on the modified vector of SCEVs, leading to semantically wrong generated code.
Fully covering this bug by a LIT test requires a test case where (a) LSR generates both combinations, (b) SCEV erases one of the registers from the Ops vector and (c) the second combination is included in the solution. The LIT tests included in the original patch do (a) + (c) but I couldn't get them to also do (b). The failing runtime tests do (a) and (b) but not (c).

The failing runtime tests do (a) and (b) but not (c).

I am confused. If they don't do (c) how can they fail?

The original patch failed several runtime tests of the LLVM test-suite: [...]

Can't we extract a test from there?

gilr added a comment.Nov 2 2018, 6:56 AM

I am confused. If they don't do (c) how can they fail?

The bad formula is cheap (due to the missing register) and therefore enters the solution and produces legal yet incorrect code which fails at runtime. With the fix, the correct formula doesn't make it to the solution and doesn't affect the generated code.

Can't we extract a test from there?

Since testing for optimized or illegal code in these cases isn't an option in these cases, such a LIT could check that the incorrect code is not generated, that the incorrect formula is not generated, or that the correct formula is generated - which all seem fragile and overfitted to this specific bug, so I'm reluctant to add any of them (especially since the bug is covered by 4 runtime tests). What do you think?

Since testing for optimized or illegal code in these cases isn't an option in these cases, such a LIT could check that the incorrect code is not generated, that the incorrect formula is not generated, or that the correct formula is generated - which all seem fragile and overfitted to this specific bug, so I'm reluctant to add any of them (especially since the bug is covered by 4 runtime tests). What do you think?

As long as what we check if that produces the correct sequence for the problematic case, that doesn't seem more fragile than any of the existing test cases.
I would rather have this small test than looking in runtime issues in general.

gilr updated this revision to Diff 172576.Nov 5 2018, 6:07 AM

Added a test case to cover the bug introduced by the first patch and fixed in the second.

qcolombet added inline comments.Nov 5 2018, 8:55 AM
test/Transforms/LoopStrengthReduce/two-combinations-bug.ll
2 ↗(On Diff #172576)

Could you reduce the test even more?
E.g., I usually had some success by adding an assert for the case that interests me and bugpoint on this specific assert (via custom script)

Could we check that the transformation looks good instead of checking the debug output?
Put differently can we write a test that works for release as well?

gilr updated this revision to Diff 172961.Nov 7 2018, 8:27 AM

Test case for the two-combinations bug greatly reduced by bugpoint (thanks, Quentin!) and checks the generated IR instead of LSR's debug prints.

qcolombet accepted this revision.Nov 7 2018, 10:42 AM

Hi Gil,

Looks good to me.

Nitpicks inlined, no need for another round of review.

Cheers,
-Quentin

test/Transforms/LoopStrengthReduce/two-combinations-bug.ll
3 ↗(On Diff #172961)

adopted => adapted?

43 ↗(On Diff #172961)

Could you get rid of the implicit variable name? (%[0-9]+)
Those are painful when manually editing tests.

opt -instnamer does that for you, but given you have just one, you may want to do it by hand.

43 ↗(On Diff #172961)

Can we get rid of the metadata?
There is an opt -stripSomethingSomething if you don't want to do it manually.

This revision is now accepted and ready to land.Nov 7 2018, 10:42 AM
This revision was automatically updated to reflect the committed changes.