This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Change InstCombineAddSub to not perform constant folding when there is an intermediate use of the source register.
AbandonedPublic

Authored by dancgr on Nov 19 2019, 2:00 PM.

Details

Reviewers
majnemer
spatel
Summary

The add/sub with constant folding does not provide substantial gains when there is an intermediate use of the initial result, resulting in the same number of instructions.

Also, this behaviour can interfere with other optimizations which sometimes can result in an extra instruction.

%a = load i32 ...
%b = load i32 ...

%1 = sub %a, 10
%2 = store %1
%3 = sub 0, %1
%4 = add %b, %3

-> The statement (%3 = sub 0, %1) will get folded as follows:

%1 = sub %a, 10
%2 = store %1
%3 = sub 0, %1 => %3 = sub 10, %a
%4 = add %b, %3

-> Preventing the combination of (%3 = sub 0, %1) and (%4 = add %b, %3) into (%3 = sub %b, %1)

Event Timeline

dancgr created this revision.Nov 19 2019, 2:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 19 2019, 2:00 PM
dancgr changed the visibility from "Public (No Login Required)" to "No One".Nov 19 2019, 2:00 PM
dancgr edited the summary of this revision. (Show Details)
dancgr changed the visibility from "No One" to "Custom Policy".
dancgr edited reviewers, added: majnemer, spatel; removed: amehsan.Nov 19 2019, 2:05 PM
dancgr changed the visibility from "Custom Policy" to "Public (No Login Required)".

-1 to adding more artificial one-use restrictions.
I agree there is a general problem of unability to see if
there exists a particular instruction already.

dancgr added a comment.EditedNov 20 2019, 12:04 PM

-1 to adding more artificial one-use restrictions.
I agree there is a general problem of unability to see if
there exists a particular instruction already.

I'm wondering why we should avoid one-use restrictions. Is there any discussion that I'm not aware of?

In this case, we only want to apply the folding if there is only one use of first expression. If there is any more than that, the folding won't give any improvement or cause issues with other optimizations. How would you suggest that I avoid this issue without checking for one-use?

-1 to adding more artificial one-use restrictions.
I agree there is a general problem of unability to see if
there exists a particular instruction already.

I'm wondering why we should avoid one-use restrictions. Is there any discussion that I'm not aware of?

I'm not aware of any concise write-up, but there are at least two things i can note:
Given

%t0 = add i32 %arg, 8
call void @use(i32 %t0)
%t1 = sub i32 2, %t0

That we currently transform into

%t0 = add i32 %arg, 8
call void @use(i32 %t0)
%t1 = sub i32 -6, %arg
  1. %t1 no longer depends on %t0, which means %t1 does not have to wait until %t0 finishes executing - reduced dependency chain, improved computation parallelizm
  2. We reduced use-count of %t0. There may be some other transform involving %t0 that was previously blocked because %t0 had other uses, which now may happen.

In this case, we only want to apply the folding if there is only one use of first expression.

More specifically, by introducing such limitation we no longer fold this pattern in tests,
which leaves extra instruction[s] which allows the motivational fold in the patch's description to happen.
(please add that test)

If there is any more than that, the folding won't give any improvement or cause issues with other optimizations.

How do we know that?

How would you suggest that I avoid this issue without checking for one-use?

I don't have a simple solution for the issue as described in the patch's description,
although i feel like noting that i believe i have seen similar cases myself.

This is generally impossible to solve with the existing instcombine design - it would need
to be something more similar to what VPlan is supposed to end up being - a list of
possible transforms that we can apply, from which we'd pick so that we end up with best results.

dancgr added a subscriber: amehsan.Nov 26 2019, 7:28 AM
dancgr updated this revision to Diff 231119.Nov 26 2019, 12:00 PM
  • [Transform][InstCombine] Add new test case for the specified scenario.
dancgr abandoned this revision.Dec 12 2019, 2:27 PM

How would you suggest that I avoid this issue without checking for one-use?

I don't have a simple solution for the issue as described in the patch's description,
although i feel like noting that i believe i have seen similar cases myself.

Err, i mean, other than ensuring that the fold you are looking for happens before this constant-folding.
But a bit of a warning: such fold reordering pretty much always exposes other missed folds that will need to be added first.