Page MenuHomePhabricator

[MachineSink] add one more profitable pattern for sinking
Needs ReviewPublic

Authored by shchenz on Tue, Sep 1, 3:34 AM.

Details

Reviewers
MatzeB
echristo
efriedma
qcolombet
Group Reviewers
Restricted Project
Summary

This patch adds one profitable pattern to sink MI when MI->parent() dominates SuccToSinkTo and SuccToSinkTo post dominates MI->parent().

If MI is in loop and MI is:

def0, def1, defn = opcode, use0, use1, use2, use3

sinking MI to SuccToSinkTo will shorten the live range for def but will enlarge the live range for use.
But in loop, the use may be defined outside of loop, so if SuccToSinkTo is still in same loop, sinking MI should have no impact for the live range for the uses which are defined out of loop.

If we know that all the uses are defined outside of loop, it is still profitable to sink.

Diff Detail

Event Timeline

shchenz created this revision.Tue, Sep 1, 3:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptTue, Sep 1, 3:34 AM
shchenz requested review of this revision.Tue, Sep 1, 3:34 AM
shchenz edited the summary of this revision. (Show Details)Tue, Sep 1, 3:43 AM
qcolombet requested changes to this revision.Wed, Sep 2, 12:18 PM
qcolombet added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
603

Could you invert the logic to early exit and reduce the indentation here?

611

Reg == 0, should be fine. We should just skip them.

621

Instead of counting the number of live-ranges, we should probably count their weights with respect to the related register pressure set.

E.g., <smallDef>, <smallDef> = op <BigUse>

Maybe each smallDef accounts for 1 register whereas BigUse accounts for 4 registers.

This revision now requires changes to proceed.Wed, Sep 2, 12:18 PM
shchenz edited the summary of this revision. (Show Details)Wed, Sep 2, 6:04 PM
shchenz updated this revision to Diff 290160.Sun, Sep 6, 6:46 PM

use registerclassweight

shchenz marked 2 inline comments as done.Sun, Sep 6, 6:46 PM
shchenz added inline comments.
llvm/lib/CodeGen/MachineSink.cpp
621

Good idea!

gentle ping

Hi,

The update with the RegClassWeight is only partially done IMHO.
Each weight should be added/removed from the related register pressure sets. You can take a look at MachineLICM.cpp to see how to use these.

I.e., even if the total cost decreases, this doesn't guarantee that we won't go over the limit of a pressure set and thus, we would spill.
E.g.,

loop:
   smallDef =
   ... = smallDef
   bigDef = smallDef
   // lot of small defs.
  condbr loop

Sinking bigDef would be seen profitable since the weight of bigDef is greater than smallDef. However, extending smallDef will make it overlaps with a lot of small defs and may create spilling (remember that smallDef and bigDef could be on different register banks.)

For now I would suggest to either:

  1. Implement the support of pressure sets, or
  2. Only do the transformation when the def of the uses are outside the loop. Like you said, these won't have any impact on the register pressure.

Random though: currently MachineLICM only does hoisting. I think it would make sense to have it also do sinking.
Put differently, conceptually, given we are looking at loop invariant here, it would make sense to have this code in MachineLICM instead of MachineSink.

What do you think?

Cheers,
-Quentin

@qcolombet Thanks for your good comments.

given we are looking at loop invariant here, it would make sense to have this code in MachineLICM instead of MachineSink

MachineLICM tries to move loop invariant instructions to outside of a loop(preheader of exit block). Here we want to move def closer to its use inside the loop.
If we want to support 'real' sink in MachineLICM(move loop invariant instructions to the exit block), it should always benefit for the loop's register pressure?

If we want to move def closer to its use inside the loop to benefit register pressure in MachineLICM pass, except many duplicated codes with machinesink pass, (condition checks in SinkInstruction), register estimate model in machineLICM also does not support this. I checked UpdateBackTraceRegPressure, seems now the model is designed for hoisting only. It just stores the register pressure on path from loop preheader to current BB. see RegPressure and calcRegisterCost(Sinking should have a different cost calculation). For sinking, we need to estimate the register pressure in the destination block which is not on path from loop preheader to current BB.

The real world case I met is not a loop invariant instruction, it has an operand which is a user of PHI in loop header.

So I think maybe we should implement this in machinesink pass?

But yes, the register pressure estimate is not accurate for now and your example shows this very well. If we want to estimate it accurately, maybe we can use struct RegisterPressure in RegisterPressure.h to check register pressure of the destinate BB after the sinking? If the register pressure for register pressure set exceeds the limit after the sinking, we don't sink. What do you think? @qcolombet

Thanks again for your comments.

The real world case I met is not a loop invariant instruction, it has an operand which is a user of PHI in loop header.

I see, yes, then having this in MachineSink makes sense. At least as a start :).

struct RegisterPressure is a good start, though I don't remember how it ties with the register pressure sets (TargetRegisterInfo::getRegClassPressureSets), which is ultimately what we want to use.

Given adding that is going to take some time, you could push your patch for the cases we know are always profitable: when the uses are alive across the whole loop (line 636 in the current implementation.)

Cheers,
-Quentin

shchenz updated this revision to Diff 292103.EditedTue, Sep 15, 9:45 PM

address @qcolombet comments:
1: firstly add the always profitable pattern

Given adding that is going to take some time, you could push your patch for the cases we know are always profitable: when the uses are alive across the whole loop (line 636 in the current implementation.)

Thanks for this suggestion, updated this patch accordingly.
As expected, the improvement for the benchmark is gone now. : (
Will do the left part in following patch.

shchenz edited the summary of this revision. (Show Details)Tue, Sep 15, 9:52 PM

Hi,

As expected, the improvement for the benchmark is gone now. : (

Too bad. Hopefully it would take long to come up with the follow up path.

The source change looks good to me (there's one lint report that needs fixing though: one of the comment is larger than 80-col), but we miss a test case.

Could you add a test case that capture specifically what this patch is doing?
In other words, a test case that runs the machine sink pass alone and that fails without that patch and passes with that patch.

You can produce the skeleton of the test by running -stop-before machine-sink -simplify-mir, then you can use that output as input of your test with -run-pass=machine-sink.

Cheers,
-Quentin

shchenz updated this revision to Diff 292401.Wed, Sep 16, 7:26 PM

1: add one more mir test case
2: fix code format

Thanks very much, updated, could you please help to have another look? @qcolombet

Hi @shchenz ,

1: add one more mir test case

Did you forget to do git add, I don't see it.

Cheers,
-Quentin

llvm/lib/CodeGen/MachineSink.cpp
622

This variable is a left over of the previous version of the patch.
Please delete it.

shchenz updated this revision to Diff 292672.Thu, Sep 17, 5:17 PM

delete unused variables

shchenz marked an inline comment as done.Thu, Sep 17, 5:24 PM

1: add one more mir test case

Did you forget to do git add, I don't see it.

Weird, I added case test/CodeGen/PowerPC/sink-down-more-instructions-1.mir. Could you do another check?
Maybe we should delete the IR test test/CodeGen/PowerPC/sink-down-more-instructions.ll? it is same input with mir test.