This is an archive of the discontinued LLVM Phabricator instance.

Add BFI in constanthoisting pass and do the hoisting selectively
ClosedPublic

Authored by wmi on Jan 20 2017, 12:02 PM.

Details

Summary

Existing constant hoisting pass will merge a group of contants in a small range and hoist the const materialization code to the common dominator of their uses.

However, if the uses are all in cold pathes, we may hoist the materialization code from cold pathes to a hot place. This may hurt performance. The patch simply introduces BFI to the pass and does the hoisting only when the common dominator is colder than the sum of execution freq of all the uses to be merged.

Now usage of BFI is guarded by an option consthoist-with-block-frequency which is off by default. I did the compile time evaluation using spec2006 when the option was on and the compile time increase was in noise range: about 0.5% on averange (ranged from -2% ~ 2% for the 21 C/C++ benchmarks).

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision.Jan 20 2017, 12:02 PM
davidxl edited edge metadata.Jan 21 2017, 6:24 PM

Using profile data to throttle transformations in canonicalization passs is generally frowned upon. This pass does not look like a canonicalization pass (other reviewers may have different opinions). Having said that, I also wonder if a more general profile driven code sinking pass (before CGP), similar to look sink is worth the effort.

lib/Transforms/Scalar/ConstantHoisting.cpp
259–271

This is a redundant check.

273

Since we have profile information, it might be worth to find the optimal solution to the insertion problem -- for a given set of use BBs, find a second set of BBs such that

  1. a BB in the original set is either included in the second set or dominated by a BB
  2. the total frequency of second set is minimized.

The constant materialization instructions are inserted in the BBs in the solution set.

wmi added inline comments.Jan 22 2017, 11:22 AM
lib/Transforms/Scalar/ConstantHoisting.cpp
259–271

Indeed.

273

Sure. I will implement that if it is ok to use BFI in consthoisting.

wmi updated this revision to Diff 86085.Jan 27 2017, 11:33 AM

Update the patch according to suggestion from David: using BFI to find the optimal solution to the insertion problem.

davidxl added inline comments.Jan 27 2017, 1:54 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
164

assert !BBs.count(Entry) is true

170

It is simpler to document like this:

WorkSet includes any block 'BB' that is not strictly dominated by any other blocks in set 'BBs', and all nodes in the path in the dominator tree from Entry to 'BB'.

Also name WorkSet as Candidates?

178

In the first iteration, this is always false, so perhaps change the loop into a do-while:

bool isCandidate = false;
do {

Path.insert(Node);
 if (Node == Entry) {
      isCandidate = true;
      break;
 }
 Node = getIDom(Node);

} while (!BBs.count(Node));

201

early return when MadeChange == false?

231

NodesInBBs can not be true for Entry node.

234

Why return empty BBs?

wmi added inline comments.Jan 27 2017, 4:53 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
164

Ok.

170

Ok, will fix it.

178

That looks simpler. Will fix it.

201

MadeChange == false here just means we didn't find a BB from BBs dominated by another one in BBs, but we may still do hoisting work in the following, like hoist a const materialization out of loop.

234

BBs return the locations where to insert hoisted const materialization. If we decide not to hoist anything, i.e., MadeChange is false, we will return empty BBs saying nothing will be inserted.

davidxl added inline comments.Jan 29 2017, 9:32 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
230

The exit handling code can probably be simplified a little:

if (Node == Entry) {

   BBs.clear();
   if (InsertPtsFreq >= BFI.getBlockFreq(Node)) {
        BBs.insert(Entry);
         MadeChange = true;
    } else if (MadeChange) {
        BBs.insert(InsertPts.begin(), InsertPts.end());
    }
   break;
}
wmi updated this revision to Diff 86306.Jan 30 2017, 9:54 AM

Addressed David's comments.

davidxl added inline comments.Feb 1 2017, 4:52 PM
include/llvm/Transforms/Scalar/ConstantHoisting.h
130

Add a documentation about what it returns and what it means when empty set is returned.

lib/Transforms/Scalar/ConstantHoisting.cpp
696

It does not seem correct to skip base const materialization. Also the second condition should always be true?

wmi added inline comments.Feb 1 2017, 5:12 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
696

You are right. I realize it is not right to skip base const materialization when we need to rebase some use. Will fix it.

The second condition is not necessarily true. We have multiple uses and multiple insertion points chosen. Every use will be dominated by exactly one insertion point but we don't know which one.

wmi updated this revision to Diff 86888.Feb 2 2017, 3:13 PM

Remove MadeChange. We will still need to insert const materialization code even if there is no hoisting or merging for BBs.

davidxl added inline comments.Feb 3 2017, 11:58 AM
lib/Transforms/Scalar/ConstantHoisting.cpp
673

I am not sure why this is needed. If nothing is changed, it should be fine to skip the transformation for the constant (as well as rebase in uses) completely. Without skipping, it basically force rebasing?

678

There are three cases:

  1. new insertion point introduced
  2. the BB originally has a base constant use
  3. the BB originally has a constant use but need rebase.

For case 1) and 3), new insertion is needed, but for 2) it can be skipped.

684

The right way to fix is to move the rebasing code out of the IPSet traversal loop after all new base constant materialization insertion are done. Only those that got dropped in findConstantInsertionPoint need to be handled.

davidxl added inline comments.Feb 3 2017, 12:17 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
678

Correction: 2) can only be skipped when there are no other uses dominated by it -- so it is probably better to not to skip and rewrite the original use

684

Discard the second part of the last comment.

wmi added inline comments.Feb 3 2017, 5:16 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
678

I thought it was not going to change the generated code if we do a rebase for a base constant use. That is why for the code simplicity, I choose to insert materialization and rebase code anyway.

It will change

BB:
%inc = add i64 %5, 214748364701

to

BB:
%const = bitcase i64 214748364701 to i64
...
%inc = add i64 %5, %const

Because 214748364701 is a very large number (The precondition it is treated as constant hoisting candidate) and it cannot be encoded in add instruction, we need the materialization code %const = ... anyway in the end.

davidxl added inline comments.Feb 5 2017, 2:48 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
678

Ok makes sense.

694

Should !BFI check removed? It just improves compile time a little for the non-default path. To save compile time for more generally, do 'if (IPSet->size() == 1 || DT->dominates(...)) instead?

Also is it better to keep track of the base constants rewritten and assert at the end that all uses are rewritten.

wmi added inline comments.Feb 5 2017, 11:01 PM
lib/Transforms/Scalar/ConstantHoisting.cpp
694

Good suggestions. Fixed as suggested.

wmi updated this revision to Diff 87189.Feb 5 2017, 11:16 PM

Make change to save compile time and add assertion per David's suggestions.

davidxl accepted this revision.Apr 19 2017, 2:49 PM

lgtm

This revision is now accepted and ready to land.Apr 19 2017, 2:49 PM
This revision was automatically updated to reflect the committed changes.