This is an archive of the discontinued LLVM Phabricator instance.

[ORCv2] - New Speculate Query Implementation
ClosedPublic

Authored by pree-jackie on Aug 18 2019, 2:16 PM.

Details

Summary

This patch introduces, SequenceBBQuery - new heuristic to find likely next callable functions it tries to find the blocks with calls in order of execution sequence of Blocks.

It still uses BlockFrequencyAnalysis to find high frequency blocks. For a handful of hottest blocks (plan to customize), the algorithm traverse and discovered the caller blocks along the way to Entry Basic Block and Exit Basic Block. It uses Block Hint, to stop traversing the already visited blocks in both direction. It implicitly assumes that once the block is visited during discovering entry or exit nodes, revisiting them again does not add much. It also branch probability info (cached result) to traverse only hot edges (planned to customize) from hot blocks. Without BPI, the algorithm mostly return's all the blocks in the CFG with calls.

It also changes the heuristic queries, so they don't maintain states. Hence it is safe to call from multiple threads.

It also implements, new instrumentation to avoid jumping into JIT on every call to the function with the help _orc_speculate.decision.block and _orc_speculate.block.

"Speculator Registration Mechanism is also changed" - kudos to @lhames

Open to review, mostly looking to change implementation of SequeceBBQuery heuristics with good data structure choices.

Diff Detail

Event Timeline

pree-jackie created this revision.Aug 18 2019, 2:16 PM
mgrang added inline comments.Aug 18 2019, 11:17 PM
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
266

You can use the range-based version of llvm::sort here:

llvm::sort(BBFreqs);
pree-jackie marked an inline comment as done.Aug 19 2019, 1:12 PM
pree-jackie added inline comments.
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
266

Thanks for the catch.

lhames added inline comments.Aug 19 2019, 2:29 PM
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
172–177

I think this can be simplified to:

BlockHint.CallerBlock = llvm::find(CallerBlocks, AtBB) != CallerBlocks.end();
VisitedBlocks.insert(std::make_pair(AtBB, BlockHint));
248–249

Can you describe more about how SequencedBBQuery differs from BlockFreqQuery?

In particular: (1) What is the output of this algorithm, queryCFG, and (2) what benefit do we get from scanning to the exit blocks?

dblaikie added inline comments.Aug 19 2019, 2:32 PM
llvm/include/llvm/ExecutionEngine/Orc/SpeculateAnalyses.h
40

You can remove this 'private' since it's implicit

53

no need for 'private' here, when a class members are private by default

55

no need for 'public' in a struct where it's public by default

59

Perhaps use non-static data member initializers instead of a ctor here? (add initializers "= true", etc to the member declarations directly, and remove this ctor)

llvm/include/llvm/ExecutionEngine/Orc/Speculation.h
85–86
114

Does this dtor need to be explicitly declared? Or could this line be deleted?

llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
72–73

Does this capture any case that isn't captured by the all_of below? Perhaps this special case could be removed?

74

You can drop the "-> bool" part - it's implicit & probably clear enough from the implementation that the return type is bool.

75

"x ? true : false" seems redundant. Perhaps "return BB.getSingleSuccessor() != nullptr;" would be OK/better?

135–136

Remove else-after-return

146

Can probably use llvm::find(BBList, &Block) rather than having to pass begin/end here

149

*FItr == &Block here, right? Perhaps it'd be more clear to use &Block here, then - so it's clear there's nothing being accessed in BBList - it's only used for a lookup test?

(& maybe there's a neater way to write that - ah, llvm::is_contained?)

210–211

Remove else-after-return

222–224

Perhaps these two calls to VisitedBlocks could be coalesced?

if (llvm::is_contained(CallerBlocks, AtBB))
  BlockHint.CallerBlock = true;
VisitedBlocks.insert(std::make_pair(AtBB, BlockHint));
235

Should this use a range-based for loop?

239

Could this use a range-based for loop too?

269

Skip the "? true : false" here, just "return BBF.second > BBS.second;" - that's already a boolean expression.

281–282

Possibly using range-based for loop here (maybe using an arrayref sliced down to NHotBlocks)

pree-jackie marked an inline comment as done.Aug 19 2019, 3:39 PM
pree-jackie added inline comments.
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
248–249

The blockfreqquery, only selects the blocks with highest execution frequencies. Based on the observation, often the blocks within the loop nests have high frequency. The idea here is to, traverse back towards the entry node and exit node to collect all the predecessor and successor blocks with call instructions respectively.
Here the standard hot branch probability (4/5) is used to guide traversal to entry and exit nodes but we
can customize the branch probability to define own hotness instead of 80%, to have better coverage.

Algorithm produces a set of basic blocks N, where N >= NHotBlocks,

that is when N = NHotBlocks, there are no predecessor & successor blocks with call inst.

when N > NHotBlocks, x = (N - NHotBlocks)
there are x number of predecessor & successor blocks with call inst which we miss in block freq query.

pree-jackie marked 2 inline comments as done.Aug 19 2019, 3:51 PM
pree-jackie added inline comments.
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
248–249

This heuristic outputs a sequence of basic blocks,
for example : entry -> bb1.call -> bb2.call.
The last operation of the heuristic, rearranges the collected set of basic blocks by taking the distance from the entry node into account. In the above example, bb2.call may have high frequency than bb1.call, but based on distance from entry node they are swapped.

248–249

Plus, this sequence will be very helpful in future that when we are introducing custom speculate.points inside the function instead of just having them in top of CFG.
In essence, we can decide the appropriate position of speculate.points by knowing the sequence of blocks with call,

Plus, this is experimental try.

pree-jackie marked 17 inline comments as done.Aug 23 2019, 9:09 AM
pree-jackie added inline comments.
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
239

No, I guess.

pree-jackie marked 6 inline comments as done.Aug 23 2019, 9:32 AM

I resolved N-1 comments. Plus, this change set implements the launching bulk speculative queries and locking the module before instrumenting.
I kindly request you to review the else-after-return resolved part of code, to check whether I understand it correctly.

Thanks.

lhames accepted this revision.Aug 24 2019, 10:03 AM

LGTM, pending else-after-return fixes.

llvm/include/llvm/ExecutionEngine/Orc/Speculation.h
86–88

This is not quite right.

"else after return" refers to the following idiom:

if (C) {
  // do stuff
  return;
} else {
  // do other stuff.
}

LLVM's style guide recommends that this be rewritten as:

if (C) {
  // do stuff
  return;
}
// do other stuff.

In this case the right fix was to replace:

if (It == GlobalSpecMap.end() || Iv.second == false)
  return;
else
  CandidateSet = It->getSecond();

with

if (It == GlobalSpecMap.end() || Iv.second == false)
  return;

CandidateSet = It->getSecond();
llvm/lib/ExecutionEngine/Orc/SpeculateAnalyses.cpp
134–138

To remove the else-after-return here, this should be rewritten from:

if (TotalBlocks == 1)
  return TotalBlocks;
else
  return TotalBlocks / 2;

to:

if (TotalBlocks == 1)
  return TotalBlocks;
return TotalBlocks / 2;
205–208

This can be rewritten as:

if (!Itr->second.Downward)
  return;
Itr->second.Downward = false;
This revision is now accepted and ready to land.Aug 24 2019, 10:03 AM

else-after-return fixes. Going to commit

Committing tomorrow as it is difficult to watch build bots continuously now.

This revision was automatically updated to reflect the committed changes.