This is an archive of the discontinued LLVM Phabricator instance.

LLVM support for BB-cluster sections
ClosedPublic

Authored by rahmanl on Mar 27 2020, 2:06 PM.

Details

Summary

This is an extension of basic block sections which allows emitting several hot basic blocks in the same section, in a given order.

Currently the -basicblock-sections=list option allows specifying unique sections for some basic blocks. For example:
!foo
!!1
!!2
!!4
instructs the compiler to emit the entry basic block and each of basic blocks #1, #2, and #4 into a separate unique section, while all the non-specified basic blocks are coalesced into a "cold" section.

With this patch, we can use the same option to specify clusters of basic blocks to be emitted into the same section. For instance:
!foo
!!0 1 2
!!4
means emitting one section containing the entry block, BB#1, and BB#2 in this order, while emitting BB#4 in a unique section of its own. Still all the excluded basic blocks go into the cold section.

One difference is that with the new approach, we don't always create a unique separate section for the entry block. It needs to be explicitly specified in the lists, or otherwise it would be coalesced into the cold section.

Another difference is with respect to the special exception section. We only create an exception section if the BB-cluster specification scatters EH pads into more than one section.

The final difference is with regards to the size directives. We now emit a size directive only for the BBs which start a section. That size is the size of the section marked by that BB. If we want sizes for every internal basic block as well, this requires being able to combine -basicblock-sections=labels with -basicblock-sections=list which is not currently possible.

Finally, with BB-clusters, we emit persistent non-temporary labels only for basic blocks which begin sections. Other basic blocks will use temp labels in the form of .LBBN_M as before.

The benefit of the BB-cluster approach is that it reduces the number of basic block sections created, which in turn reduces the CFI and DebugInfo overhead, and also reduces the burden of the linker.

Another benefit is being able to reuse the assembler's JCC mitigation strategy as discussed in http://lists.llvm.org/pipermail/llvm-dev/2020-March/140134.html.

We note the BB-cluster approach is only beneficial if the optimal block order can be computed prior to compilation, using profiles and the binary generated using -basicblock-secitons=labels We have shown that this is possible and the performance improvements match our previous results (please refer to the link above).

Diff Detail

Event Timeline

rahmanl created this revision.Mar 27 2020, 2:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2020, 2:06 PM
rahmanl updated this revision to Diff 253221.Mar 27 2020, 2:17 PM

Added a test for bb-cluster sections: llvm/test/CodeGen/X86/basicblock-sections-clusters.ll

Took one pass over the code.

  • This needs more tests like with exception blocks kept in separate clusters (check if they are grouped)
llvm/include/llvm/CodeGen/MachineBasicBlock.h
82

Move this to MachineFunction.h as a member?

83

Also this?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1199

Combine the two conditions.

3070

Good check, reduces unnecessary section directives if .S is emitted. Does the entry block begin a section? If so, it should be skipped.

llvm/lib/CodeGen/BBSectionsPrepare.cpp
12

Could make longer lines and keep it better formatted.

25

Longer lines?

140

Could be made a static function passing the Maps as const reference?

168

I see why you removed the sameSection check. However, when you call this function, you already know the clusterID and the position in the cluster. So, don't insert the fallthrough if they are the same cluster and are adjacent?

213

"Unless we want individual unique sections for every basic block...."

222

If clusterID is made a pair of integers which denote the ids of the start and end basic blocks, the begin and end function in MachineBasicBlock can be simplified.

llvm/lib/CodeGen/MachineBasicBlock.cpp
577

This comment applies to isEndSection too. Consider making beginSection and endSection bool members of MachineBasicBlock and set it once in BBSectionsPrepare. Isn't this function called multiple times per MBB, like for DebugInfo and CFI?

One way is to make ClusterID a pair of basic block ids that give you the start and end of the cluster. The beginSection is just "return id == clusterID.first" and end Section is "return id == clusterID.second"

rahmanl updated this revision to Diff 253992.Mar 31 2020, 2:04 PM

Major changes with this update:

1- Using a StringMap of vectors instead of nested map in getBBClusterInfo (BBSectionsPrepare.cpp).
2- Structured error handling in getBBClusterInfo.
3- New basicblock-section-clusters tests exercising various combinations of clustering (involving exception handling blocks and cold code), add ensuring that branches are optimized/inserted at the right places.
rahmanl marked 8 inline comments as done.Mar 31 2020, 2:23 PM

Thanks a lot for the comments @tmsriram . I managed to get most of them addressed (Most importantly the one about nested maps).

llvm/include/llvm/CodeGen/MachineBasicBlock.h
82

I thought about placing this in https://llvm.org/doxygen/MachineFunction_8h_source.html#l00110 but this is a property of MachineBasicBlock. In fact, MachineFunction does not need to care about section IDs at all.
Is it the "static" keyword which is concerning?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3070

Yes it does, but it is excluded from this block by pred_empty() check at line 3044.

llvm/lib/CodeGen/BBSectionsPrepare.cpp
168

I agree. It's possible. But the requirement for exception handling blocks might make it a bit complicated. Since this is before basic block sorting, we would have to resort to BBClusterInfo.PositionInCluster which may not represent the ultimate placement of blocks.

222

isBeginSection() and isEndSection() only look at previous and subsequent basic blocks to see if they have a different section ID. This way, we can use SectionIDs to decided if a basic block begins or ends a section without having to store to boolean.

rahmanl updated this revision to Diff 254381.Apr 1 2020, 5:59 PM

The major change int this update is the ability of "not inserting unconditional branches" when fallthrough block hasn't moved.

To do this, I have moved the logic of insertUnconditionalFallthroughBranch function into the updateBranches function.

rahmanl marked an inline comment as done and an inline comment as not done.Apr 1 2020, 6:08 PM

Finished addressing one more @tmsriram 's suggestion involving redundant insertion of explicit fallthrough branches.

tmsriram added inline comments.Apr 1 2020, 10:47 PM
llvm/lib/CodeGen/BBSectionsPrepare.cpp
180

s/accout/account

287

Does this need a set? could you make EHPadSections a integer = -1. Set it to the clusterid when you see the first exception MBB. After that, if you see another MBB with a different clusterid, set it to MachineBasicBlock::ExcpetionSectionID and check for this before resetting?

350

Wouldn't this a bit more clear:

return (XSectionID == ExceptionSectionID || XSectionID == ColdSectionID) ? X.getNumber() < Y.getNumber : ...PositionInCluster < ...PositionInCluster

The implicit ColdSectionID > ExceptionSectionID should be exposed unless they are part of enum I think.

364

I get why you are doing it here now. You want to generate the optimal branches when possible, something you couldn't do before.

388

Why not make it one expression?

411

I dont see any other uses of istringstream except in a tool. Why not just use .split(" ")?

llvm/lib/CodeGen/MIRPrinter.cpp
650

s/Excetion/Exception

tmsriram added inline comments.Apr 1 2020, 10:52 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
561

I don't have a strong opinion but the fact that this is being computed for every call of isBeginSection and isEndSection makes me a bit nervous. Why do this when BBSectionsPrepare can just set a bool? can always be done later if there are many calls to these functions, just putting it out here.

rahmanl updated this revision to Diff 254665.Apr 2 2020, 6:14 PM

Addressed @tmsriram 's second round of comments.

Thanks a lot Sri.

rahmanl marked 5 inline comments as done.Apr 2 2020, 6:17 PM

Addressed @tmsriram 's second round of comments.

tmsriram accepted this revision.Apr 2 2020, 7:10 PM

Please do not submit without an approval from @efriedma as they approved the original patch.

This LGTM with these changes.

llvm/lib/CodeGen/BBSectionsPrepare.cpp
290–291

You can also add EHPadsSectionID != ExceptionSectionID

331–332

EHPadsSectionID == MachineBasicBlock::ExceptionSectionID should be moved outside the loop.

llvm/lib/CodeGen/MachineBasicBlock.cpp
552–553

This has become too simple, maybe just delete the function or move the body to the header file itself?

llvm/lib/CodeGen/MachineFunction.cpp
373

This function is called twice and I highly doubt this will be called more than that. Once in BBSectionsPrepare.cpp and other in MIRParser. Should it be a member? You can repeat it in both places I think, it is small.

Also, you could initialize isBeginSection and isEndSection to false and just handle the true case maybe.

This revision is now accepted and ready to land.Apr 2 2020, 7:10 PM
rahmanl updated this revision to Diff 254911.Apr 3 2020, 2:10 PM

Moved assignBeginEndSections to BBSectionsPrepare.cpp and MIRParser.cpp.

rahmanl marked 5 inline comments as done.Apr 3 2020, 2:12 PM

Handled all of @tmsriram 's 3rd round of comments.

llvm/lib/CodeGen/MachineBasicBlock.cpp
552–553

Moved the definition to the header file.

rahmanl edited the summary of this revision. (Show Details)Apr 3 2020, 2:17 PM

Hi Eli @efriedma , I took the liberty to review and approve this patch as it was mostly a change to BBSectionsPrepare.cpp. Appreciate if you could also take a look as you approved the original patch, thanks!

rahmanl marked an inline comment as done.Apr 7 2020, 4:30 PM

@efriedma Would you please take a look at this patch? Any feedback is appreciated.

I'm happy with this at a high level; it's not really any more complicated than BasicBlock sections itself, and the added flexibility makes sense.

From the AsmPrinter's perspective, there isn't any difference between having BBSections disabled, vs. having every BB in the function assigned to the section with the same ID. You shouldn't really need to explicitly check whether
BBSections are enabled from the AsmPrinter; all the checks that do remain are suspicious.

Also along those lines, does the SectionID need to be Optional<unsigned>, as opposed to just being unsigned?

llvm/include/llvm/CodeGen/MachineBasicBlock.h
490

"this == MBB" check is redundant, I think.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1180

I don't like the way this if statement is written. The fundamental MCBinaryExpr::createSub() operation is shared, but the two operations aren't really related anymore, so the logic here is sort of twisted. I'd suggest pulling the createTempSymbol()/MCBinaryExpr::createSub()/emitELFSize() into a helper (maybe a lambda), and calling it from two separate if statements.

llvm/lib/CodeGen/MIRParser/MIParser.cpp
630

"unknown"

llvm/lib/CodeGen/MIRParser/MIRParser.cpp
51

Why are there two copies of assignBeginEndSections ?

rahmanl updated this revision to Diff 256146.Apr 8 2020, 4:53 PM
rahmanl edited the summary of this revision. (Show Details)
rahmanl added a subscriber: eli.friedman.

The major change in this diff is removing all "MF->hasBBSections()" calls from AsmPrinter.cpp per @eli.friedman 's suggestion.
Other changes:

Refactoring the size directive emission for basic block symbols.
Merging the function getNamedSectionForMachineBasicBlock into getSectionForMachineBasicBlock.
Re-allowing the emission of basic block alignment for BBSections.
Removing the duplicate assignBeginEndSection from MIRParser and declaring it as an external function defined in BBSectionsPrepare.cpp.
rahmanl marked 5 inline comments as done.Apr 8 2020, 5:00 PM

Thanks a lot for your insightful comments @eli.friedman. I think removing the hasBBSections() calls from AsmPrinter.cpp has made the code a lot better.

llvm/lib/CodeGen/MIRParser/MIRParser.cpp
51

This function is defined to set IsBeginSection and IsEndSection which are derived from SectionID fields and the MBB order of the function. So we just need to iterate over all MBBs once and assign the fields.

I initially had defined it in MachineFunction.h and then decided to duplicate the code in the two places it is required as per @tmsriram .

I now made the MIRParser's version and external function. Do you have a better suggestion?

efriedma added inline comments.Apr 9 2020, 1:03 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1217

It looks like getEndMCSymbol is unused? (And therefore setEndMCSymbol/EntrySectionEndMBB are dead code.)

llvm/lib/CodeGen/MIRParser/MIRParser.cpp
51

Oh, hmm, didn't follow the original conversation. I don't really understand the objection from @tmsriram to making it a member function; 10 lines with tricky boundary conditions is generally beyond my threshold of "copy-paste it".

llvm/lib/CodeGen/MachineBasicBlock.cpp
556

getSectionEndMBB is dead?

tmsriram added inline comments.Apr 9 2020, 5:31 PM
llvm/lib/CodeGen/MIRParser/MIRParser.cpp
51

I agree, Eli's recommendation is correct.

rahmanl updated this revision to Diff 256680.Apr 10 2020, 3:08 PM
rahmanl marked an inline comment as done.

The major change with this is a new struct called MBBSectionID to represent the basic block section IDs instead of a plain integer.
This makes the code cleaner (especially for the upcoming CFI and DebugInfo patches).

rahmanl marked 6 inline comments as done.Apr 10 2020, 3:20 PM

Thanks again @efriedma and @tmsriram.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1210–1213

Please excuse the nested if statement here. Our upcoming CFI and Exception patches will add one statement to each of the nested if blocks,

efriedma accepted this revision.Apr 10 2020, 4:01 PM

LGTM with one minor suggestion

llvm/lib/CodeGen/MIRPrinter.cpp
646

Maybe just print all non-zero SectionIDs, instead of checking hasBBSections()?

Hi, it seems that you already got some feedback. I will look at it soon.

llvm/lib/CodeGen/BBSectionsPrepare.cpp
340

You can use

if (XSectionID != YSectionID)
  return ...
if (... != ...)
  return ...
return ...PositionInCluster < ...;
rahmanl updated this revision to Diff 257046.Apr 13 2020, 11:55 AM

This final update applies @MaskRay and @efriedma's last suggestions.
Thanks to all the reviewers.

rahmanl updated this revision to Diff 257061.Apr 13 2020, 12:18 PM

Re-uploading the diff since I missed test files in the latest update.

This revision was automatically updated to reflect the committed changes.