This is an archive of the discontinued LLVM Phabricator instance.

Propeller: LLVM support for basic block sections (Base Patch - Part 2)
ClosedPublic

Authored by tmsriram on Jan 29 2020, 4:57 PM.

Details

Summary

This is part 2 of the base patch for LLVM Support of basic block sections. Part 1 is here: https://reviews.llvm.org/D68063

  • This patch supports creating sections for all basic blocks or a subset of basic blocks. The subset of basic blocks via a clang option -fbasicblock-sections=<list_file> which contains a list of functions and the basic block ids for which sections are desired.

Please see https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf for a detailed discussion on basic block sections.

Labeling Basic Blocks :

Every basic block is labelled with a unique symbol as this allows easy mapping of virtual addresses from PMU profiles back to the corresponding basic blocks. Since the number of basic blocks is large, the labeling bloats the symbol table sizes and the string table sizes significantly. While the binary size does increase this does not affect performance as the symbol table is not loaded in memory during run-timea, the string table size bloat is kept very minimal using a unary naming scheme that uses string suffix compression. The basic blocks for function foo are named "a.bb.foo", "aa.bb.foo", . . . This turns out to be very good for string table sizes and the bloat in the string table size for a very large binary is only 8 %.

Diff Detail

Event Timeline

tmsriram created this revision.Jan 29 2020, 4:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 29 2020, 4:57 PM
tmsriram updated this revision to Diff 241328.Jan 29 2020, 5:14 PM

Add a test to make sure basic block sections are created.

Can you add more test cases to cover things like bb labels, and different bb section types (cold, EH, unique etc). For cold section types, the merging should kick in etc.

llvm/lib/CodeGen/MachineFunction.cpp
371

Fix comment. Missing the verb

421

It is not clear from the code below that the same BB begins and ends it for unique sections.

tmsriram updated this revision to Diff 241596.Jan 30 2020, 3:05 PM

Rename "BasicBlockSections" to "BBSections".

tmsriram marked an inline comment as done.Jan 30 2020, 3:07 PM
tmsriram added inline comments.
llvm/lib/CodeGen/MachineFunction.cpp
421

For unique sections, this comes already set above in line 388 and 389.

davidxl added inline comments.Jan 30 2020, 3:49 PM
llvm/lib/CodeGen/MachineFunction.cpp
421

ok. Perhaps add a comment or ASSERT here.

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
774

Add a comment on '0' parameter : 0 /*blah ..*/,

786

Is the logic reversed? Can you add a comment in the source?

807

Why there is no getUniqueSectioNames check here as in other cases?

(Not a reviewer, so just my armchair critic: this patch touches a lot of code but the test is tiny. I am concerned that many code paths may be untested.)

tmsriram updated this revision to Diff 241847.Jan 31 2020, 4:38 PM

New tests for basicblock-sections=labels and basicblock-sections=<file>

tmsriram updated this revision to Diff 241849.Jan 31 2020, 5:11 PM

One more test to check if sections can be created only for a subset of basic blocks within the function.

tmsriram updated this revision to Diff 242121.Feb 3 2020, 10:19 AM

One more test to make sure exception sections are created as expected.

tmsriram updated this revision to Diff 242133.Feb 3 2020, 11:13 AM
tmsriram marked 5 inline comments as done.

Code refactor and address comments.

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
786

It is not reversed. With unique section names, the section for function main would be named .text.main.cold and will be unique across this module as main is a unique name. Without unique section names, this would be named just .text.cold and all cold sections in this module would be grouped in one section.

So, the function name must be appended to section only if -funique-section-names is off. I have refactored this code and combined two functions into one.

807

Good catch, fixed.

Also missing a test case on cold basic blocks.

llvm/lib/CodeGen/MachineBasicBlock.cpp
74

Is this the best encoding scheme? I suppose in many cases, a single byte is enough? (e.g. total number of blocks in a func is small). Or this allows better string compression?

llvm/test/CodeGen/X86/basicblock-sections-eh.ll
84 ↗(On Diff #242133)

Make test more strict to test landing pad blocks are in the same section?

llvm/test/CodeGen/X86/basicblock-sections-listbb.ll
35 ↗(On Diff #242133)

how to tell if this is the id 2 section?

tmsriram marked 3 inline comments as done.Feb 4 2020, 10:00 AM
tmsriram added inline comments.
llvm/lib/CodeGen/MachineBasicBlock.cpp
74

This works great for ".strtab" size and we tried a few other simple encoding schemes like "1.BB", "2.BB", etc. While the simple encoding schemes are small, what we have is much smaller. The ".strtab" bloat even when turned on for all basic blocks is only 8%. The symtab bloat remains the same irrespective of the encoding. Please note that symtab and strtab are not ALLOCed and do not affect performance.

llvm/test/CodeGen/X86/basicblock-sections-eh.ll
84 ↗(On Diff #242133)

The character "l" denotes if this is a landing pad or not. Here there is only one. I can modify it to have more landing pads.

llvm/test/CodeGen/X86/basicblock-sections-listbb.ll
35 ↗(On Diff #242133)

The id is the count of the number of characters before ".BB". For instance, this is "r.BB._Z3bazb" and is the "id 1" basic block. That's why we explicitly check in the line before that this does not get a section. The entry block is "id 0'.

tmsriram updated this revision to Diff 242719.Feb 5 2020, 12:00 PM

Add another test to check for cold sections.

I'm afraid the way this patch is written, we're going to have to substantially rewrite it to port to other targets. In particular, we're assigning blocks to sections in the asmprinter, which is way too late for targets that do branch relaxation as an MIR pass.

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3030

I'm confused what MF->front().isExceptionSection() is supposed to check. The entry block can't be an exception handler, I think?

3042

I don't think the linker treats the ".cold" suffix specially. Maybe we should do something different? (There's special support in binutils ld for the prefix ".text.unlikely."; not sure how useful that would be in this context.)

llvm/lib/CodeGen/MachineFunction.cpp
362

Can this actually get called more than once?

Why do we care if the blocks are sorted? Does it help optimize the code somehow?

rahmanl added inline comments.Feb 11 2020, 3:03 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3030

Great catch. This block should be removed. We'll remove it in the next update.

I'm afraid the way this patch is written, we're going to have to substantially rewrite it to port to other targets. In particular, we're assigning blocks to sections in the asmprinter, which is way too late for targets that do branch relaxation as an MIR pass.

Hi Eli, What do you suggest we do here? An MIR pass that assigns sections to basic blocks?

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

Sections containing hot blocks can be grouped together, function splitting of cold basic blocks and placing all the cold blocks together and away from the hot blocks gives us better iTLB utilization.

I'm afraid the way this patch is written, we're going to have to substantially rewrite it to port to other targets. In particular, we're assigning blocks to sections in the asmprinter, which is way too late for targets that do branch relaxation as an MIR pass.

Hi Eli, What do you suggest we do here? An MIR pass that assigns sections to basic blocks?

Yes, that would make sense. Not completely sure where it should run, but we can adjust that later. Maybe we want it to happen before MachineBlockPlacement, so we can do section-aware block placement?

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

Sections containing hot blocks can be grouped together, function splitting of cold basic blocks and placing all the cold blocks together and away from the hot blocks gives us better iTLB utilization.

I meant, is there some reason to emit "cold" and "exception" sections separately from each other. I understand the benefit of separating both of them from the hot blocks.

I'm afraid the way this patch is written, we're going to have to substantially rewrite it to port to other targets. In particular, we're assigning blocks to sections in the asmprinter, which is way too late for targets that do branch relaxation as an MIR pass.

Hi Eli, What do you suggest we do here? An MIR pass that assigns sections to basic blocks?

Yes, that would make sense. Not completely sure where it should run, but we can adjust that later. Maybe we want it to happen before MachineBlockPlacement, so we can do section-aware block placement?

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

Sections containing hot blocks can be grouped together, function splitting of cold basic blocks and placing all the cold blocks together and away from the hot blocks gives us better iTLB utilization.

I meant, is there some reason to emit "cold" and "exception" sections separately from each other. I understand the benefit of separating both of them from the hot blocks.

Apologies, I misread the original question. Yes, in cases where exception blocks are hot, it does not help placing them along with cold blocks. One of the SPEC benchmarks had a performance improvement doing this.

Apologies, I misread the original question. Yes, in cases where exception blocks are hot, it does not help placing them along with cold blocks. One of the SPEC benchmarks had a performance improvement doing this.

Oh, okay. And I guess you can't emit multiple exception sections due to some arcane exception-handling stuff? (I remember you mentioned this at some point, but I don't remember the details.) Therefore the compromise is emitting them separately from the cold blocks? I guess that's reasonable.

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

Great point Eli!
As @tmsriram mentioned, in the case where hot eh pads are present, this allows us to reorder the exception section and place it in the hot part.
However, I modified the code such that if all eh pads are cold, we don't create a separate exception section and group all eh pads together with the cold section. The next patch update by @tmsriram will include this change. A test will also be included in D73739.

tmsriram updated this revision to Diff 246606.Feb 25 2020, 5:26 PM

Create a new pass and assign sections to Machine Basic Blocks in this pass.

Address reviewer comments, one more test.

I'm afraid the way this patch is written, we're going to have to substantially rewrite it to port to other targets. In particular, we're assigning blocks to sections in the asmprinter, which is way too late for targets that do branch relaxation as an MIR pass.

Hi Eli, What do you suggest we do here? An MIR pass that assigns sections to basic blocks?

Yes, that would make sense. Not completely sure where it should run, but we can adjust that later. Maybe we want it to happen before MachineBlockPlacement, so we can do section-aware block placement?

Created a new pass, BBSectionsPrepare.cpp that does all the work of assigning sections to MBBs. This removes all code from CodeGenPrepare.cpp and Function.h.

Is there some reason to emit "cold" and "exception" blocks into separate sections? Not sure what benefit you get from separating them.

Sections containing hot blocks can be grouped together, function splitting of cold basic blocks and placing all the cold blocks together and away from the hot blocks gives us better iTLB utilization.

I meant, is there some reason to emit "cold" and "exception" sections separately from each other. I understand the benefit of separating both of them from the hot blocks.

Apologies, I misread the original question. Yes, in cases where exception blocks are hot, it does not help placing them along with cold blocks. One of the SPEC benchmarks had a performance improvement doing this.

tmsriram marked 4 inline comments as done.Feb 25 2020, 5:32 PM
tmsriram added inline comments.
llvm/lib/CodeGen/MachineFunction.cpp
362

Having basic blocks in the same section together makes it really easy to update CFI and DebugInfo (like label differences). It also makes the .s more readable.

efriedma added inline comments.Feb 25 2020, 6:10 PM
llvm/include/llvm/CodeGen/MachineBasicBlock.h
133

I'd like to have MIR support for serializing/deserializing the SectionType, so we can test the BBSectionsPrepare pass separately from the passes that run after it and the asmprinter.

tmsriram marked an inline comment as done.Feb 25 2020, 9:28 PM
tmsriram added inline comments.
llvm/include/llvm/CodeGen/MachineBasicBlock.h
133
  • Function attribute is one way I suppose but if the sections need to be created only for a subset of basic blocks then we would need to emit metadata?
  • I thought of emitting a function attribute like "bbsections" with a string value which encodes the basic block ids that require sections.

Thoughts?

efriedma added inline comments.Feb 26 2020, 5:41 PM
llvm/include/llvm/CodeGen/MachineBasicBlock.h
133

We can print/parse data for MachineBasicBlocks directly. We do this for other bits of data stored on MachineBasicBlock. See MIPrinter::print(const MachineBasicBlock &MBB).

tmsriram updated this revision to Diff 247545.Mar 1 2020, 8:40 PM

Add support for serializing Basic Block Section Type.

  • Print machine basic block section type with MIR Printer
  • Parse Section Type from MIR
  • Add tests for -stop-after=bbsections-prepare and -start-after=bbsections-prepare to test bbsections-prepare pass.
efriedma accepted this revision.Mar 2 2020, 2:52 PM

LGTM with one minor comment.

llvm/lib/CodeGen/MIRParser/MIParser.cpp
638 ↗(On Diff #247545)

error()? This code is pretty clearly reachable.

This revision is now accepted and ready to land.Mar 2 2020, 2:52 PM
tmsriram updated this revision to Diff 248366.Mar 4 2020, 6:41 PM
tmsriram marked an inline comment as done.

Rebase, address reviewer comments.

  • Change llvm_unreachable to error in MIParser.cpp, make getMBBS return bool.
tmsriram updated this revision to Diff 249523.Mar 10 2020, 5:02 PM

Rebase

  • Change to use MemoryBuffer rather than string for bbsections file
  • Detailed comments in BBSectionsPrepare.cpp on basic block sections
This revision was automatically updated to reflect the committed changes.