This is an archive of the discontinued LLVM Phabricator instance.

Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols.
ClosedPublic

Authored by rahmanl on Aug 6 2020, 12:51 AM.

Details

Summary

This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:

Address of the function-> 8 bytes (pointer size)
Number of basic blocks in this function (>0)-> ULEB128

The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:

Offset of the basic block relative to function begin-> ULEB128
Binary size of the basic block-> ULEB128
BB metadata-> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]

The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.

For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html

Diff Detail

Event Timeline

rahmanl created this revision.Aug 6 2020, 12:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2020, 12:51 AM
rahmanl updated this revision to Diff 285826.Aug 15 2020, 2:55 AM

Modify clang test.

rahmanl updated this revision to Diff 285828.Aug 15 2020, 2:58 AM

Add clang test.

rahmanl updated this revision to Diff 285830.Aug 15 2020, 3:08 AM

Modify the clang test.

Herald added a project: Restricted Project. · View Herald TranscriptAug 15 2020, 3:08 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
rahmanl edited the summary of this revision. (Show Details)Aug 15 2020, 10:19 AM
rahmanl added reviewers: tmsriram, shenhan, efriedma, MaskRay.
rahmanl added a subscriber: amharc.
snehasish added inline comments.Aug 15 2020, 12:11 PM
llvm/include/llvm/CodeGen/MachineBasicBlock.h
481

Document the metadata format? AFAICT the only to find out now is reading the code which generates the metadata as unsigned.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1027

I have a slight preference to move this check to the callsite so that the reader does not have to follow it only to realize that their function of interest does not emit a bb info section. Though I see the convention in the code is to call it and then return early. Maybe documenting at the callsite is enough?

1318

Add a comment here.

llvm/lib/CodeGen/MachineBasicBlock.cpp
61

If this block is empty should we even be calling this function?

llvm/lib/MC/MCObjectFileInfo.cpp
958

Is this just returning the BBInfoSection pointer which is null? If so then prefer returning nullptr here to make it clear to the reader.

969

Can we call the section something other than bb_info? For example when examining the contents of a binary a name like "stack_sizes" provides some obvious meaning without further investigation. Similarly, can we call this something like "bb_addr_offset_map"?

llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
2

Can we make the triples that we use consistent across the various tests?

6

Lets align the contents of the check to make it easier to inspect.

16

Consider moving the CHECK lines into the functions themselves to make it clear which CHECKS correspond to which functions.

rahmanl updated this revision to Diff 285926.Aug 17 2020, 12:50 AM

Address Snehasish's comments.

rahmanl edited the summary of this revision. (Show Details)Aug 17 2020, 7:29 AM
rahmanl updated this revision to Diff 286044.Aug 17 2020, 9:17 AM
rahmanl marked 3 inline comments as done.
  • Address Snehasish's comments. clang-format.
rahmanl updated this revision to Diff 286046.Aug 17 2020, 9:20 AM
  • Address Snehasish's comments.
llvm/lib/CodeGen/MachineBasicBlock.cpp
61
rahmanl marked 6 inline comments as done.Aug 17 2020, 9:22 AM

Thanks for the great feedback @snehasishk.

llvm/lib/MC/MCObjectFileInfo.cpp
969

Great suggestion. I renamed the section to bb_addr_map and also changed the function names accordingly.

  • The summary of this patch must be enhanced a little bit more to capture the highlights.
  • Even though you point to the RFC, I think just add a little more.
  • I would add that this patch basically un-pollutes the symtab, that is the greatest contribution to me which helps profilers, debuggers plug and play

Independently, what did we decide about the option? Are we keeping the option?

You should also include changes to the clang user manual to reflect this, documentation around -fbasic-block-section=labels can go in the same patch

clang/test/CodeGen/basic-block-sections.c
31–91

We need a test where we disasm the native object file and show how the contents look like. This is good for readability of these sections.

llvm/include/llvm/CodeGen/MachineBasicBlock.h
481

+1 Let's document this heavily where possible:

  • Under what options is this section emitted?
  • What is captured here?
  • Maybe a quick note on why this is used, just one sentence should be fine.
486

Typo.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1027

Ya, I like moving the check to the call site. You can assert here instead and be done.

1049

Can you add a comment here that basic block size cannot always be derived from label offsets due to alignment? Isnt that the only reason, otherwise size is redundant right?

1225

I am surprised we are not already doing this :). This is needed for your patch right? Otherwise, this needs to go as a separate patch.

llvm/lib/CodeGen/MachineBasicBlock.cpp
73

I am happy this code is blown away! :)

llvm/lib/CodeGen/MachineFunction.cpp
343–344

Ditto! Happy to see this gone!

llvm/lib/MC/MCObjectFileInfo.cpp
957

Run clang-format on the patches.

rahmanl updated this revision to Diff 288058.Aug 26 2020, 11:49 AM
rahmanl marked an inline comment as done.

Address Sri's comments.

rahmanl updated this revision to Diff 288064.Aug 26 2020, 11:58 AM

Address Sri's comments.

rahmanl marked 7 inline comments as done.Aug 26 2020, 12:04 PM
rahmanl added inline comments.
clang/test/CodeGen/basic-block-sections.c
31–91

The test you're envisioning won't look great.
If we want to dis-assemble the object file, we end up just having binary contents for every .bb_addr_map section.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1225

What's happening here is that the lambda function emitELFSizeDirective is removed since we no longer have two use-cases (only sections).

llvm/lib/MC/MCObjectFileInfo.cpp
957

Apparently, this is the formatting chosen by clang-format.

rahmanl edited the summary of this revision. (Show Details)Aug 26 2020, 12:22 PM
rahmanl marked 3 inline comments as done.
rahmanl published this revision for review.Aug 26 2020, 12:37 PM
rahmanl retitled this revision from Let -basic-block-sections=labels emit basicblock metadata in a new .bb_info section, instead of emitting special unary-encoded symbols. to Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols..
wenlei added a subscriber: hoy.Aug 26 2020, 12:40 PM
rahmanl updated this revision to Diff 288104.Aug 26 2020, 1:38 PM
rahmanl edited the summary of this revision. (Show Details)

Fix failing clang test which used to check for the old ".bb_info" name.

rahmanl updated this revision to Diff 288105.Aug 26 2020, 1:39 PM

Fix failing clang test which used to check for the old ".bb_info" name.

snehasish added inline comments.Aug 26 2020, 2:11 PM
clang/docs/UsersManual.rst
1706 ↗(On Diff #288105)

nit: s/would include/includes/

1707 ↗(On Diff #288105)

nit: s/relative to the/relative to the parent/

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

I think this method only relies on already public methods of MachineBasicBlock. It would be cleaner to move this from here to a static helper function in AsmPrinter.cpp. This way we don't add yet another method to the MachineBasicBlock class and keeps the relevant metadata generation logic close to where it is used.

Feel free to ignore if you plan on calling this method elsewhere apart from AsmPrinter.cpp but I can't think of a reason to do so.

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
5–9

I think this can be removed and the branch on L10 can directly use the param like so:

define void @_Z3bazb(i1 zeroext %0) personality i32 (...)* @__gxx_personality_v0 {
  br i1 %0, label %6, label %11
  ....
}

You will need to adjust the respective virtual register names since %2-%5 is no longer used.

rahmanl updated this revision to Diff 288158.Aug 26 2020, 7:08 PM
rahmanl marked 4 inline comments as done.

Address Snehasish's comments.

  • Move getBBAddrMetadata to AsmPrinter as a static function in AsmPrinter
  • Remove unnecessary IR instructions in basic-block-sections-labels.ll
rahmanl added inline comments.Aug 26 2020, 7:09 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
58

Thanks for the suggestion. It makes the code much more readable since I just place that function above emitBBAddrMapSection.
I can even make it a lambda function within emitBBAddrMapSection. WDYT?

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
5–9

Great suggestion. Thanks. This was copied over from other basic-block-sections tests, so maybe they can be updated too.

snehasish added inline comments.Aug 26 2020, 7:19 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1030

Did you intend to change the formatting of the comment? I think the list items were on separate lines.

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

I have a slight preference for the static method vs inline lambda since the comment describing the metadata can be placed above the static method.

rahmanl updated this revision to Diff 288694.Aug 28 2020, 1:30 PM
  • Move getBBAddrMetadata to AsmPrinter as a static function + nits.
rahmanl marked an inline comment as done.Aug 28 2020, 1:33 PM
rahmanl added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1030

My bad. Didn't fully check the lint result.

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

Makes sense.

rahmanl edited reviewers, added: craig.topper; removed: tmsriram, shenhan.Aug 28 2020, 1:45 PM
rahmanl added a subscriber: shenhan.
snehasish accepted this revision.Aug 28 2020, 4:06 PM

LGTM. Please wait a bit for additional comments from others.

This revision is now accepted and ready to land.Aug 28 2020, 4:06 PM
MaskRay added inline comments.Aug 31 2020, 12:02 PM
clang/docs/UsersManual.rst
1703 ↗(On Diff #288694)

Use backquotes like below

clang/test/CodeGen/basic-block-sections.c
3–4

The triple change is unneeded. You can also just use -triple x86_64 because the behavior applies to generic ELF.

MaskRay added inline comments.Aug 31 2020, 12:06 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
84
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
34

Can you use a more specific pattern than .+? It is difficult to know which pattern it uses (a larger issue is that it matches spaces and possibly leverages a subtle property of FileCheck)

rahmanl updated this revision to Diff 289498.Sep 2 2020, 9:40 AM
rahmanl marked 5 inline comments as done.
  • Address @MaskRay's comments.
  • Rebase with upstream.

Thanks a lot for the comments @MaskRay.

MaskRay added a subscriber: rsmith.Sep 2 2020, 12:37 PM
MaskRay added inline comments.
clang/test/CodeGen/basic-block-sections.c
31–90

Assembly tests are not common in clang/test but @rsmith LGTMed D68049 so I think it should be fine. Now this file tests more machine level stuff. Should the file switch to LLVM IR?

MaskRay requested changes to this revision.EditedSep 2 2020, 12:50 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

This revision now requires changes to proceed.Sep 2 2020, 12:50 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

bb_addr_map is intentionally not loaded as it is never meant to be used at run-time. bb_addr_map will be later be used in conjunction with a hardware profile of the binary to associate executed basic block addresses to machine basic blocks. The hardware profile of the binary has enough info to determine the load base address with PIE, this is pretty straight forward and multiple strategies exist. What exactly is the concern here?

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

Regarding "not taking account of the load base (PIE/shared object)" - the profile conversion tool uses addresses in .bb_addr_map section and MMAP entries in perf.data file to construct runtime addresses, so these runtime addresses can be matched with the profile addresses recorded in perf.data file.

MaskRay added a comment.EditedSep 2 2020, 2:14 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

bb_addr_map is intentionally not loaded as it is never meant to be used at run-time. bb_addr_map will be later be used in conjunction with a hardware profile of the binary to associate executed basic block addresses to machine basic blocks. The hardware profile of the binary has enough info to determine the load base address with PIE, this is pretty straight forward and multiple strategies exist. What exactly is the concern here?

Thanks for the explanation! I guessed that non-SHF_ALLOC is intended but hoped that it can be confirmed (and you did!). This sections works similar to .stack_sizes emitted by clang -fstack-size-section.

This is important because for SHF_ALLOC, SHF_WRITE is also needed to avoid text relocations. I am still reading the patch to ensure that it will work as intended.

llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
22

Consider align the content. Since CHECK-LABEL: is the longest.

; CHECK:        .section .text._Z3foov,"ax",@progbits
; CHECK-LABEL:	_Z3foov:
; CHECK-NEXT:     [[FOO_BEGIN:.Lfunc_begin[0-9]+]]:

Non-labels have additional indentation to make it clear they are bodies covered by labels.

Thanks, I have followed the flow. The logic looks good. But I have got some nits and a question about the clang/test/ assembly test.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1033

1st 2nd 3rd may be confusing. You can just write: 1<<0, 1<<1, and 1<<2. Actually, you may just say: use 3 bits to encode information for more precise profile, (1) (2) (3).

The first two sentences can thus be simplified.

1047

BBAddrMapSection is always non-null. Delete the if.

1058

auto -> MachineBasicBlock

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
2

LINUX-LABELS is probably not a good CHECK prefix. The feature is generic and not specific to linux (i.e. you could use -mtriple=x86_64)

rahmanl updated this revision to Diff 289765.Sep 3 2020, 11:01 AM
rahmanl marked 5 inline comments as done.
  • Address +MaskRay's comments:
  • Change the check prefix simply to "CHECK" for basic-block-sections-labels.ll
  • Change the triple to x86_64 for this test.
  • nits.
  • Remove the "-LABEL" feature from basic-block-sections-labels.ll and use precise BB indices.
  • Remove the assertion in emitBBAddrMapSection.
rahmanl added inline comments.Sep 3 2020, 11:04 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

I believe we return null for non-ELF environment (Please refer to MCObjectFileInfo::getBBAddrMapSection).

MaskRay added inline comments.Sep 3 2020, 3:06 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

In that case emitBBAddrMapSection will not be called?

rahmanl updated this revision to Diff 290511.Sep 8 2020, 9:44 AM
rahmanl marked an inline comment as done.
  • Merge remote-tracking branch 'origin/master' into arcpatch-D85408

Thanks for the review @MaskRay. Is this ready to land now?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

Fair enough.

MaskRay accepted this revision.Sep 8 2020, 10:01 AM

I like the simplification:) However, I still feel puzzled by the large chunk of assembly test in clang/test/CodeGen/basic-block-sections.c. I hope another reviewer or the code owner can say that this is ok.

This revision is now accepted and ready to land.Sep 8 2020, 10:01 AM

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

I'll chime in, perhaps @efriedma will/can too: This patch has no changes to Clang's code, so it should have no changes to Clang's tests, generally speaking.

Whatever codepath/behavior is being tested by that Clang test change should be testable (& generally only tested) via LLVM's tools/tests, like llc, I would expect? (I guess that's what the basic-block-sections-labels.ll test is doing? Making the Clang test update redundant)

The only reason I'd expect to see the assembly tested in Clang is if the flag that enables this feature is an MCOptions (or other similar API level feature) member, rather than part of LLVM IR proper. In that case the only way to test that the flag has any effect (since we can't observe the MCOptions struct state directly in a test) is to test for the MCOptions effect on LLVM's output. Depending on the complexity, sometimes such tests are just skipped entirely (leaving the setting of the MCOption untested) - I think I did that for, for example, DWARF type units. Other times, an end-to-end test is used to ensure the flag is working. But that's all the test should do: test the flag is working, with some very basic/rudimentary property of the feature being enabled (so pretty much /any/ change to the way the feature works will still pass the test, but if the feature were turned off entirely (the MCOptions flag stopped being set correctly), the test would fail). Not test all the functionality of the feature that the flag enables - all that testing should be down in LLVM.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1058

FWIW, auto seems pretty fine to me here - looking only at for loops over MBB, a rough grep/count seems like there's certainly a lot of both (194 (non-auto) to 168 (auto)) and I think it qualifies as "obvious" for LLVM backend code that an MBB variable would be of the type MachineBasicBlock. (bonus points for obviousness that MF is a MachineFunction)

rahmanl updated this revision to Diff 291156.Sep 11 2020, 1:22 AM
  • Remove the complex assembly from the clang test.

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

I'll chime in, perhaps @efriedma will/can too: This patch has no changes to Clang's code, so it should have no changes to Clang's tests, generally speaking.

Whatever codepath/behavior is being tested by that Clang test change should be testable (& generally only tested) via LLVM's tools/tests, like llc, I would expect? (I guess that's what the basic-block-sections-labels.ll test is doing? Making the Clang test update redundant)

The only reason I'd expect to see the assembly tested in Clang is if the flag that enables this feature is an MCOptions (or other similar API level feature) member, rather than part of LLVM IR proper. In that case the only way to test that the flag has any effect (since we can't observe the MCOptions struct state directly in a test) is to test for the MCOptions effect on LLVM's output. Depending on the complexity, sometimes such tests are just skipped entirely (leaving the setting of the MCOption untested) - I think I did that for, for example, DWARF type units. Other times, an end-to-end test is used to ensure the flag is working. But that's all the test should do: test the flag is working, with some very basic/rudimentary property of the feature being enabled (so pretty much /any/ change to the way the feature works will still pass the test, but if the feature were turned off entirely (the MCOptions flag stopped being set correctly), the test would fail). Not test all the functionality of the feature that the flag enables - all that testing should be down in LLVM.

Thanks for the clear explanation @dblaikie. In light of your comment, I removed the complex assembly from the clang test, since the LLVM tests are already testing those.

MaskRay added inline comments.Sep 11 2020, 10:27 AM
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
2

You can drop -unknown-linux-gnu. This is a generic ELF feature.

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
2

Delete -check-prefix=CHECK
It is the default

rahmanl updated this revision to Diff 291589.Sep 14 2020, 9:13 AM
  • Remove the "labels" part of the clang test as the functionality is tested on LLVM tests.
rahmanl updated this revision to Diff 291595.Sep 14 2020, 9:40 AM
  • Merge branch 'master' into arcpatch-D85408
This revision was landed with ongoing or failed builds.Sep 14 2020, 10:17 AM
This revision was automatically updated to reflect the committed changes.