Page MenuHomePhabricator

Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols.
ClosedPublic

Authored by rahmanl on Aug 6 2020, 12:51 AM.

Details

Summary

This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:

Address of the function-> 8 bytes (pointer size)
Number of basic blocks in this function (>0)-> ULEB128

The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:

Offset of the basic block relative to function begin-> ULEB128
Binary size of the basic block-> ULEB128
BB metadata-> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]

The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.

For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rahmanl published this revision for review.Aug 26 2020, 12:37 PM
rahmanl retitled this revision from Let -basic-block-sections=labels emit basicblock metadata in a new .bb_info section, instead of emitting special unary-encoded symbols. to Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols..
wenlei added a subscriber: hoy.Aug 26 2020, 12:40 PM
rahmanl updated this revision to Diff 288104.Aug 26 2020, 1:38 PM
rahmanl edited the summary of this revision. (Show Details)

Fix failing clang test which used to check for the old ".bb_info" name.

rahmanl updated this revision to Diff 288105.Aug 26 2020, 1:39 PM

Fix failing clang test which used to check for the old ".bb_info" name.

snehasish added inline comments.Aug 26 2020, 2:11 PM
clang/docs/UsersManual.rst
1706

nit: s/would include/includes/

1707

nit: s/relative to the/relative to the parent/

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

I think this method only relies on already public methods of MachineBasicBlock. It would be cleaner to move this from here to a static helper function in AsmPrinter.cpp. This way we don't add yet another method to the MachineBasicBlock class and keeps the relevant metadata generation logic close to where it is used.

Feel free to ignore if you plan on calling this method elsewhere apart from AsmPrinter.cpp but I can't think of a reason to do so.

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
3–4

I think this can be removed and the branch on L10 can directly use the param like so:

define void @_Z3bazb(i1 zeroext %0) personality i32 (...)* @__gxx_personality_v0 {
  br i1 %0, label %6, label %11
  ....
}

You will need to adjust the respective virtual register names since %2-%5 is no longer used.

rahmanl updated this revision to Diff 288158.Aug 26 2020, 7:08 PM
rahmanl marked 4 inline comments as done.

Address Snehasish's comments.

  • Move getBBAddrMetadata to AsmPrinter as a static function in AsmPrinter
  • Remove unnecessary IR instructions in basic-block-sections-labels.ll
rahmanl added inline comments.Aug 26 2020, 7:09 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
58

Thanks for the suggestion. It makes the code much more readable since I just place that function above emitBBAddrMapSection.
I can even make it a lambda function within emitBBAddrMapSection. WDYT?

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
3–4

Great suggestion. Thanks. This was copied over from other basic-block-sections tests, so maybe they can be updated too.

snehasish added inline comments.Aug 26 2020, 7:19 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1030

Did you intend to change the formatting of the comment? I think the list items were on separate lines.

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

I have a slight preference for the static method vs inline lambda since the comment describing the metadata can be placed above the static method.

rahmanl updated this revision to Diff 288694.Aug 28 2020, 1:30 PM
  • Move getBBAddrMetadata to AsmPrinter as a static function + nits.
rahmanl marked an inline comment as done.Aug 28 2020, 1:33 PM
rahmanl added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1030

My bad. Didn't fully check the lint result.

llvm/lib/CodeGen/MachineBasicBlock.cpp
58

Makes sense.

rahmanl edited reviewers, added: craig.topper; removed: tmsriram, shenhan.Aug 28 2020, 1:45 PM
rahmanl added a subscriber: shenhan.
snehasish accepted this revision.Aug 28 2020, 4:06 PM

LGTM. Please wait a bit for additional comments from others.

This revision is now accepted and ready to land.Aug 28 2020, 4:06 PM
MaskRay added inline comments.Aug 31 2020, 12:02 PM
clang/docs/UsersManual.rst
1703–1708

Use backquotes like below

clang/test/CodeGen/basic-block-sections.c
3–4

The triple change is unneeded. You can also just use -triple x86_64 because the behavior applies to generic ELF.

MaskRay added inline comments.Aug 31 2020, 12:06 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
78
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
34

Can you use a more specific pattern than .+? It is difficult to know which pattern it uses (a larger issue is that it matches spaces and possibly leverages a subtle property of FileCheck)

rahmanl updated this revision to Diff 289498.Sep 2 2020, 9:40 AM
rahmanl marked 5 inline comments as done.
  • Address @MaskRay's comments.
  • Rebase with upstream.

Thanks a lot for the comments @MaskRay.

MaskRay added a subscriber: rsmith.Sep 2 2020, 12:37 PM
MaskRay added inline comments.
clang/test/CodeGen/basic-block-sections.c
27–28

Assembly tests are not common in clang/test but @rsmith LGTMed D68049 so I think it should be fine. Now this file tests more machine level stuff. Should the file switch to LLVM IR?

MaskRay requested changes to this revision.EditedSep 2 2020, 12:50 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

This revision now requires changes to proceed.Sep 2 2020, 12:50 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

bb_addr_map is intentionally not loaded as it is never meant to be used at run-time. bb_addr_map will be later be used in conjunction with a hardware profile of the binary to associate executed basic block addresses to machine basic blocks. The hardware profile of the binary has enough info to determine the load base address with PIE, this is pretty straight forward and multiple strategies exist. What exactly is the concern here?

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

Regarding "not taking account of the load base (PIE/shared object)" - the profile conversion tool uses addresses in .bb_addr_map section and MMAP entries in perf.data file to construct runtime addresses, so these runtime addresses can be matched with the profile addresses recorded in perf.data file.

MaskRay added a comment.EditedSep 2 2020, 2:14 PM

I am still reading the patch, but I have noticed one thing worth discussing. .bb_addr_map is a non-SHF_ALLOC section (meaning that it is not part of the memory image). An absolute relocation type (.quad .Lfunc_begin0) works but the value is a link-time address, not taking account of the load base (PIE/shared object)). (If .bb_addr_map has the SHF_ALLOC flag, linkers will report a text relocation) How do you intend to use .bb_addr_map at runtime?

bb_addr_map is intentionally not loaded as it is never meant to be used at run-time. bb_addr_map will be later be used in conjunction with a hardware profile of the binary to associate executed basic block addresses to machine basic blocks. The hardware profile of the binary has enough info to determine the load base address with PIE, this is pretty straight forward and multiple strategies exist. What exactly is the concern here?

Thanks for the explanation! I guessed that non-SHF_ALLOC is intended but hoped that it can be confirmed (and you did!). This sections works similar to .stack_sizes emitted by clang -fstack-size-section.

This is important because for SHF_ALLOC, SHF_WRITE is also needed to avoid text relocations. I am still reading the patch to ensure that it will work as intended.

llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
22

Consider align the content. Since CHECK-LABEL: is the longest.

; CHECK:        .section .text._Z3foov,"ax",@progbits
; CHECK-LABEL:	_Z3foov:
; CHECK-NEXT:     [[FOO_BEGIN:.Lfunc_begin[0-9]+]]:

Non-labels have additional indentation to make it clear they are bodies covered by labels.

Thanks, I have followed the flow. The logic looks good. But I have got some nits and a question about the clang/test/ assembly test.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1033

1st 2nd 3rd may be confusing. You can just write: 1<<0, 1<<1, and 1<<2. Actually, you may just say: use 3 bits to encode information for more precise profile, (1) (2) (3).

The first two sentences can thus be simplified.

1047

BBAddrMapSection is always non-null. Delete the if.

1058

auto -> MachineBasicBlock

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
2

LINUX-LABELS is probably not a good CHECK prefix. The feature is generic and not specific to linux (i.e. you could use -mtriple=x86_64)

rahmanl updated this revision to Diff 289765.Sep 3 2020, 11:01 AM
rahmanl marked 5 inline comments as done.
  • Address +MaskRay's comments:
  • Change the check prefix simply to "CHECK" for basic-block-sections-labels.ll
  • Change the triple to x86_64 for this test.
  • nits.
  • Remove the "-LABEL" feature from basic-block-sections-labels.ll and use precise BB indices.
  • Remove the assertion in emitBBAddrMapSection.
rahmanl added inline comments.Sep 3 2020, 11:04 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

I believe we return null for non-ELF environment (Please refer to MCObjectFileInfo::getBBAddrMapSection).

MaskRay added inline comments.Sep 3 2020, 3:06 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

In that case emitBBAddrMapSection will not be called?

rahmanl updated this revision to Diff 290511.Sep 8 2020, 9:44 AM
rahmanl marked an inline comment as done.
  • Merge remote-tracking branch 'origin/master' into arcpatch-D85408

Thanks for the review @MaskRay. Is this ready to land now?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1047

Fair enough.

MaskRay accepted this revision.Sep 8 2020, 10:01 AM

I like the simplification:) However, I still feel puzzled by the large chunk of assembly test in clang/test/CodeGen/basic-block-sections.c. I hope another reviewer or the code owner can say that this is ok.

This revision is now accepted and ready to land.Sep 8 2020, 10:01 AM

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

I'll chime in, perhaps @efriedma will/can too: This patch has no changes to Clang's code, so it should have no changes to Clang's tests, generally speaking.

Whatever codepath/behavior is being tested by that Clang test change should be testable (& generally only tested) via LLVM's tools/tests, like llc, I would expect? (I guess that's what the basic-block-sections-labels.ll test is doing? Making the Clang test update redundant)

The only reason I'd expect to see the assembly tested in Clang is if the flag that enables this feature is an MCOptions (or other similar API level feature) member, rather than part of LLVM IR proper. In that case the only way to test that the flag has any effect (since we can't observe the MCOptions struct state directly in a test) is to test for the MCOptions effect on LLVM's output. Depending on the complexity, sometimes such tests are just skipped entirely (leaving the setting of the MCOption untested) - I think I did that for, for example, DWARF type units. Other times, an end-to-end test is used to ensure the flag is working. But that's all the test should do: test the flag is working, with some very basic/rudimentary property of the feature being enabled (so pretty much /any/ change to the way the feature works will still pass the test, but if the feature were turned off entirely (the MCOptions flag stopped being set correctly), the test would fail). Not test all the functionality of the feature that the flag enables - all that testing should be down in LLVM.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1058

FWIW, auto seems pretty fine to me here - looking only at for loops over MBB, a rough grep/count seems like there's certainly a lot of both (194 (non-auto) to 168 (auto)) and I think it qualifies as "obvious" for LLVM backend code that an MBB variable would be of the type MachineBasicBlock. (bonus points for obviousness that MF is a MachineFunction)

rahmanl updated this revision to Diff 291156.Sep 11 2020, 1:22 AM
  • Remove the complex assembly from the clang test.

@efriedma Would you please chime in specially with respect to @MaskRay 's comment about the clang test involving assembly?

I'll chime in, perhaps @efriedma will/can too: This patch has no changes to Clang's code, so it should have no changes to Clang's tests, generally speaking.

Whatever codepath/behavior is being tested by that Clang test change should be testable (& generally only tested) via LLVM's tools/tests, like llc, I would expect? (I guess that's what the basic-block-sections-labels.ll test is doing? Making the Clang test update redundant)

The only reason I'd expect to see the assembly tested in Clang is if the flag that enables this feature is an MCOptions (or other similar API level feature) member, rather than part of LLVM IR proper. In that case the only way to test that the flag has any effect (since we can't observe the MCOptions struct state directly in a test) is to test for the MCOptions effect on LLVM's output. Depending on the complexity, sometimes such tests are just skipped entirely (leaving the setting of the MCOption untested) - I think I did that for, for example, DWARF type units. Other times, an end-to-end test is used to ensure the flag is working. But that's all the test should do: test the flag is working, with some very basic/rudimentary property of the feature being enabled (so pretty much /any/ change to the way the feature works will still pass the test, but if the feature were turned off entirely (the MCOptions flag stopped being set correctly), the test would fail). Not test all the functionality of the feature that the flag enables - all that testing should be down in LLVM.

Thanks for the clear explanation @dblaikie. In light of your comment, I removed the complex assembly from the clang test, since the LLVM tests are already testing those.

MaskRay added inline comments.Sep 11 2020, 10:27 AM
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
2

You can drop -unknown-linux-gnu. This is a generic ELF feature.

llvm/test/CodeGen/X86/basic-block-sections-labels.ll
2

Delete -check-prefix=CHECK
It is the default

rahmanl updated this revision to Diff 291589.Sep 14 2020, 9:13 AM
  • Remove the "labels" part of the clang test as the functionality is tested on LLVM tests.
rahmanl updated this revision to Diff 291595.Sep 14 2020, 9:40 AM
  • Merge branch 'master' into arcpatch-D85408
This revision was landed with ongoing or failed builds.Sep 14 2020, 10:17 AM
This revision was automatically updated to reflect the committed changes.