This is an archive of the discontinued LLVM Phabricator instance.

Descriptive symbol names for machine basic block sections
ClosedPublic

Authored by snehasish on Apr 30 2020, 5:27 PM.

Details

Summary

Today symbol names generated for machine basic block sections use a unary encoding to reduce bloat. This is essential when every basic block in the binary is assigned a symbol however with basic block clusters (rG05192e585ce175b55f2a26b83b4ed7882785c8e6) when we only need to generate a few non-temporary symbols we can assign more descriptive names making them more user friendly. With this change -

  • Cold cluster section for function foo is named "foo.cold"
  • Exception cluster section for function foo is named "foo.eh"
  • Other cluster sections identified by their ids are named "foo.ID"

Using this format works well with existing tools. It will demangle as expected and works with existing symbolizers, profilers and debuggers out of the box.

$ c++filt _Z3foov.cold
foo() [clone .cold]

$ c++filt _Z3foov.eh
foo() [clone .eh]

$c++filt _Z3foov.1234
foo() [clone 1234]

Tests for basicblock-sections are updated with some cleanup where appropriate.

Diff Detail

Event Timeline

snehasish created this revision.Apr 30 2020, 5:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2020, 5:27 PM
tmsriram added inline comments.Apr 30 2020, 9:53 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
87

Just a comment: I like this because it will demangle as expected and works well with existing symbolizers, profilers and debuggers out of the box.

$ c++filt _Z3foov.cold
foo() [clone .cold]

$ c++filt _Z3foov.eh
foo() [clone .eh]

$c++filt _Z3foov.1234
foo() [clone 1234]

93

getParent() is just MF, so MF->getName()

snehasish updated this revision to Diff 261486.May 1 2020, 10:06 AM
snehasish edited the summary of this revision. (Show Details)
snehasish marked 3 inline comments as done.
snehasish added inline comments.
llvm/lib/CodeGen/MachineBasicBlock.cpp
87

Good point, added this to the differential description.

tmsriram accepted this revision.May 1 2020, 3:43 PM
tmsriram added a reviewer: efriedma.

LGTM, but please wait for Eli's comments before pushing.

This revision is now accepted and ready to land.May 1 2020, 3:44 PM

Some minor suggestions for the string manipulation; otherwise looks fine.

llvm/lib/CodeGen/MachineBasicBlock.cpp
83

We usually prefer something like Twine(Prefix) + ".BB." + MF->getName().

91

Not that it's likely to matter much, but Twine(SectionID.Number) is probably a little more efficient. (With std::string, the compiler has to work very hard to prove it doesn't need to allocate memory.)

Ctx.getOrCreateSymbol(MF->getName() + "." + Twine(SectionID.Number)) would also avoid the temporary string, but maybe that's not worth the readability cost.

97

You don't need to explicitly cast "_" to Twine.

snehasish updated this revision to Diff 261669.May 2 2020, 3:03 PM
snehasish marked 3 inline comments as done.

Updated string types based on reviewer comments.

snehasish marked an inline comment as done.May 2 2020, 3:07 PM

@efriedma Thanks for the pointers on LLVM string types. Updated the diff with changes, please take a look. Thanks!

llvm/lib/CodeGen/MachineBasicBlock.cpp
91

In this case SmallString operator+= accepts a StringRef so just Suffix += "." + Twine(SectionID.Number); does not compile. Rewriting this in a different way makes it less readable so I decided to leave it as it is.

This revision was automatically updated to reflect the committed changes.