This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Let --symbolize-operands symbolize basic block addresses based on the SHT_LLVM_BB_ADDR_MAP section.
ClosedPublic

Authored by rahmanl on Apr 27 2022, 2:42 PM.

Details

Summary

--symbolize-operands already symbolizes branch targets based on the disassembly. When the object file is created with -fbasic-block-sections=labels (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case llvm-objdump can annotate the disassembly based on labels inferred on this section.

In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks).

The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section.

Diff Detail

Event Timeline

rahmanl created this revision.Apr 27 2022, 2:42 PM
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
rahmanl updated this revision to Diff 425621.Apr 27 2022, 3:04 PM

Add a basic block to the test which is not branched into.

rahmanl edited the summary of this revision. (Show Details)Apr 27 2022, 3:04 PM
rahmanl edited the summary of this revision. (Show Details)
rahmanl published this revision for review.Apr 27 2022, 3:07 PM
rahmanl added subscribers: snehasish, mtrofin, shenhan and 2 others.
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 3:08 PM
rahmanl edited the summary of this revision. (Show Details)Apr 27 2022, 3:15 PM
mtrofin added inline comments.Apr 27 2022, 4:25 PM
llvm/lib/Object/ELFObjectFile.cpp
681

Coding style: BBAddrMap (i.e. first letter capital, no '_')

llvm/tools/llvm-objdump/llvm-objdump.cpp
1008

I (capital)

jhenderson added inline comments.Apr 27 2022, 11:10 PM
llvm/lib/Object/ELFObjectFile.cpp
684

Errors should use lower-case first letter (and no trailing full stop). Applies below too.

689

I'd remove the exclamation mark. It feels a little unprofessional to me, if I'm honest: it's not unreasonable for such section to exist apart from anything else.

llvm/tools/llvm-objdump/llvm-objdump.cpp
991

It seems like this should at least be a warning?

1332

The Programmer's manual says not to use unordererd_map, although I doubt its reasoning is necessarily important here. Would DenseMap be more appropriate anyway?

Thanks for the reviews. Just realized this isn't working when multiple BB-address-map sections are present. Will fix soon.

When the object file is created with -basic-block-labels it will include

Typo about -basic-block-labels? Better to use a Clang driver option. If that is unavailable, use an llc option.

rahmanl updated this revision to Diff 426866.May 3 2022, 3:23 PM
rahmanl marked 6 inline comments as done.
rahmanl edited the summary of this revision. (Show Details)
  • Handled the multi-section case.
  • Added unit tests for the ELFObjectFile::readBBAddrMap API.
rahmanl edited the summary of this revision. (Show Details)May 3 2022, 3:25 PM
rahmanl added inline comments.May 3 2022, 4:07 PM
llvm/tools/llvm-objdump/llvm-objdump.cpp
991

Reporting errors now.

1332

Also changed the preexisting unordered_map to a DenseMap.

I've run out of time to properly review this today, but some more nits to be getting on with.

llvm/lib/Object/ELFObjectFile.cpp
689

I'd find this more readable (i.e. I can tell more clearly that it is going to give you the right thing).

693–696
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands-multi-section.yaml
31

Too many blank lines

I don't know the disassembly code well enough to feel comfortable reviewing it or its corresponding testing, without sinking more time into this review than I have available, so someone else better look at those parts at least.

llvm/include/llvm/Object/ELFObjectFile.h
107
llvm/tools/llvm-objdump/llvm-objdump.cpp
1274
1277

I wonder if this should be a warning rather than an error: it's still possible to disassemble, just not with all the information.

1500
llvm/unittests/Object/ELFObjectFileTest.cpp
608

Do you have a test case where the TextSectionIndex is specified, but there are no matching sections?

I'd also expect to see checks that the retrieved entries match what was expected, but don't see any of those.

666
rahmanl updated this revision to Diff 427451.May 5 2022, 2:22 PM
rahmanl marked 8 inline comments as done.
  • Added more cases in the unit test.
llvm/unittests/Object/ELFObjectFileTest.cpp
608

Do you have a test case where the TextSectionIndex is specified, but there are no matching sections?

Would you please clarify?

I do have tests for when the requested TextSectionIndex doesn't match with sh_link of the bb-address-map sections. The call to readBBAddrMap would succeed and return an empty vector in such case -- indicating that no BB-address-map was found for that section. Do you suggest I return Error in this case?

Or do you mean a test for when the requested TextSectionIndex is not a valid text section?

I'd also expect to see checks that the retrieved entries match what was expected, but don't see any of those.

Sure. I added additional cases and also support for equality of BBAddrMaps.

rahmanl updated this revision to Diff 427457.May 5 2022, 2:34 PM
  • Update llvm-objdump.rst.
rahmanl updated this revision to Diff 427459.May 5 2022, 2:38 PM
  • Nicer BBAddrMap initialization in the unit test.
jhenderson added inline comments.May 5 2022, 11:32 PM
llvm/docs/CommandGuide/llvm-objdump.rst
229 ↗(On Diff #427459)
llvm/include/llvm/Object/ELFTypes.h
816

Did you consider using std::memcmp here instead of checking each individual field separately. The problem with the current code is that if you add a new field, it'll be easy to forget to add it in the comparison. On the other hand, std::memcmp would prevent adding non-POD fields, like std::vector etc, so may not be appropriate. Up to you.

llvm/unittests/Object/ELFObjectFileTest.cpp
608

I do have tests for when the requested TextSectionIndex doesn't match with sh_link of the bb-address-map sections. The call to readBBAddrMap would succeed and return an empty vector in such case -- indicating that no BB-address-map was found for that section. Do you suggest I return Error in this case?

No, what you've added in this latest version is what I wanted.

Aside: the test name is no longer really valid.

689
717

I think?

rahmanl updated this revision to Diff 427679.May 6 2022, 10:52 AM
rahmanl marked 6 inline comments as done.
  • nits.
llvm/include/llvm/Object/ELFTypes.h
816

Thanks for the suggestion. I would go for field-by-field comparison. I don't expect the operator to be used for anything other than unit-tests.

One remaining nit from my point of view, but as noted, I haven't attempted to review the disassembly aspects in depth.

llvm/docs/CommandGuide/llvm-objdump.rst
229 ↗(On Diff #427459)

Still missing "the" after the "i.e.".

rahmanl updated this revision to Diff 428125.May 9 2022, 10:18 AM
rahmanl marked an inline comment as done.
  • Last nit. Thanks for the review James.

@MaskRay Is this patch good in terms of disassembly?

MaskRay added a comment.EditedMay 12 2022, 11:52 AM

(I was out of town for like one week. Just came back.)

@MaskRay Is this patch good in terms of disassembly?

Looks good.


The subject may be clearer if you mention --symbolize-operands somewhere: this does not affect non --symbolize-operands functionality AFAICT.

Use one spelling for BB address map. If BB address map is canonical, avoid bb-address-map. It's also fine to refer to the concept with SHT_LLVM_BB_ADDR_MAP everywhere. (I don't think any object file format will pick up this soon)

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands-multi-section.yaml
2

Add a file-level comment stating the purpose of the test.

llvm/tools/llvm-objdump/llvm-objdump.cpp
997
1281

FunctionBBAddrMap is a const reference. std::move cannot save a copy.

Use for (BBAddrMap &FunctionBBAddrMap : ...

1447–1449

The original std::unordered_map<uint64_t, std::string> is better.

Inserting -1 and -2 (tombstone and empty) into a DenseMap<uint64_t, ...> leads to an assertion failure. The values are not realistic but we should guard against pathological cases.

int3 resigned from this revision.May 12 2022, 8:19 PM
rahmanl updated this revision to Diff 429297.May 13 2022, 11:05 AM
rahmanl marked 4 inline comments as done.

Addressed comments.

rahmanl retitled this revision from [llvm-objdump] Symbolize branch targets and basic block addresses based on the bb-address-map when present in the object file. to [llvm-objdump] Let `--symbolize-operands` symbolize basic block addresses based on the `SHT_LLVM_BB_ADDR_MAP` section..May 13 2022, 11:05 AM
rahmanl edited the summary of this revision. (Show Details)
rahmanl retitled this revision from [llvm-objdump] Let `--symbolize-operands` symbolize basic block addresses based on the `SHT_LLVM_BB_ADDR_MAP` section. to [llvm-objdump] Let --symbolize-operands symbolize basic block addresses based on the SHT_LLVM_BB_ADDR_MAP section..

Mostly looks good to me.

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands-multi-section.yaml
2

Can the test be merged into the other one?

llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands.yaml
2

Consider using wording like:
## Test that in the presence of SHT_LLVM_BB_ADDR_MAP, --symbolize-operands can display <BB.*> labels.

rahmanl updated this revision to Diff 429358.May 13 2022, 2:23 PM
rahmanl marked 2 inline comments as done.

Merged the two tests into one.

rahmanl updated this revision to Diff 429359.May 13 2022, 2:26 PM

minor changes.

MaskRay accepted this revision.May 13 2022, 2:26 PM

Looks great! But best to wait for @jhenderson

llvm/docs/CommandGuide/llvm-objdump.rst
229 ↗(On Diff #429358)

rst has different quoting rules than markdown.

This revision is now accepted and ready to land.May 13 2022, 2:26 PM
skan added inline comments.May 14 2022, 12:27 AM
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands.yaml
22

The memory operand of cmpl disappears here. Is it on purpose?

MaskRay added inline comments.May 14 2022, 12:34 AM
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands.yaml
22

--symbolize-operands only works with -M intel. The -M att support is quite broken but having the test helps we know the current state.

skan added inline comments.May 14 2022, 12:53 AM
llvm/test/tools/llvm-objdump/X86/elf-bbaddrmap-disassemble-symbolize-operands.yaml
22

I see. Thanks for the clarifying it.

llvm/tools/llvm-objdump/llvm-objdump.cpp
991

Do we need this clear?

jhenderson accepted this revision.May 15 2022, 11:46 PM

Looks great! But best to wait for @jhenderson

I've got no issues with the most recent changes.

rahmanl updated this revision to Diff 429752.May 16 2022, 9:53 AM
rahmanl marked 5 inline comments as done.

Fix llvm-objdump.rst.

This revision was landed with ongoing or failed builds.May 16 2022, 10:11 AM
This revision was automatically updated to reflect the committed changes.