This is an archive of the discontinued LLVM Phabricator instance.

Differential D121346

[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
ClosedPublic

Authored by rahmanl on Mar 9 2022, 4:27 PM.

Details

Reviewers

amharc
tmsriram
jhenderson
MaskRay

Commits

rG0aa6df65756d: [Propeller] Encode address offsets of basic blocks relative to the end of the…

Summary

This is a resurrection of D106421 with the change that it keeps backward compatibility: decoding the previous version of LLVM_BB_ADDR_MAP will still work. This is required because the profile mapping tool (AutoFDO) is not released with LLVM. As suggested by @jhenderson, we rename the original section type value to SHT_LLVM_BB_ADDR_MAP_V0 and assign a new value to the SHT_LLVM_BB_ADDR_MAP section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to allow more flexibility in the future. One example use case for the feature field is encoding multi-section functions more concisely using a different format.

Conceptually, the new encoding emits basic block offsets and sizes as label differences between consecutive basic block begin and end labels: each block's offset is its begin label minus the previous block's end label, and its size is its end label minus its begin label. When decoding, these offsets must be accumulated together with the basic block sizes to compute the final offset of each basic block relative to the function address.
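The decoding step described above can be sketched as follows. This is a minimal standalone illustration, not the code in this patch: the `EncodedBlock` and `DecodedBlock` types and the `decodeBlocks` function are hypothetical names, and the values are assumed to have already been read out of their ULEB128 form.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record types for illustration; not LLVM's actual structs.
struct EncodedBlock {
  uint64_t OffsetFromPrevEnd; // begin of this block minus end of previous block
  uint64_t Size;              // end of this block minus its begin
};

struct DecodedBlock {
  uint64_t Offset; // begin of the block relative to the function address
  uint64_t Size;
};

// Accumulate the deltas: each block begins where the previous one ended,
// plus whatever padding is recorded in OffsetFromPrevEnd.
std::vector<DecodedBlock>
decodeBlocks(const std::vector<EncodedBlock> &Encoded) {
  std::vector<DecodedBlock> Decoded;
  uint64_t PrevEnd = 0; // the function address acts as the first "previous end"
  for (const EncodedBlock &E : Encoded) {
    uint64_t Begin = PrevEnd + E.OffsetFromPrevEnd;
    Decoded.push_back({Begin, E.Size});
    PrevEnd = Begin + E.Size;
  }
  return Decoded;
}
```

With the input `{0, 10}, {6, 20}, {0, 4}`, the second block has 6 bytes of padding after the first, so the recovered function-relative offsets are 0, 16 and 36.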

This encoding uses smaller values than the existing one (offsets relative to the function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) add a small overhead of about 3% to the LLVM_BB_ADDR_MAP section size.
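To see why smaller values help, recall that ULEB128 stores 7 payload bits per byte. The helper below is a standalone sketch written for this illustration (the name `uleb128Size` is hypothetical, and it is not the helper LLVM itself uses); it counts the bytes a value would occupy.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes a value occupies in ULEB128: 7 payload bits per byte,
// with at least one byte even for zero.
std::size_t uleb128Size(uint64_t Value) {
  std::size_t Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}
```

Under this encoding, a block that begins 16384 bytes into its function needs 3 bytes for a function-relative offset, whereas its offset from the end of the previous block is usually zero or a few bytes of padding and fits in 1 byte.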

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	61,810 ms	x64 debian > Clang.Driver::fsanitize.c
	60,290 ms	x64 debian > Clang.OpenMP::target_teams_distribute_parallel_for_simd_codegen_registration.cpp
	61,640 ms	x64 debian > Clang.OpenMP::target_update_codegen.cpp
	60,220 ms	x64 debian > LLVM.CodeGen/NVPTX::wmma.py

Event Timeline

rahmanl created this revision. · Mar 9 2022, 4:27 PM

Herald added a project: Restricted Project. · Mar 9 2022, 4:27 PM

Herald added subscribers: wenlei, pengfei, rupprecht and 2 others.

Harbormaster completed remote builds in B153460: Diff 414239. · Mar 9 2022, 5:28 PM

rahmanl retitled this revision from "Encode address offsets of basic blocks relative to the end of the previous basic blocks." to "[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks." · Mar 9 2022, 11:08 PM

rahmanl edited the summary of this revision.

Cleanup.

rahmanl published this revision for review. · Mar 10 2022, 12:04 AM

Herald added a project: Restricted Project. · Mar 10 2022, 12:04 AM

Herald added subscribers: llvm-commits, MaskRay.

Cleanup unrelated changes.

Harbormaster completed remote builds in B153494: Diff 414287. · Mar 10 2022, 1:12 AM

rahmanl edited reviewers, added: tmsriram; removed: shenhan, jhenderson. · Apr 11 2022, 4:00 PM

Herald added a reviewer: jhenderson. · Apr 11 2022, 4:00 PM

I think that Extensions.rst should be updated as well, including a description of the versioning scheme, the fact that the section name is now a semantically significant property, and a description of both the v0 and v1 formats.

Update Extensions.rst.

Remove empty lines.

Harbormaster completed remote builds in B161861: Diff 425872. · Apr 28 2022, 2:12 PM

Add unit-tests for checking invalid bb-address-map version suffixes.

In D121346#3476988, @amharc wrote:

I think that Extensions.rst should be updated as well, including a description of the versioning scheme, the fact that the section name is now a semantically significant property, and a description of both the v0 and v1 formats.

Thanks for the note. Updated Extensions.rst.

Harbormaster completed remote builds in B161875: Diff 425903. · Apr 28 2022, 4:22 PM

Would you mind taking a look @jhenderson?

Rebase.

Rebase again.

Harbormaster completed remote builds in B164758: Diff 429861. · May 16 2022, 3:50 PM

Has the spec for this been finalised anywhere? My main concern is the use of section names to have semantic importance. ELF generally tries to avoid this, hence the use of section types, and it would be a shame to introduce this approach when there are other options. It would be far preferable to include the version number in the section data somewhere, a bit like how most DWARF sections are encoded. I can think of one other possible way of doing this: change the section type value for version 1 and upwards, and rename the original value to something like SHT_LLVM_BB_ADDR_MAP_V0. Add the version field as the first N bytes (2 or 4 probably) of the new section type. Parsers that only understand the old data structure won't recognise the new section type. This is good because it doesn't mislead people by printing incorrect offsets (in addition to not needing to rely on the section name).

llvm/docs/Extensions.rst
404–413
431–432	Do we need a note saying that v0 BB Addr maps may not have the version suffix in the section name?
llvm/lib/Object/ELF.cpp
640	`int` seems like an odd type for `Version`. It probably should be some unsigned type?

jhenderson added inline comments. · May 17 2022, 1:27 AM

llvm/lib/Object/ELF.cpp
643–647	Coding standards say to use lower-case for first letter of error messages.
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
75	For V1 output, I feel like it would be useful to have both the raw offset and the calculated offset printed. I'm not sure exactly what would be the best way of doing that though.

In D121346#3518196, @jhenderson wrote:

Has the spec for this been finalised anywhere? My main concern is the use of section names to have semantic importance. ELF generally tries to avoid this, hence the use of section types, and it would be a shame to introduce this approach when there are other options. It would be far preferable to include the version number in the section data somewhere, a bit like how most DWARF sections are encoded. I can think of one other possible way of doing this: change the section type value for version 1 and upwards, and rename the original value to something like SHT_LLVM_BB_ADDR_MAP_V0. Add the version field as the first N bytes (2 or 4 probably) of the new section type. Parsers that only understand the old data structure won't recognise the new section type. This is good because it doesn't mislead people by printing incorrect offsets (in addition to not needing to rely on the section name).

Thanks for the review. And apologies for my delayed followup.
We did consider several ideas for storing the version.

1. Store it for each function: wasteful for object-file size.
2. Store it once in a different COMDAT section of the object file and have the linker merge them all: would not work for mixing different versions.
3. Store it outside the section as a weak symbol (similar to 2).
4. Store it inside the section metadata: for example, we can suffix the section name with the version name and then have the LLVM_BB_ADDR_MAP reading code read sections of different versions.

We chose idea 4 mostly for convenience reasons.

IIUC, your suggestion is to embed the version in the section data. The problem with this approach is that the linker must read and deduplicate the version data when linking the sections (unless we store the version for each function separately).
Also, if we compile with different compiler versions, the linker must create multiple LLVM_BB_ADDR_MAP sections if multiple versions exist.
For these reasons I am a bit hesitant to add linker dependency to the feature, even though section-name-independence is great to have. I'd be happy to change course if we can avoid involving the linker. Any thoughts?

Herald added a subscriber: jsji. · Jun 10 2022, 2:37 PM

In D121346#3574742, @rahmanl wrote:

IIUC, your suggestion is to embed the version in the section data. The problem with this approach is that the linker must read and deduplicate the version data when linking the sections (unless we store the version for each function separately).
Also, if we compile with different compiler versions, the linker must create multiple LLVM_BB_ADDR_MAP sections if multiple versions exist.
For these reasons I am a bit hesitant to add linker dependency to the feature, even though section-name-independence is great to have. I'd be happy to change course if we can avoid involving the linker. Any thoughts?

One thing to consider is how DWARF debug sections are designed. Most DWARF sections have a format that is something akin to the following:

header consisting of:
  unit length - 32 or 64-bit number indicating the size of this input section
  version - uint16_t for the section's version
  other metadata as appropriate for the section type
actual section payload

The linker concatenates these together into a single output section. Consumers iterate over a section by inspecting the first header, using that to parse the immediately following payload and then, if the unit length shows that the end of the section has not been reached, parsing the next header, and so on. In your case, you could have a single "header" (which might just consist of a length and version), followed by many functions that conform to that header. Consumers would just have to know how to iterate over them and then, if there are multiple versions, handle the corresponding payload accordingly. The linker would just concatenate them together.
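The iteration scheme described above can be sketched as follows. This is only an illustration of the DWARF-style idea, not a format that this patch adopts: it assumes a hypothetical header made of a 4-byte little-endian unit length (counting the bytes that follow the length field) and a 2-byte version, and the names `Unit`, `readLE` and `parseUnits` are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One header + payload unit found in the concatenated section.
struct Unit {
  uint16_t Version;
  std::size_t PayloadOffset; // offset of the payload within the section
  std::size_t PayloadSize;
};

// Read a little-endian integer of NumBytes bytes starting at Data[Pos].
static uint64_t readLE(const std::vector<uint8_t> &Data, std::size_t Pos,
                       std::size_t NumBytes) {
  uint64_t V = 0;
  for (std::size_t I = 0; I < NumBytes; ++I)
    V |= static_cast<uint64_t>(Data[Pos + I]) << (8 * I);
  return V;
}

// Walk the concatenated units. Returns false if a header is truncated or a
// unit length runs past the end of the section.
bool parseUnits(const std::vector<uint8_t> &Section, std::vector<Unit> &Units) {
  std::size_t Pos = 0;
  while (Pos < Section.size()) {
    if (Section.size() - Pos < 6) // 4-byte length + 2-byte version
      return false;
    uint64_t Length = readLE(Section, Pos, 4);
    if (Length < 2 || Length > Section.size() - Pos - 4)
      return false;
    uint16_t Version = static_cast<uint16_t>(readLE(Section, Pos + 4, 2));
    Units.push_back({Version, Pos + 6, static_cast<std::size_t>(Length - 2)});
    Pos += 4 + static_cast<std::size_t>(Length); // jump to the next header
  }
  return true;
}
```

Because each unit carries its own length and version, a consumer can handle units of different versions in the same output section, and the linker only needs to concatenate the inputs.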

In D121346#3577252, @jhenderson wrote:
In D121346#3574742, @rahmanl wrote:

IIUC, your suggestion is to embed the version in the section data. The problem with this approach is that the linker must read and deduplicate the version data when linking the sections (unless we store the version for each function separately).
Also, if we compile with different compiler versions, the linker must create multiple LLVM_BB_ADDR_MAP sections if multiple versions exist.
For these reasons I am a bit hesitant to add linker dependency to the feature, even though section-name-independence is great to have. I'd be happy to change course if we can avoid involving the linker. Any thoughts?

One thing to consider is how DWARF debug sections are designed. Most DWARF sections have a format that is something akin to the following:
header consisting of:
  unit length - 32 or 64-bit number indicating the size of this input section
  version - uint16_t for the section's version
  other metadata as appropriate for the section type
actual section payload
The linker concatenates these together into a single output section. Consumers iterate over a section by inspecting the first header, using that to parse the immediately following payload and then, if the unit length shows that the end of the section has not been reached, parsing the next header, and so on. In your case, you could have a single "header" (which might just consist of a length and version), followed by many functions that conform to that header. Consumers would just have to know how to iterate over them and then, if there are multiple versions, handle the corresponding payload accordingly. The linker would just concatenate them together.

Thanks for the explanation. If we use -function-sections it also means that we'll generate a unique LLVM_BB_ADDR_MAP per function. In this case, I believe the version data will be repeated for every function. Correct? I think we can live with that for now. It's only one or two bytes per function.

In D121346#3580806, @rahmanl wrote:

Thanks for the explanation. If we use -function-sections it also means that we'll generate a unique LLVM_BB_ADDR_MAP per function. In this case, I believe the version data will be repeated for every function. Correct? I think we can live with that for now. It's only one or two bytes per function.

Yes, that's what I'd expect. (It's worth noting that -function-sections imposes other overheads like the ELF section header, so a couple of bytes is comparatively small).

In D121346#3580925, @jhenderson wrote:

In D121346#3580806, @rahmanl wrote:

Thanks for the explanation. If we use -function-sections it also means that we'll generate a unique LLVM_BB_ADDR_MAP per function. In this case, I believe the version data will be repeated for every function. Correct? I think we can live with that for now. It's only one or two bytes per function.

Yes, that's what I'd expect. (It's worth noting that -function-sections imposes other overheads like the ELF section header, so a couple of bytes is comparatively small).

Correct, but with the difference that the ELF section header won't be repeated many times in the final linked section, but the version number will.

Encode the version number as a field of each function's LLVM_BB_ADDR_MAP entry instead of section names.
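A reader for the per-function version and feature bytes might look like the sketch below. It assumes, for illustration only, that the two bytes lead each function entry and that 1 is the highest supported version; `EntryHeader`, `MaxSupportedVersion` and `readEntryHeader` are hypothetical names rather than LLVM's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical view of the two leading bytes of a function entry.
struct EntryHeader {
  uint8_t Version;
  uint8_t Feature;
};

// Assumed for this sketch: versions 0 and 1 are understood.
constexpr uint8_t MaxSupportedVersion = 1;

// Read the version and feature bytes at Pos and advance Pos past them.
// Returns false and sets Error if the data is truncated or the version is
// newer than this reader understands.
bool readEntryHeader(const std::vector<uint8_t> &Data, std::size_t &Pos,
                     EntryHeader &Header, std::string &Error) {
  if (Pos > Data.size() || Data.size() - Pos < 2) {
    Error = "truncated entry header";
    return false;
  }
  uint8_t Version = Data[Pos];
  if (Version > MaxSupportedVersion) {
    Error = "unsupported SHT_LLVM_BB_ADDR_MAP version: " +
            std::to_string(static_cast<unsigned>(Version));
    return false;
  }
  Header = {Version, Data[Pos + 1]};
  Pos += 2;
  return true;
}
```

Rejecting unknown versions up front means a reader never decodes an entry whose layout it does not understand, and because the version travels with each function entry, entries of different versions can coexist in one linked section.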

rahmanl edited the summary of this revision. · Jun 16 2022, 1:24 AM

Harbormaster completed remote builds in B170202: Diff 437461. · Jun 16 2022, 1:27 AM

rahmanl edited the summary of this revision. · Jun 16 2022, 11:22 AM

Fix tests.

rahmanl added inline comments. · Jun 16 2022, 2:10 PM

llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
75	I think we should only care about the final calculated offset for verification. The raw offset is just an encoding technicality and should not be given much semantic importance.

clang-format.

Harbormaster completed remote builds in B170373: Diff 437706. · Jun 16 2022, 4:12 PM

jhenderson added inline comments. · Jun 17 2022, 1:42 AM

llvm/docs/Extensions.rst
399	Does this need extending?
453	Nit: looks like this line has gained some trailing whitespace somehow.
llvm/lib/Object/ELF.cpp
683–686	Nit: no need for braces here.
llvm/test/CodeGen/X86/basic-block-sections-labels.ll
49–50	It would be good if these could have comments in the asm indicating what they represent (i.e. version and feature), for those not familiar with the format.
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
165	I wonder if it would be better to link against the same section? This would allow you to compare the differences more easily.
llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
109 ↗	(On Diff #437706)	Typo
129 ↗	(On Diff #437706)	I actually think the Version field should be mandatory. It seems odd to pin the default to the oldest version, but we also shouldn't have it change when a new version is added as otherwise it'll cause existing YAML to change behaviour.
132 ↗	(On Diff #437706)	Nit: let's line things up.
181 ↗	(On Diff #437706)	Nit
llvm/tools/obj2yaml/elf2yaml.cpp
897 ↗	(On Diff #437706)	We probably should emit an error for unsupported versions. The file format may change in a future version such that the existing parsing will break in nasty ways. Same probably goes for llvm-readobj.

Address comments

Herald added a reviewer: MaskRay. · Jun 19 2022, 12:04 AM

rahmanl added inline comments. · Jun 19 2022, 12:04 AM

llvm/docs/Extensions.rst
399	I interpreted your comment as we should remove it. Did you mean we should add a separate extension for this?
llvm/lib/Object/ELF.cpp
683–686	Also removed braces elsewhere.

Herald added a subscriber: StephenFan. · Jun 19 2022, 12:04 AM

Harbormaster completed remote builds in B170710: Diff 438177. · Jun 19 2022, 12:05 AM

jhenderson added inline comments. · Jun 20 2022, 1:30 AM

llvm/docs/Extensions.rst
399	I was referring to the underline, which didn't match the modified header length. Sorry for the confusion! I'm happy either way with having just the "normal" title, or both mentioned in the header.
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
61
87	This didn't occur to me until now, but it's unfortunate that we have to have duplicate check patterns and near-duplicate YAML to do the v0 comparison check. I believe we can avoid it as follows: Have an additional YAML file that just describes the section, with the Type (and potentially Version) field parameterised. Create two ELF objects from this YAML, one with each of the two section types, the newer type having an explicit Version 0. Run llvm-readobj twice, to dump each of them individually. Use the same check pattern for the pair of these invocations. What do you think?
168	Ah, that's an unfortunate side-effect. I think we should aim to avoid it somehow. About the best idea I have for this is to use different struct types in the ELFYAML code for SHT_LLVM_BB_ADDR_MAP_V0 entries and those in SHT_LLVM_BB_ADDR_MAP sections. This also means you can't set Features when it doesn't make sense (which is a good thing).
llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
133–135 ↗	(On Diff #438177)	Any particular reason you have a double space between the colon and value here and below?

Use the same checks for SHT_LLVM_BB_ADDR_MAP (with version=0) and SHT_LLVM_BB_ADDR_MAP_V0.

Thanks for the review @jhenderson

llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
168	The problem is I would have to add alternative structs for `BBAddrMapSection` and `BBAddrMapEntry` and also define new mapping functions and `writeSectionContent` (with mostly identical code) for the `SHT_LLVM_BB_ADDR_MAP_V0` type. We should be able to fully deprecate `SHT_LLVM_BB_ADDR_MAP_V0` in a few months. So maybe this test won't stay around for too long. Of course, future versions will still use the same `SHT_LLVM_BB_ADDR_MAP` section type and therefore, new YAML fields will be optional (even if they are required for the new versions). So we won't have a major issue. Can I keep it as is?
llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
133–135 ↗	(On Diff #438177)	It should be 3 spaces because of `NumBlocks` being used sometime. Aligned the YAML keys in this test more carefully.

Harbormaster completed remote builds in B171139: Diff 438764. · Jun 21 2022, 12:43 PM

jhenderson added inline comments. · Jun 23 2022, 12:44 AM

llvm/lib/Object/ELF.cpp
684–685	Test case? Also, the type is "SHT_LLVM_BB_ADDR_MAP", so probably wants to include the SHT_ too, to match (and be consistent with other error messages)
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
13	Should we instead be including the version etc bytes? (I don't mind, just trying to understand the thought process)
35	If you're adding the comment here, I'd also add it to the other cases above (plus it makes it more robust, since it reduces the chance of spurious matches)
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
168	Yeah, leave as-is. Thanks for the explanation.
172	Nit: double blank line.
180	Super Nit: here and throughout, --check-prefixes -> --check-prefix when there's only one prefix to check (optional though - if you prefer to leave as-is, that's fine).
208	Nit: spurious extra line?
llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
133–135 ↗	(On Diff #438177)	FWIW, I only align within the individual block, so here, I'd align with only the single space, and then use 3 spaces where NumBlocks is present. I don't care really though, as long as the spacing doesn't get excessive (at which point it can make readability an issue).
llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
79–81 ↗	(On Diff #438764)	Nit: these should line up.
136 ↗	(On Diff #438764)	Nit: this should line up.
142 ↗	(On Diff #438764)	Nit: for consistent formatting, add a blank line before the YAML.

Cleanups.

Cleanups.

llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll
13	You're right. We can do that.
llvm/test/tools/obj2yaml/ELF/bb-addr-map.yaml
133–135 ↗	(On Diff #438177)	Thanks. Adopted your approach.

Cleanups.

Cleanups.

Harbormaster completed remote builds in B171702: Diff 439537. · Jun 23 2022, 3:34 PM

jhenderson added inline comments. · Jun 23 2022, 11:24 PM

llvm/lib/Object/ELF.cpp
684–685	Looks like there's still no test case?
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test
152	FWIW, there are still 2 spaces here, rather than just 1.

Cleanups.

llvm/lib/Object/ELF.cpp
684–685	Sorry, my response wasn't sent: I can't add a test to exercise this because I can't make a valid YAML with an unsupported version number (`ELFEmitter.cpp` returns an error if I specify version > 1), but I also don't think it's a good idea to remove that error handling. What do you suggest?

Harbormaster completed remote builds in B171895: Diff 439818. · Jun 24 2022, 11:18 AM

jhenderson added inline comments. · Jun 27 2022, 12:53 AM

llvm/lib/Object/ELF.cpp
684–685	Hmm, good point. What do you think about the following proposal: Emit a warning rather than an error with yaml2obj. In this case, treat it as the max supported version (i.e. 1) and generate data like that, except with a value 2 for the Version field. YAML is really only used for testing, so emitting an error blocks us from testing the actual production code we want to test, which seems unfortunate! The alternative approach would be to use assembly, right?

Add a llvm-readobj unit test with unsupported versions.

llvm/lib/Object/ELF.cpp
684–685	Done. Thanks for the suggestion.

Harbormaster completed remote builds in B172373: Diff 440457. · Jun 27 2022, 8:50 PM

Two nits, otherwise LGTM.

llvm/lib/ObjectYAML/ELFEmitter.cpp
1401 ↗	(On Diff #440457)	Nit: semi-colon rather than comma is probably more correct
llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
138 ↗	(On Diff #440457)

This revision is now accepted and ready to land. · Jun 28 2022, 1:25 AM

Final nits.

This revision was landed with ongoing or failed builds. · Jun 28 2022, 7:43 AM

Closed by commit rG0aa6df65756d: [Propeller] Encode address offsets of basic blocks relative to the end of the… (authored by rahmanl).

This revision was automatically updated to reflect the committed changes.

rahmanl added a commit: rG0aa6df65756d: [Propeller] Encode address offsets of basic blocks relative to the end of the….

Harbormaster completed remote builds in B172481: Diff 440621. · Jun 28 2022, 8:26 AM

Revision Contents

Path	Size
llvm/docs/Extensions.rst	38 lines
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp	7 lines
llvm/lib/CodeGen/BasicBlockSections.cpp	2 lines
llvm/lib/MC/MCObjectFileInfo.cpp	2 lines
llvm/lib/Object/ELF.cpp	21 lines
llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll	2 lines
llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll	6 lines
llvm/test/CodeGen/X86/basic-block-sections-labels.ll	10 lines
llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test	145 lines
llvm/unittests/Object/ELFObjectFileTest.cpp	43 lines

Diff 429861

llvm/docs/Extensions.rst

.. code-block:: gas

.section ".llvm_sympart","",@llvm_sympart

.asciz "libpartition.so"

.word symbol_in_partition

.. _partition: https://lld.llvm.org/Partitions.html

``SHT_LLVM_BB_ADDR_MAP`` Section (basic block address map)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

jhenderson: Does this need extending?

rahmanl: I interpreted your comment as we should remove it. Did you mean we should add a separate extension for this?

jhenderson: I was referring to the underline, which didn't match the modified header length. Sorry for the confusion! I'm happy either way with having just the "normal" title, or both mentioned in the header.

This section stores the binary address of basic blocks along with other related

metadata. This information can be used to map binary profiles (like perf

profiles) directly to machine basic blocks.

This section is emitted with ``-basic-block-sections=labels`` and will contain

a BB address map table for every function which may be constructed as follows:

a BB address map table for every function.

This feature provides backward compatibility to allow reading older versions of

the BB address map generated by older compilers.

The section name will include a version suffix (`.v#{version-number}`) which

specifies the version to use. The follwoing versioning schemes are currently

supported.

Version 1 (newest): basic block address offsets are computed relative to end of

previous blocks.

jhenderson (suggested edit):

This section is emitted with ``-basic-block-sections=labels`` and will contain

a BB address map table for every function.

This feature provides backward compatibility to allow reading older versions of

the BB address map generated by older compilers.

The section name will include a version suffix (`.v#{version-number}`) which

- specifies the version to use. The follwoing versioning schemes are currently

+ specifies the version to use. The following versioning schemes are currently

supported.

- Version 1 (newest): basic block address offsets are computed relative to end of

- previous blocks.

+ Version 1 (newest): basic block address offsets are computed relative to the end

+ of previous blocks.

Example:

Example:

.. code-block:: gas

.section ".llvm_bb_addr_map","",@llvm_bb_addr_map

.section ".llvm_bb_addr_map.v1","",@llvm_bb_addr_map

.quad .Lfunc_begin0 # address of the function

.byte 2 # number of basic blocks

# BB record for BB_0

.uleb128 .Lfunc_beign0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)

.uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size

.byte x # BB_0 metadata

# BB record for BB_1

.uleb128 .LBB0_1-.Lfunc_begin0 # BB_1 offset relative to function entry

.uleb128 .LBB0_1-.LBB_END0_0 # BB_1 offset relative to the end of last block (BB_0).

.uleb128 .LBB_END0_1-.Lfunc_begin0 # BB_1 size

.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size

.byte y # BB_1 metadata

This creates a BB address map table for a function with two basic blocks.

Version 0: basic block address offsets are computed relative to the function

address.

jhenderson: Do we need a note saying that v0 BB Addr maps may not have the version suffix in the section name?

Example:

.. code-block:: gas

.section ".llvm_bb_addr_map.v0","",@llvm_bb_addr_map

.quad .Lfunc_begin0 # address of the function

.byte 2 # number of basic blocks

# BB record for BB_0

.uleb128 .Lfunc_beign0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)

.uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size

.byte x # BB_0 metadata

# BB record for BB_1

.uleb128 .LBB0_1-.Lfunc_begin0 # BB_1 offset relative to function entry

.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size

.byte y # BB_1 metadata

CodeView-Dependent

------------------

``.cv_file`` Directive

jhenderson: Nit: looks like this line has gained some trailing whitespace somehow.

^^^^^^^^^^^^^^^^^^^^^^

Syntax:

``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ]

``.cv_func_id`` Directive

^^^^^^^^^^^^^^^^^^^^^^^^^

Introduces a function ID that can be used with ``.cv_loc``.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

  void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
    [...]
    const MCSymbol *FunctionSymbol = getFunctionBegin();

    OutStreamer->PushSection();
    OutStreamer->SwitchSection(BBAddrMapSection);
    OutStreamer->emitSymbolValue(FunctionSymbol, getPointerSize());
    // Emit the total number of basic blocks in this function.
    OutStreamer->emitULEB128IntValue(MF.size());
+   const MCSymbol *PrevMBBEndSymbol = FunctionSymbol;
    // Emit BB Information for each basic block in the funciton.
    for (const MachineBasicBlock &MBB : MF) {
      const MCSymbol *MBBSymbol =
          MBB.isEntryBlock() ? FunctionSymbol : MBB.getSymbol();
-     // Emit the basic block offset.
-     emitLabelDifferenceAsULEB128(MBBSymbol, FunctionSymbol);
+     // Emit the basic block offset relative to the end of the previous block.
+     // This is zero unless the block is padded due to alignment.
+     emitLabelDifferenceAsULEB128(MBBSymbol, PrevMBBEndSymbol);
      // Emit the basic block size. When BBs have alignments, their size cannot
      // always be computed from their offsets.
      emitLabelDifferenceAsULEB128(MBB.getEndSymbol(), MBBSymbol);
      OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
+     PrevMBBEndSymbol = MBB.getEndSymbol();
    }
    OutStreamer->PopSection();
  }

  void AsmPrinter::emitPseudoProbe(const MachineInstr &MI) {
    if (PP) {
      auto GUID = MI.getOperand(0).getImm();
      auto Index = MI.getOperand(1).getImm();
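The emission loop in the AsmPrinter.cpp change can be mirrored outside of LLVM with plain integers, which makes the arithmetic easier to follow. This is a simplified sketch, not the patch's code: `BlockRange`, `EncodedEntry` and `encodeRelativeToPrevEnd` are hypothetical names, and block boundaries are given as numeric offsets from the function start instead of MC symbols.

```cpp
#include <cstdint>
#include <vector>

// Begin and end of a basic block, as offsets from the function start.
struct BlockRange {
  uint64_t Begin;
  uint64_t End;
};

// The pair of values written for each block (before ULEB128 encoding).
struct EncodedEntry {
  uint64_t Offset; // begin of this block minus end of the previous block
  uint64_t Size;   // end of this block minus its begin
};

// Mirror of the loop: PrevEnd plays the role of PrevMBBEndSymbol and starts
// at the function begin, so the first block's offset is measured from there.
std::vector<EncodedEntry>
encodeRelativeToPrevEnd(const std::vector<BlockRange> &Blocks) {
  std::vector<EncodedEntry> Out;
  uint64_t PrevEnd = 0;
  for (const BlockRange &B : Blocks) {
    Out.push_back({B.Begin - PrevEnd, B.End - B.Begin});
    PrevEnd = B.End;
  }
  return Out;
}
```

For blocks spanning [0, 10), [16, 36) and [36, 40), the emitted offsets are 0, 6 and 0: only the alignment padding before the second block produces a nonzero value, where the old function-relative encoding would have emitted 0, 16 and 36.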

llvm/lib/CodeGen/BasicBlockSections.cpp

  // needs special handling with basic block sections. DebugInfo needs to be
  // emitted with more relocations as basic block sections can break a
  // function into potentially several disjoint pieces, and CFI needs to be
  // emitted per cluster. This also bloats the object file and binary sizes.
  //
  // Basic Block Labels
  // ==================
  //
- // With -fbasic-block-sections=labels, we emit the offsets of BB addresses of
+ // With -fbasic-block-sections=labels, we encode the offsets of BB addresses of
  // every function into the .llvm_bb_addr_map section. Along with the function
  // symbols, this allows for mapping of virtual addresses in PMU profiles back to
  // the corresponding basic blocks. This logic is implemented in AsmPrinter. This
  // pass only assigns the BBSectionType of every function to ``labels``.
  //
  //===----------------------------------------------------------------------===//

  #include "llvm/ADT/Optional.h"

llvm/lib/MC/MCObjectFileInfo.cpp

Show First 20 Lines • Show All 1,105 Lines • ▼ Show 20 Lines	MCObjectFileInfo::getBBAddrMapSection(const MCSection &TextSec) const {
StringRef GroupName;		StringRef GroupName;
if (const MCSymbol *Group = ElfSec.getGroup()) {		if (const MCSymbol *Group = ElfSec.getGroup()) {
GroupName = Group->getName();		GroupName = Group->getName();
Flags |= ELF::SHF_GROUP;		Flags |= ELF::SHF_GROUP;
}		}

// Use the text section's begin symbol and unique ID to create a separate		// Use the text section's begin symbol and unique ID to create a separate
// .llvm_bb_addr_map section associated with every unique text section.		// .llvm_bb_addr_map section associated with every unique text section.
return Ctx->getELFSection(".llvm_bb_addr_map", ELF::SHT_LLVM_BB_ADDR_MAP,		return Ctx->getELFSection(".llvm_bb_addr_map.v1", ELF::SHT_LLVM_BB_ADDR_MAP,
Flags, 0, GroupName, true, ElfSec.getUniqueID(),		Flags, 0, GroupName, true, ElfSec.getUniqueID(),
cast<MCSymbolELF>(TextSec.getBeginSymbol()));		cast<MCSymbolELF>(TextSec.getBeginSymbol()));
}		}

MCSection *		MCSection *
MCObjectFileInfo::getPseudoProbeSection(const MCSection *TextSec) const {		MCObjectFileInfo::getPseudoProbeSection(const MCSection *TextSec) const {
if (Ctx->getObjectFileType() == MCContext::IsELF) {		if (Ctx->getObjectFileType() == MCContext::IsELF) {
const auto *ElfSec = static_cast<const MCSectionELF *>(TextSec);		const auto *ElfSec = static_cast<const MCSectionELF *>(TextSec);
Show All 35 Lines
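
As an illustration of how a reader could recover the encoding version from the section name emitted above (`.llvm_bb_addr_map.v1`), here is a minimal standalone sketch. It uses only the C++ standard library rather than LLVM's `StringRef`, and `parseBBAddrMapVersion` and `MaxSupportedVersion` are hypothetical names introduced for this example, not part of the patch. It assumes, as the ELF.cpp change in this diff does, that a name without a `v<N>` suffix means version 0 and that versions above 1 are rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper (not an LLVM API): returns the encoding version implied
// by a section name, or std::nullopt if the suffix is malformed or unsupported.
std::optional<uint32_t> parseBBAddrMapVersion(std::string_view SectionName) {
  constexpr uint32_t MaxSupportedVersion = 1; // assumption taken from this diff

  // Take everything after the last '.', or nothing if there is no '.'.
  std::size_t Dot = SectionName.rfind('.');
  std::string_view Suffix = Dot == std::string_view::npos
                                ? std::string_view()
                                : SectionName.substr(Dot + 1);

  // Without a "v" suffix we assume the legacy version 0 encoding.
  if (Suffix.empty() || Suffix.front() != 'v')
    return 0;

  // Parse the decimal digits after 'v'; cap the length to avoid overflow.
  std::string_view Digits = Suffix.substr(1);
  if (Digits.empty() || Digits.size() > 9)
    return std::nullopt;
  uint32_t Version = 0;
  for (char C : Digits) {
    if (C < '0' || C > '9')
      return std::nullopt;
    Version = Version * 10 + static_cast<uint32_t>(C - '0');
  }

  if (Version > MaxSupportedVersion)
    return std::nullopt;
  return Version;
}
```

With this sketch, `.llvm_bb_addr_map` and `.llvm_bb_addr_map.v1` yield versions 0 and 1, while `.llvm_bb_addr_map.vx` and `.llvm_bb_addr_map.v2` are rejected, which corresponds to the two error cases exercised by the unit test in this review.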

llvm/lib/Object/ELF.cpp

Show First 20 Lines • Show All 625 Lines • ▼ Show 20 Lines return createError("can't map virtual address 0x" +

Twine::utohexstr(getBufSize()) + ")"); Twine::utohexstr(getBufSize()) + ")");

return base() + Offset; return base() + Offset;

} }

template <class ELFT> template <class ELFT>

Expected<std::vector<BBAddrMap>> Expected<std::vector<BBAddrMap>>

ELFFile<ELFT>::decodeBBAddrMap(const Elf_Shdr &Sec) const { ELFFile<ELFT>::decodeBBAddrMap(const Elf_Shdr &Sec) const {

Expected<StringRef> SectionNameOrErr = getSectionName(Sec);

if (!SectionNameOrErr)

return SectionNameOrErr.takeError();

StringRef VersionStr = SectionNameOrErr->rsplit('.').second;

// Without a version suffix we assume version=0.

// TODO: Report error in this case when version 0 becomes obsolete.

int Version = 0;

jhenderson: `int` seems like an odd type for `Version`. It probably should be some unsigned type?

if (!VersionStr.empty() && VersionStr.startswith("v")) {

if (VersionStr.substr(1).getAsInteger(10, Version))

return createError("Unable to parse bb-address-map version suffix: " +

VersionStr);

if (Version > 1)

return createError("Unsupported bb-address-map version: " +

Twine(Version));

jhenderson: Coding standards say to use lower-case for first letter of error messages.

Suggested change:

if (VersionStr.substr(1).getAsInteger(10, Version))
- return createError("Unable to parse bb-address-map version suffix: " +
+ return createError("unable to parse bb-address-map version suffix: " +
VersionStr);
if (Version > 1)
- return createError("Unsupported bb-address-map version: " +
+ return createError("unsupported bb-address-map version: " +
Twine(Version));
}
Expected<ArrayRef<uint8_t>> ContentsOrErr = getSectionContents(Sec);

}

Expected<ArrayRef<uint8_t>> ContentsOrErr = getSectionContents(Sec); Expected<ArrayRef<uint8_t>> ContentsOrErr = getSectionContents(Sec);

if (!ContentsOrErr) if (!ContentsOrErr)

return ContentsOrErr.takeError(); return ContentsOrErr.takeError();

ArrayRef<uint8_t> Content = *ContentsOrErr; ArrayRef<uint8_t> Content = *ContentsOrErr;

DataExtractor Data(Content, isLE(), ELFT::Is64Bits ? 8 : 4); DataExtractor Data(Content, isLE(), ELFT::Is64Bits ? 8 : 4);

std::vector<BBAddrMap> FunctionEntries; std::vector<BBAddrMap> FunctionEntries;

DataExtractor::Cursor Cur(0); DataExtractor::Cursor Cur(0);

Show All 17 Lines auto ReadULEB128AsUInt32 = [&Data, &Cur, &ULEBSizeErr]() -> uint32_t {

} }

return static_cast<uint32_t>(Value); return static_cast<uint32_t>(Value);

}; };

while (!ULEBSizeErr && Cur && Cur.tell() < Content.size()) { while (!ULEBSizeErr && Cur && Cur.tell() < Content.size()) {

uintX_t Address = static_cast<uintX_t>(Data.getAddress(Cur)); uintX_t Address = static_cast<uintX_t>(Data.getAddress(Cur));

uint32_t NumBlocks = ReadULEB128AsUInt32(); uint32_t NumBlocks = ReadULEB128AsUInt32();

std::vector<BBAddrMap::BBEntry> BBEntries; std::vector<BBAddrMap::BBEntry> BBEntries;

uint32_t PrevBBEndOffset = 0;

for (uint32_t BlockID = 0; !ULEBSizeErr && Cur && (BlockID < NumBlocks); for (uint32_t BlockID = 0; !ULEBSizeErr && Cur && (BlockID < NumBlocks);

++BlockID) { ++BlockID) {

uint32_t Offset = ReadULEB128AsUInt32(); uint32_t Offset = ReadULEB128AsUInt32();

jhenderson: Test case?

Also, the type is "SHT_LLVM_BB_ADDR_MAP", so probably wants to include the SHT_ too, to match (and be consistent with other error messages)

jhenderson: Looks like there's still no test case?

rahmanl: Sorry, my response wasn't sent: I can't add a test to exercise this because I can't make a valid Yaml with an unsupported version number (ELFEmitter.cpp returns error if I specify version > 1), but I also don't think it's a good idea to remove that error handling. What do you suggest?

jhenderson: Hmm, good point. What do you think about the following proposal:

1. Emit a warning rather than an error with yaml2obj.
2. In this case, treat it as the max supported version (i.e. 1) and generate data like that, except with a value 2 for the Version field.

YAML is really only used for testing, so emitting an error blocks us from testing the actual production code we want to test, which seems unfortunate!

The alternative approach would be to use assembly, right?

rahmanl: Done. Thanks for the suggestion.

uint32_t Size = ReadULEB128AsUInt32(); uint32_t Size = ReadULEB128AsUInt32();

jhenderson: Nit: no need for braces here.

rahmanl: Also removed braces elsewhere.

uint32_t Metadata = ReadULEB128AsUInt32(); uint32_t Metadata = ReadULEB128AsUInt32();

if (Version >= 1) {

// Offset is calculated relative to the end of the previous BB.

Offset += PrevBBEndOffset;

PrevBBEndOffset = Offset + Size;

}

BBEntries.push_back({Offset, Size, Metadata}); BBEntries.push_back({Offset, Size, Metadata});

} }

FunctionEntries.push_back({Address, BBEntries}); FunctionEntries.push_back({Address, BBEntries});

} }

// Either Cur is in the error state, or ULEBSizeError is set (not both), but // Either Cur is in the error state, or ULEBSizeError is set (not both), but

// we join the two errors here to be safe. // we join the two errors here to be safe.

if (!Cur || ULEBSizeErr) if (!Cur || ULEBSizeErr)

return joinErrors(Cur.takeError(), std::move(ULEBSizeErr)); return joinErrors(Cur.takeError(), std::move(ULEBSizeErr));

return FunctionEntries; return FunctionEntries;

} }

template class llvm::object::ELFFile<ELF32LE>; template class llvm::object::ELFFile<ELF32LE>;

template class llvm::object::ELFFile<ELF32BE>; template class llvm::object::ELFFile<ELF32BE>;

template class llvm::object::ELFFile<ELF64LE>; template class llvm::object::ELFFile<ELF64LE>;

template class llvm::object::ELFFile<ELF64BE>; template class llvm::object::ELFFile<ELF64BE>;
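
To make the version 1 accumulation in `decodeBBAddrMap` concrete, the following standalone sketch decodes a run of (offset, size, metadata) ULEB128 triples and, for version 1 and later, adds the end of the previous block to each raw offset. It is a simplified illustration using the C++ standard library only; `BBEntry`, `readULEB128`, and `decodeBBEntries` are names made up for this example and do not mirror LLVM's `DataExtractor` API or its error reporting.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct BBEntry {
  uint32_t Offset;   // Offset of the block from the function address.
  uint32_t Size;     // Size of the block in bytes.
  uint32_t Metadata; // Raw metadata bits, left uninterpreted here.
};

// Decodes one unsigned LEB128 value at Pos and advances Pos past it.
// Returns std::nullopt if the buffer ends early or the value is too long.
std::optional<uint64_t> readULEB128(const std::vector<uint8_t> &Data,
                                    std::size_t &Pos) {
  uint64_t Value = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 7) {
    if (Pos >= Data.size())
      return std::nullopt;
    uint8_t Byte = Data[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return Value;
  }
  return std::nullopt;
}

// Decodes NumBlocks entries. For Version >= 1 each raw offset is relative to
// the end of the previous block, so it is accumulated into a function-relative
// offset; for version 0 the raw offset is already function-relative.
std::optional<std::vector<BBEntry>>
decodeBBEntries(const std::vector<uint8_t> &Data, uint32_t NumBlocks,
                unsigned Version) {
  std::vector<BBEntry> Entries;
  std::size_t Pos = 0;
  uint32_t PrevBBEndOffset = 0;
  for (uint32_t I = 0; I < NumBlocks; ++I) {
    uint32_t Fields[3];
    for (uint32_t &Field : Fields) {
      std::optional<uint64_t> V = readULEB128(Data, Pos);
      if (!V || *V > UINT32_MAX)
        return std::nullopt;
      Field = static_cast<uint32_t>(*V);
    }
    uint32_t Offset = Fields[0], Size = Fields[1], Metadata = Fields[2];
    if (Version >= 1) {
      Offset += PrevBBEndOffset;
      PrevBBEndOffset = Offset + Size;
    }
    Entries.push_back({Offset, Size, Metadata});
  }
  return Entries;
}
```

Feeding the bytes `09 0a 0b 0c 0d 0e` (the two `.text.bar` entries used in bb-addr-map.test) through this sketch gives second-block offsets of 0xC for version 0 and 0x9 + 0xA + 0xC = 0x1F for version 1, which are the values the `V0` and `V1` check prefixes expect.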

llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll

	Show All 11 Lines

	define void @func() {			define void @func() {
	entry:			entry:
	ret void			ret void
	}			}

	; CHECK: func:			; CHECK: func:
	; CHECK: .Lfunc_begin1:			; CHECK: .Lfunc_begin1:
	; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text{{$}}			; CHECK: .section .llvm_bb_addr_map.v1,"o",@llvm_bb_addr_map,.text{{$}}
	; CHECK: .quad .Lfunc_begin1			; CHECK: .quad .Lfunc_begin1

llvm/test/CodeGen/X86/basic-block-sections-labels-functions-sections.ll

	; RUN: llc < %s -mtriple=x86_64 -function-sections -basic-block-sections=labels | FileCheck %s			; RUN: llc < %s -mtriple=x86_64 -function-sections -basic-block-sections=labels | FileCheck %s

	$_Z4fooTIiET_v = comdat any			$_Z4fooTIiET_v = comdat any

	define dso_local i32 @_Z3barv() {			define dso_local i32 @_Z3barv() {
	ret i32 0			ret i32 0
	}			}
	;; Check we add SHF_LINK_ORDER for .llvm_bb_addr_map and link it with the corresponding .text sections.			;; Check we add SHF_LINK_ORDER for .llvm_bb_addr_map and link it with the corresponding .text sections.
	; CHECK: .section .text._Z3barv,"ax",@progbits			; CHECK: .section .text._Z3barv,"ax",@progbits
	; CHECK-LABEL: _Z3barv:			; CHECK-LABEL: _Z3barv:
	; CHECK-NEXT: [[BAR_BEGIN:.Lfunc_begin[0-9]+]]:			; CHECK-NEXT: [[BAR_BEGIN:.Lfunc_begin[0-9]+]]:
	; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3barv{{$}}			; CHECK: .section .llvm_bb_addr_map.v1,"o",@llvm_bb_addr_map,.text._Z3barv{{$}}
	; CHECK-NEXT: .quad [[BAR_BEGIN]]			; CHECK-NEXT: .quad [[BAR_BEGIN]]
				jhenderson: Should we instead be including the version etc bytes? (I don't mind, just trying to understand the thought process)
				rahmanl: You're right. We can do that.


	define dso_local i32 @_Z3foov() {			define dso_local i32 @_Z3foov() {
	%1 = call i32 @_Z4fooTIiET_v()			%1 = call i32 @_Z4fooTIiET_v()
	ret i32 %1			ret i32 %1
	}			}
	; CHECK: .section .text._Z3foov,"ax",@progbits			; CHECK: .section .text._Z3foov,"ax",@progbits
	; CHECK-LABEL: _Z3foov:			; CHECK-LABEL: _Z3foov:
	; CHECK-NEXT: [[FOO_BEGIN:.Lfunc_begin[0-9]+]]:			; CHECK-NEXT: [[FOO_BEGIN:.Lfunc_begin[0-9]+]]:
	; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3foov{{$}}			; CHECK: .section .llvm_bb_addr_map.v1,"o",@llvm_bb_addr_map,.text._Z3foov{{$}}
	; CHECK-NEXT: .quad [[FOO_BEGIN]]			; CHECK-NEXT: .quad [[FOO_BEGIN]]


	define linkonce_odr dso_local i32 @_Z4fooTIiET_v() comdat {			define linkonce_odr dso_local i32 @_Z4fooTIiET_v() comdat {
	ret i32 0			ret i32 0
	}			}
	;; Check we add .llvm_bb_addr_map section to a COMDAT group with the corresponding .text section if such a COMDAT exists.			;; Check we add .llvm_bb_addr_map section to a COMDAT group with the corresponding .text section if such a COMDAT exists.
	; CHECK: .section .text._Z4fooTIiET_v,"axG",@progbits,_Z4fooTIiET_v,comdat			; CHECK: .section .text._Z4fooTIiET_v,"axG",@progbits,_Z4fooTIiET_v,comdat
	; CHECK-LABEL: _Z4fooTIiET_v:			; CHECK-LABEL: _Z4fooTIiET_v:
	; CHECK-NEXT: [[FOOCOMDAT_BEGIN:.Lfunc_begin[0-9]+]]:			; CHECK-NEXT: [[FOOCOMDAT_BEGIN:.Lfunc_begin[0-9]+]]:
	; CHECK: .section .llvm_bb_addr_map,"Go",@llvm_bb_addr_map,_Z4fooTIiET_v,comdat,.text._Z4fooTIiET_v{{$}}			; CHECK: .section .llvm_bb_addr_map.v1,"Go",@llvm_bb_addr_map,_Z4fooTIiET_v,comdat,.text._Z4fooTIiET_v{{$}}
	; CHECK-NEXT: .quad [[FOOCOMDAT_BEGIN]]			; CHECK-NEXT: .quad [[FOOCOMDAT_BEGIN]]
				jhenderson: If you're adding the comment here, I'd also add it to the other cases above (plus it makes it more robust, since it reduces the chance of spurious matches)

llvm/test/CodeGen/X86/basic-block-sections-labels.ll

	Show All 37 Lines
	; CHECK-LABEL: .LBB0_1:			; CHECK-LABEL: .LBB0_1:
	; CHECK-LABEL: .LBB_END0_1:			; CHECK-LABEL: .LBB_END0_1:
	; CHECK-LABEL: .LBB0_2:			; CHECK-LABEL: .LBB0_2:
	; CHECK-LABEL: .LBB_END0_2:			; CHECK-LABEL: .LBB_END0_2:
	; CHECK-LABEL: .LBB0_3:			; CHECK-LABEL: .LBB0_3:
	; CHECK-LABEL: .LBB_END0_3:			; CHECK-LABEL: .LBB_END0_3:
	; CHECK-LABEL: .Lfunc_end0:			; CHECK-LABEL: .Lfunc_end0:

	; UNIQ: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3bazb{{$}}			; UNIQ: .section .llvm_bb_addr_map.v1,"o",@llvm_bb_addr_map,.text._Z3bazb{{$}}
	;; Verify that with -unique-section-names=false, the unique id of the text section gets assigned to the llvm_bb_addr_map section.			;; Verify that with -unique-section-names=false, the unique id of the text section gets assigned to the llvm_bb_addr_map section.
	; NOUNIQ: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text,unique,1			; NOUNIQ: .section .llvm_bb_addr_map.v1,"o",@llvm_bb_addr_map,.text,unique,1
	; CHECK-NEXT: .quad .Lfunc_begin0			; CHECK-NEXT: .quad .Lfunc_begin0
	; CHECK-NEXT: .byte 4			; CHECK-NEXT: .byte 4
				jhenderson: It would be good if these could have comments in the asm indicating what they represent (i.e. version and feature), for those not familiar with the format.
	; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0			; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
	; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0			; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
	; CHECK-NEXT: .byte 8			; CHECK-NEXT: .byte 8
	; CHECK-NEXT: .uleb128 .LBB0_1-.Lfunc_begin0			; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
	; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1			; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
	; CHECK-NEXT: .byte 8			; CHECK-NEXT: .byte 8
	; CHECK-NEXT: .uleb128 .LBB0_2-.Lfunc_begin0			; CHECK-NEXT: .uleb128 .LBB0_2-.LBB_END0_1
	; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2			; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2
	; CHECK-NEXT: .byte 1			; CHECK-NEXT: .byte 1
	; CHECK-NEXT: .uleb128 .LBB0_3-.Lfunc_begin0			; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
	; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3			; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3
	; CHECK-NEXT: .byte 5			; CHECK-NEXT: .byte 5
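
The CHECK lines above change each offset from `.LBBn-.Lfunc_begin0` to `.LBBn-.LBB_ENDm`, that is, from a function-relative offset to the gap after the previous block. The sketch below shows, under simplifying assumptions, why that tends to shrink the ULEB128 encoding. `BlockRange`, `ulebSize`, `relativeOffsets`, and the two byte-counting helpers are hypothetical names for this example; the block ranges in the usage note are invented, not measured from a real binary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Begin and end of a basic block as offsets from the function start.
struct BlockRange {
  uint32_t Begin;
  uint32_t End;
};

// Number of bytes needed to encode Value as unsigned LEB128 (7 bits per byte).
std::size_t ulebSize(uint64_t Value) {
  std::size_t N = 1;
  while (Value >= 0x80) {
    Value >>= 7;
    ++N;
  }
  return N;
}

// Offset fields in the relative style: each block's begin minus the previous
// block's end. Assumes blocks are listed in layout order without overlap.
std::vector<uint32_t> relativeOffsets(const std::vector<BlockRange> &Blocks) {
  std::vector<uint32_t> Out;
  uint32_t PrevEnd = 0;
  for (const BlockRange &B : Blocks) {
    assert(B.Begin >= PrevEnd && B.End >= B.Begin && "blocks must be ordered");
    Out.push_back(B.Begin - PrevEnd);
    PrevEnd = B.End;
  }
  return Out;
}

// Total bytes spent on offset fields when offsets are function-relative.
std::size_t offsetFieldBytesV0(const std::vector<BlockRange> &Blocks) {
  std::size_t Total = 0;
  for (const BlockRange &B : Blocks)
    Total += ulebSize(B.Begin);
  return Total;
}

// Total bytes spent on offset fields when offsets are previous-end-relative.
std::size_t offsetFieldBytesV1(const std::vector<BlockRange> &Blocks) {
  std::size_t Total = 0;
  for (uint32_t Rel : relativeOffsets(Blocks))
    Total += ulebSize(Rel);
  return Total;
}
```

For three made-up blocks `[0x0, 0x90)`, `[0x90, 0x4000)`, and `[0x4010, 0x5000)`, the function-relative offsets 0x0, 0x90, and 0x4010 need 1 + 2 + 3 = 6 bytes, while the relative offsets 0x0, 0x0, and 0x10 need 3 bytes. The size fields are the same in both styles, so only the offset fields are compared here.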

llvm/test/tools/llvm-readobj/ELF/bb-addr-map.test

## This test checks how we handle the --bb-addr-map option. ## This test checks how we handle the --bb-addr-map option.

# Check 64-bit: ## Check 64-bit (version #0 encoding):

# RUN: yaml2obj %s -DBITS=64 -DADDR=0x999999999 -o %t1.x64.o # RUN: yaml2obj %s -DBITS=64 -DADDR=0x999999999 -o %t1.x64.o

# RUN: llvm-readobj %t1.x64.o --bb-addr-map 2>&1 | FileCheck %s -DADDR=0x999999999 -DFILE=%t1.x64.o --check-prefix=LLVM # RUN: llvm-readobj %t1.x64.o --bb-addr-map 2>&1 | FileCheck %s -DADDR=0x999999999 -DFILE=%t1.x64.o --check-prefixes=CHECK,V0

# RUN: llvm-readelf %t1.x64.o --bb-addr-map | FileCheck %s --check-prefix=GNU # RUN: llvm-readelf %t1.x64.o --bb-addr-map | FileCheck %s --check-prefix=GNU

## Check 64-bit (version #1 encoding):

# RUN: yaml2obj %s -DBITS=64 -DVERSION=1 -DADDR=0x999999999 -o %t1.v1.x64.o

# RUN: llvm-readobj %t1.v1.x64.o --bb-addr-map 2>&1 | FileCheck %s -DADDR=0x999999999 -DFILE=%t1.v1.x64.o --check-prefixes=CHECK,V1

## Check 32-bit: ## Check 32-bit:

# RUN: yaml2obj %s -DBITS=32 -o %t1.x32.o # RUN: yaml2obj %s -DBITS=32 -o %t1.x32.o

# RUN: llvm-readobj %t1.x32.o --bb-addr-map 2>&1 | FileCheck -DADDR=0x11111 %s -DFILE=%t1.x32.o --check-prefix=LLVM # RUN: llvm-readobj %t1.x32.o --bb-addr-map 2>&1 | FileCheck -DADDR=0x11111 %s -DFILE=%t1.x32.o --check-prefixes=CHECK,V0

# RUN: llvm-readelf %t1.x32.o --bb-addr-map | FileCheck %s --check-prefix=GNU # RUN: llvm-readelf %t1.x32.o --bb-addr-map | FileCheck %s --check-prefix=GNU

## Check that a malformed section can be handled. ## Check that a malformed section can be handled.

# RUN: yaml2obj %s -DBITS=32 -DSIZE=4 -o %t2.o # RUN: yaml2obj %s -DBITS=32 -DSIZE=4 -o %t2.o

# RUN: llvm-readobj %t2.o --bb-addr-map 2>&1 | FileCheck %s -DOFFSET=0x00000004 -DFILE=%t2.o --check-prefix=TRUNCATED # RUN: llvm-readobj %t2.o --bb-addr-map 2>&1 | FileCheck %s -DOFFSET=0x00000004 -DFILE=%t2.o --check-prefix=TRUNCATED

# LLVM: BBAddrMap [ # CHECK: BBAddrMap [

# LLVM-NEXT: Function { # CHECK-NEXT: Function {

# LLVM-NEXT: At: [[ADDR]] # CHECK-NEXT: At: [[ADDR]]

# LLVM-NEXT: warning: '[[FILE]]': could not identify function symbol for address ([[ADDR]]) in SHT_LLVM_BB_ADDR_MAP section with index 3 # CHECK-NEXT: warning: '[[FILE]]': could not identify function symbol for address ([[ADDR]]) in SHT_LLVM_BB_ADDR_MAP section with index 3

# LLVM-NEXT: Name: <?> # CHECK-NEXT: Name: <?>

# LLVM-NEXT: BB entries [ # CHECK-NEXT: BB entries [

# LLVM-NEXT: { # CHECK-NEXT: {

# LLVM-NEXT: Offset: 0x0 # CHECK-NEXT: Offset: 0x0

# LLVM-NEXT: Size: 0x1 # CHECK-NEXT: Size: 0x1

# LLVM-NEXT: HasReturn: No # CHECK-NEXT: HasReturn: No

# LLVM-NEXT: HasTailCall: Yes # CHECK-NEXT: HasTailCall: Yes

# LLVM-NEXT: IsEHPad: No # CHECK-NEXT: IsEHPad: No

# LLVM-NEXT: CanFallThrough: No # CHECK-NEXT: CanFallThrough: No

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: { # CHECK-NEXT: {

# LLVM-NEXT: Offset: 0x3 # CHECK-NEXT: Offset: 0x3

# LLVM-NEXT: Size: 0x4 # CHECK-NEXT: Size: 0x4

# LLVM-NEXT: HasReturn: Yes # CHECK-NEXT: HasReturn: Yes

# LLVM-NEXT: HasTailCall: No # CHECK-NEXT: HasTailCall: No

# LLVM-NEXT: IsEHPad: Yes # CHECK-NEXT: IsEHPad: Yes

# LLVM-NEXT: CanFallThrough: No # CHECK-NEXT: CanFallThrough: No

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: ] # CHECK-NEXT: ]

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: Function { # CHECK-NEXT: Function {

# LLVM-NEXT: At: 0x22222 # CHECK-NEXT: At: 0x22222

# LLVM-NEXT: Name: foo # CHECK-NEXT: Name: foo

# LLVM-NEXT: BB entries [ # CHECK-NEXT: BB entries [

# LLVM-NEXT: { # CHECK-NEXT: {

# LLVM-NEXT: Offset: 0x6 # CHECK-NEXT: Offset: 0x6

# LLVM-NEXT: Size: 0x7 # CHECK-NEXT: Size: 0x7

# LLVM-NEXT: HasReturn: No # CHECK-NEXT: HasReturn: No

# LLVM-NEXT: HasTailCall: No # CHECK-NEXT: HasTailCall: No

# LLVM-NEXT: IsEHPad: No # CHECK-NEXT: IsEHPad: No

# LLVM-NEXT: CanFallThrough: Yes # CHECK-NEXT: CanFallThrough: Yes

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: ] # CHECK-NEXT: ]

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: ] # CHECK-NEXT: ]

# LLVM-NEXT: BBAddrMap [ # CHECK-NEXT: BBAddrMap [

# LLVM-NEXT: Function { # CHECK-NEXT: Function {

jhenderson (suggested edit):

# CHECK-NEXT: ]
- ## Check that the using the SHT_LLVM_BB_ADDR_MAP_V0 section type generates
+ ## Check that using the SHT_LLVM_BB_ADDR_MAP_V0 section type generates
## the same result as the SHT_LLVM_BB_ADDR_MAP type with Version=0.

# LLVM-NEXT: At: 0x33333 # CHECK-NEXT: At: 0x33333

# LLVM-NEXT: Name: bar # CHECK-NEXT: Name: bar

# LLVM-NEXT: BB entries [ # CHECK-NEXT: BB entries [

# LLVM-NEXT: { # CHECK-NEXT: {

# LLVM-NEXT: Offset: 0x9 # CHECK-NEXT: Offset: 0x9

# LLVM-NEXT: Size: 0xA # CHECK-NEXT: Size: 0xA

# LLVM-NEXT: HasReturn: Yes # CHECK-NEXT: HasReturn: Yes

# LLVM-NEXT: HasTailCall: Yes # CHECK-NEXT: HasTailCall: Yes

# LLVM-NEXT: IsEHPad: No # CHECK-NEXT: IsEHPad: No

# LLVM-NEXT: CanFallThrough: Yes # CHECK-NEXT: CanFallThrough: Yes

# LLVM-NEXT: } # CHECK-NEXT: }

# LLVM-NEXT: ] # CHECK-NEXT: {

# LLVM-NEXT: } # V0-NEXT: Offset: 0xC

# LLVM-NEXT: ] # V1-NEXT: Offset: 0x1F

jhenderson: For V1 output, I feel like it would be useful to have both the raw offset and the calculated offset printed. I'm not sure exactly what would be the best way of doing that though.

rahmanl: I think we should only care about the final calculated offset for verification. The raw offset is just an encoding technicality and should not be given much semantic importance.

# CHECK-NEXT: Size: 0xD

# CHECK-NEXT: HasReturn: No

# CHECK-NEXT: HasTailCall: Yes

# CHECK-NEXT: IsEHPad: Yes

# CHECK-NEXT: CanFallThrough: Yes

# CHECK-NEXT: }

# CHECK-NEXT: ]

# CHECK-NEXT: }

# CHECK-NEXT: ]

# GNU: GNUStyle::printBBAddrMaps not implemented # GNU: GNUStyle::printBBAddrMaps not implemented

jhenderson: This didn't occur to me until now, but it's unfortunate that we have to have duplicate check patterns and near-duplicate YAML to do the v0 comparison check. I believe we can avoid it as follows:

1. Have an additional YAML file that just describes the section, with the Type (and potentially Version) field parameterised.
2. Create two ELF objects from this YAML, one with each of the two section types, the newer type having an explicit Version 0.
3. Run llvm-readobj twice, to dump each of them individually.
4. Use the same check pattern for the pair of these invocations.

What do you think?

# TRUNCATED: BBAddrMap [ # TRUNCATED: BBAddrMap [

# TRUNCATED-NEXT: warning: '[[FILE]]': unable to dump SHT_LLVM_BB_ADDR_MAP section with index 3: unable to decode LEB128 at offset [[OFFSET]]: malformed uleb128, extends past end # TRUNCATED-NEXT: warning: '[[FILE]]': unable to dump SHT_LLVM_BB_ADDR_MAP section with index 3: unable to decode LEB128 at offset [[OFFSET]]: malformed uleb128, extends past end

# TRUNCATED-NEXT: ] # TRUNCATED-NEXT: ]

## Check that the other valid section is properly dumped. ## Check that the other valid section is properly dumped.

# TRUNCATED-NEXT: BBAddrMap [ # TRUNCATED-NEXT: BBAddrMap [

# TRUNCATED-NEXT: Function { # TRUNCATED-NEXT: Function {

# TRUNCATED-NEXT: At: 0x33333 # TRUNCATED-NEXT: At: 0x33333

# TRUNCATED-NEXT: Name: bar # TRUNCATED-NEXT: Name: bar

# TRUNCATED-NEXT: BB entries [ # TRUNCATED-NEXT: BB entries [

# TRUNCATED-NEXT: { # TRUNCATED-NEXT: {

# TRUNCATED-NEXT: Offset: 0x9 # TRUNCATED-NEXT: Offset: 0x9

# TRUNCATED-NEXT: Size: 0xA # TRUNCATED-NEXT: Size: 0xA

# TRUNCATED-NEXT: HasReturn: Yes # TRUNCATED-NEXT: HasReturn: Yes

# TRUNCATED-NEXT: HasTailCall: Yes # TRUNCATED-NEXT: HasTailCall: Yes

# TRUNCATED-NEXT: IsEHPad: No # TRUNCATED-NEXT: IsEHPad: No

# TRUNCATED-NEXT: CanFallThrough: Yes # TRUNCATED-NEXT: CanFallThrough: Yes

# TRUNCATED-NEXT: } # TRUNCATED-NEXT: }

# TRUNCATED-NEXT: {

# TRUNCATED-NEXT: Offset: 0xC

# TRUNCATED-NEXT: Size: 0xD

# TRUNCATED-NEXT: HasReturn: No

# TRUNCATED-NEXT: HasTailCall: Yes

# TRUNCATED-NEXT: IsEHPad: Yes

# TRUNCATED-NEXT: CanFallThrough: Yes

# TRUNCATED-NEXT: }

# TRUNCATED-NEXT: ] # TRUNCATED-NEXT: ]

# TRUNCATED-NEXT: } # TRUNCATED-NEXT: }

# TRUNCATED-NEXT: ] # TRUNCATED-NEXT: ]
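
The `HasReturn`, `HasTailCall`, `IsEHPad`, and `CanFallThrough` lines checked above come from the `Metadata` value of each entry in the YAML input of this test. The bit positions in the sketch below are inferred by comparing those YAML values with the expected output (for example, `Metadata: 0x5` prints `HasReturn: Yes` and `IsEHPad: Yes`); they are not copied from LLVM headers, and `BBFlags` and `decodeMetadata` are hypothetical names.

```cpp
#include <cstdint>

// The four flags printed by llvm-readobj for each basic block entry.
struct BBFlags {
  bool HasReturn;
  bool HasTailCall;
  bool IsEHPad;
  bool CanFallThrough;
};

// Interprets the low four bits of Metadata. Bit assignment is an inference
// from this test's inputs and expected output; higher bits are ignored.
inline BBFlags decodeMetadata(uint32_t Metadata) {
  BBFlags Flags;
  Flags.HasReturn = (Metadata & 0x1) != 0;
  Flags.HasTailCall = (Metadata & 0x2) != 0;
  Flags.IsEHPad = (Metadata & 0x4) != 0;
  Flags.CanFallThrough = (Metadata & 0x8) != 0;
  return Flags;
}
```

Under this reading, `0xF0000002` sets only `HasTailCall`, `0x8` sets only `CanFallThrough`, and `0xe` sets everything except `HasReturn`, which agrees with the check lines for those entries.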

--- !ELF --- !ELF

FileHeader: FileHeader:

Class: ELFCLASS[[BITS]] Class: ELFCLASS[[BITS]]

Data: ELFDATA2LSB Data: ELFDATA2LSB

Type: ET_EXEC Type: ET_EXEC

Sections: Sections:

- Name: .text - Name: .text

Type: SHT_PROGBITS Type: SHT_PROGBITS

Flags: [SHF_ALLOC] Flags: [SHF_ALLOC]

- Name: .text.bar - Name: .text.bar

Type: SHT_PROGBITS Type: SHT_PROGBITS

Flags: [SHF_ALLOC] Flags: [SHF_ALLOC]

- Name: bb_addr_map_1 - Name: .llvm_bb_addr_map

Type: SHT_LLVM_BB_ADDR_MAP Type: SHT_LLVM_BB_ADDR_MAP

ShSize: [[SIZE=<none>]] ShSize: [[SIZE=<none>]]

Link: .text Link: .text

Entries: Entries:

- Address: [[ADDR=0x11111]] - Address: [[ADDR=0x11111]]

BBEntries: BBEntries:

- AddressOffset: 0x0 - AddressOffset: 0x0

Size: 0x1 Size: 0x1

Metadata: 0xF0000002 Metadata: 0xF0000002

- AddressOffset: 0x3 - AddressOffset: 0x3

Size: 0x4 Size: 0x4

Metadata: 0x5 Metadata: 0x5

- Address: 0x22222 - Address: 0x22222

BBEntries: BBEntries:

- AddressOffset: 0x6 - AddressOffset: 0x6

Size: 0x7 Size: 0x7

Metadata: 0x8 Metadata: 0x8

- Name: dummy_section - Name: dummy_section

Type: SHT_PROGBITS Type: SHT_PROGBITS

Size: 16 Size: 16

- Name: bb_addr_map_2 - Name: .llvm_bb_addr_map.v[[VERSION=0]]

Type: SHT_LLVM_BB_ADDR_MAP Type: SHT_LLVM_BB_ADDR_MAP

Link: .text.bar Link: .text.bar

jhenderson: FWIW, there are still 2 spaces here, rather than just 1.

Entries: Entries:

- Address: 0x33333 - Address: 0x33333

BBEntries: BBEntries:

- AddressOffset: 0x9 - AddressOffset: 0x9

Size: 0xa Size: 0xa

Metadata: 0xb Metadata: 0xb

- AddressOffset: 0xc

Size: 0xd

Metadata: 0xe

Symbols: Symbols:

- Name: foo - Name: foo

Section: .text Section: .text

Type: STT_FUNC Type: STT_FUNC

jhenderson: I wonder if it would be better to link against the same section? This would allow you to compare the differences more easily.

Value: 0x22222 Value: 0x22222

- Name: bar - Name: bar

Section: .text.bar Section: .text.bar

jhenderson: Ah, that's an unfortunate side-effect. I think we should aim to avoid it somehow. About the best idea I have for this is to use different struct types in the ELFYAML code for SHT_LLVM_BB_ADDR_MAP_V0 entries and those in SHT_LLVM_BB_ADDR_MAP sections. This also means you can't set Features when it doesn't make sense (which is a good thing).

rahmanl: The problem is I would have to add alternative structs for `BBAddrMapSection` and `BBAddrMapEntry` and also define new mapping functions and `writeSectionContent` (with mostly identical code) for the SHT_LLVM_BB_ADDR_MAP_V0 type. We should be able to fully deprecate SHT_LLVM_BB_ADDR_MAP_V0 in a few months. So maybe this test won't stay around for too long. Of course, future versions will still use the same SHT_LLVM_BB_ADDR_MAP section type and therefore, new YAML fields will be optional (even if they are required for the new versions). So we won't have a major issue. Can I keep it as is?

jhenderson: Yeah, leave as-is. Thanks for the explanation.

Type: STT_FUNC Type: STT_FUNC

Value: 0x33333 Value: 0x33333

jhenderson: Nit: double blank line.

jhenderson: Super Nit: here and throughout, --check-prefixes -> --check-prefix when there's only one prefix to check (optional though - if you prefer to leave as-is, that's fine).

jhenderson: Nit: spurious extra line?

llvm/unittests/Object/ELFObjectFileTest.cpp

Show First 20 Lines • Show All 499 Lines • ▼ Show 20 Lines
TEST(ELFObjectFileTest, InvalidDecodeBBAddrMap) {		TEST(ELFObjectFileTest, InvalidDecodeBBAddrMap) {
StringRef CommonYamlString(R"(		StringRef CommonYamlString(R"(
--- !ELF		--- !ELF
FileHeader:		FileHeader:
Class: ELFCLASS64		Class: ELFCLASS64
Data: ELFDATA2LSB		Data: ELFDATA2LSB
Type: ET_EXEC		Type: ET_EXEC
Sections:		Sections:
- Name: .llvm_bb_addr_map		- Type: SHT_LLVM_BB_ADDR_MAP
Type: SHT_LLVM_BB_ADDR_MAP
Entries:
- Address: 0x11111
BBEntries:
- AddressOffset: 0x0
Size: 0x1
Metadata: 0x2
)");		)");

auto DoCheck = [&](StringRef YamlString, const char *ErrMsg) {		auto DoCheck = [&](StringRef YamlString, const char *ErrMsg) {
SmallString<0> Storage;		SmallString<0> Storage;
Expected<ELFObjectFile<ELF64LE>> ElfOrErr =		Expected<ELFObjectFile<ELF64LE>> ElfOrErr =
toBinary<ELF64LE>(Storage, YamlString);		toBinary<ELF64LE>(Storage, YamlString);
ASSERT_THAT_EXPECTED(ElfOrErr, Succeeded());		ASSERT_THAT_EXPECTED(ElfOrErr, Succeeded());
const ELFFile<ELF64LE> &Elf = ElfOrErr->getELFFile();		const ELFFile<ELF64LE> &Elf = ElfOrErr->getELFFile();

Expected<const typename ELF64LE::Shdr *> BBAddrMapSecOrErr =		Expected<const typename ELF64LE::Shdr *> BBAddrMapSecOrErr =
Elf.getSection(1);		Elf.getSection(1);
ASSERT_THAT_EXPECTED(BBAddrMapSecOrErr, Succeeded());		ASSERT_THAT_EXPECTED(BBAddrMapSecOrErr, Succeeded());
EXPECT_THAT_ERROR(Elf.decodeBBAddrMap(**BBAddrMapSecOrErr).takeError(),		EXPECT_THAT_ERROR(Elf.decodeBBAddrMap(**BBAddrMapSecOrErr).takeError(),
FailedWithMessage(ErrMsg));		FailedWithMessage(ErrMsg));
};		};

		// Check that we can detect invalid and unsupported versions.
		SmallVector<SmallString<128>, 2> InvalidVersionYamlStrings(2,
		CommonYamlString);
		InvalidVersionYamlStrings[0] += R"(
		Name: .llvm_bb_addr_map.vx
		)";
		InvalidVersionYamlStrings[1] += R"(
		Name: .llvm_bb_addr_map.v2
		)";

		DoCheck(InvalidVersionYamlStrings[0],
		"Unable to parse bb-address-map version suffix: vx");
		DoCheck(InvalidVersionYamlStrings[1],
		"Unsupported bb-address-map version: 2");

		SmallString<128> CommonValidYamlString(CommonYamlString);
		CommonValidYamlString += R"(
		Name: .llvm_bb_addr_map.v1
		Entries:
		- Address: 0x11111
		BBEntries:
		- AddressOffset: 0x0
		Size: 0x1
		Metadata: 0x2
		)";

// Check that we can detect the malformed encoding when the section is		// Check that we can detect the malformed encoding when the section is
// truncated.		// truncated.
SmallString<128> TruncatedYamlString(CommonYamlString);		SmallString<128> TruncatedYamlString(CommonValidYamlString);
TruncatedYamlString += R"(		TruncatedYamlString += R"(
ShSize: 0x8		ShSize: 0x8
)";		)";
DoCheck(TruncatedYamlString, "unable to decode LEB128 at offset 0x00000008: "		DoCheck(TruncatedYamlString, "unable to decode LEB128 at offset 0x00000008: "
"malformed uleb128, extends past end");		"malformed uleb128, extends past end");

// Check that we can detect when the encoded BB entry fields exceed the UINT32		// Check that we can detect when the encoded BB entry fields exceed the UINT32
// limit.		// limit.
SmallVector<SmallString<128>, 3> OverInt32LimitYamlStrings(3,		SmallVector<SmallString<128>, 3> OverInt32LimitYamlStrings(
CommonYamlString);		3, CommonValidYamlString);
OverInt32LimitYamlStrings[0] += R"(		OverInt32LimitYamlStrings[0] += R"(
- AddressOffset: 0x100000000		- AddressOffset: 0x100000000
Size: 0xFFFFFFFF		Size: 0xFFFFFFFF
Metadata: 0xFFFFFFFF		Metadata: 0xFFFFFFFF
)";		)";

OverInt32LimitYamlStrings[1] += R"(		OverInt32LimitYamlStrings[1] += R"(
- AddressOffset: 0xFFFFFFFF		- AddressOffset: 0xFFFFFFFF
Show All 37 Lines	DoCheck(OverInt32LimitAndTruncated[0],
"extends past end");		"extends past end");
DoCheck(OverInt32LimitAndTruncated[1],		DoCheck(OverInt32LimitAndTruncated[1],
"ULEB128 value at offset 0x11 exceeds UINT32_MAX (0x100000000)");		"ULEB128 value at offset 0x11 exceeds UINT32_MAX (0x100000000)");
DoCheck(OverInt32LimitAndTruncated[2],		DoCheck(OverInt32LimitAndTruncated[2],
"ULEB128 value at offset 0x11 exceeds UINT32_MAX (0x100000000)");		"ULEB128 value at offset 0x11 exceeds UINT32_MAX (0x100000000)");

// Check for proper error handling when the 'NumBlocks' field is overridden		// Check for proper error handling when the 'NumBlocks' field is overridden
// with an out-of-range value.		// with an out-of-range value.
SmallString<128> OverLimitNumBlocks(CommonYamlString);		SmallString<128> OverLimitNumBlocks(CommonValidYamlString);
OverLimitNumBlocks += R"(		OverLimitNumBlocks += R"(
NumBlocks: 0x100000000		NumBlocks: 0x100000000
)";		)";

DoCheck(OverLimitNumBlocks,		DoCheck(OverLimitNumBlocks,
"ULEB128 value at offset 0x8 exceeds UINT32_MAX (0x100000000)");		"ULEB128 value at offset 0x8 exceeds UINT32_MAX (0x100000000)");
}		}
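
The expected messages in this test, such as "ULEB128 value at offset 0x8 exceeds UINT32_MAX (0x100000000)", describe a reader that decodes a full 64-bit ULEB128 value and then rejects it if it does not fit in 32 bits. The sketch below is one way such a check could look in plain C++; `ReadResult` and `readULEB128AsUInt32` are names chosen for illustration (the latter echoes the lambda in ELF.cpp, but this is not its implementation), and the "malformed uleb128" text is a placeholder rather than LLVM's exact wording.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Result of a checked read: Error is empty on success.
struct ReadResult {
  uint32_t Value;
  std::string Error;
};

// Reads one ULEB128 value at Pos, advances Pos, and reports an error if the
// encoding is truncated or the decoded value does not fit in 32 bits.
ReadResult readULEB128AsUInt32(const std::vector<uint8_t> &Data,
                               std::size_t &Pos) {
  const std::size_t Start = Pos;
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    if (Pos >= Data.size() || Shift >= 64)
      return {0, "malformed uleb128"};
    uint8_t Byte = Data[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break;
  }
  if (Value > UINT32_MAX) {
    char Buf[96];
    std::snprintf(Buf, sizeof(Buf),
                  "ULEB128 value at offset 0x%zx exceeds UINT32_MAX (0x%" PRIx64
                  ")",
                  Start, Value);
    return {0, Buf};
  }
  return {static_cast<uint32_t>(Value), ""};
}
```

In the `NumBlocks` case above, the offset 0x8 follows from the layout of a 64-bit entry: the function address occupies the first 8 bytes, so the `NumBlocks` field starts at byte 8. The value 0x100000000 encodes as the five bytes `80 80 80 80 10`.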

▲ Show 20 Lines • Show All 180 Lines • Show Last 20 Lines

Diff Detail

Diff 429861

Unit Tests: Failed