This is an archive of the discontinued LLVM Phabricator instance.

Differential D105217

[LLD] Adding support for RELA for CG Profile.
ClosedPublic

Authored by ayermolo on Jun 30 2021, 11:28 AM.

Details

Reviewers

MaskRay
jhenderson

Commits

rG24129fbc9aa0: [LLD] Adding support for RELA for CG Profile.

Summary

This is a follow-up to https://reviews.llvm.org/D104080 and https://github.com/llvm/llvm-project/commit/ca3bdb57fa1ac98b711a735de048c12b5fdd8086#diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585. The implementation of the call graph profile was changed from a black-box section to a relocation-based approach. This was done for compatibility with post-processing tools such as strip/objcopy and their LLVM equivalents. With the new approach, the symbol indices remain correct when these tools are run on an object file before the final link step.

Unlike the LLVM tools, the GNU binutils tools change the REL section to a RELA section, for example when strip -S is run on ELF object files as an intermediate step before linking. To preserve compatibility, this patch extends the implementation in LLD and ELFDumper to support both REL and RELA sections for the call graph profile.
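As an illustration of why supporting both formats is cheap, here is a minimal sketch of the two ELF64 relocation entry layouts. The struct and helper names (Elf64Rel, Elf64Rela, symbolIndex, collectSymbolIndices) are hypothetical and are not taken from the patch; the field layout follows the generic ELF64 format, and the MIPS64 little-endian r_info encoding (which the patch handles via config->isMips64EL) is deliberately ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// ELF64 relocation entry without an explicit addend (SHT_REL).
struct Elf64Rel {
  uint64_t r_offset; // location the relocation applies to
  uint64_t r_info;   // symbol index in the high 32 bits, type in the low 32
};

// ELF64 relocation entry with an explicit addend (SHT_RELA).
struct Elf64Rela {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend; // the only field that Elf64Rel lacks
};

static_assert(sizeof(Elf64Rel) == 16, "REL entries are 16 bytes");
static_assert(sizeof(Elf64Rela) == 24, "RELA entries are 24 bytes");

// Hypothetical helper: the generic ELF64 symbol index extraction.
inline uint32_t symbolIndex(uint64_t info) {
  return static_cast<uint32_t>(info >> 32);
}

// Works for either entry type because both expose r_info.
template <class RelTy>
std::vector<uint32_t> collectSymbolIndices(const std::vector<RelTy> &relocs) {
  std::vector<uint32_t> indices;
  indices.reserve(relocs.size());
  for (const RelTy &r : relocs)
    indices.push_back(symbolIndex(r.r_info));
  return indices;
}
```

Because the call graph profile only needs the symbol index of each relocation, the extra r_addend field in a RELA entry can simply be skipped.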

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ayermolo created this revision. Jun 30 2021, 11:28 AM

Herald added subscribers: hoy, modimo, wenlei and 2 others. Jun 30 2021, 11:28 AM

ayermolo requested review of this revision. Jun 30 2021, 11:28 AM

Herald added a project: Restricted Project. Jun 30 2021, 11:28 AM

Herald added a subscriber: llvm-commits.

removing header include that sneaked in somehow

I have changed MC to unconditionally emit REL, so this change is not needed.

In D105217#2850892, @MaskRay wrote:

I have changed MC to unconditionally emit REL, so this change is not needed.

Unfortunately when running strip -S on ELF object files it converts REL to RELA.

Before:

[85] .llvm.call-graph-profile LLVM_CALL_GRAPH_PROFILE 0000000000000000 00db01 000168 08   E 92   0  1
[86] .rel.llvm.call-graph-profile REL  0000000000000000 01ec18 0005a0 10     92  85  8

After:

[76] .llvm.call-graph-profile LLVM_CALL_GRAPH_PROFILE 0000000000000000 0056aa 000168 08   E  0   0  1
[77] .rela.llvm.call-graph-profile RELA 0000000000000000 00dd18 000870 18   I 81  76  8

Supporting the usage model of running strip -S on object files before linking was the original motivation for changing the call graph profile to use relocations.
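The section header rows quoted above can be checked with a little arithmetic. The sketch below uses only the size and entry-size columns from those rows; the SectionSize type and entryCount helper are hypothetical names introduced for illustration. It shows that the conversion keeps the number of relocations (two per profile entry) and only widens each entry from 0x10 to 0x18 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Size and entry size as printed in the section header listing.
struct SectionSize {
  uint64_t size;
  uint64_t entsize;
};

constexpr uint64_t entryCount(SectionSize s) { return s.size / s.entsize; }

// Values copied from the "Before" and "After" rows.
constexpr SectionSize cgProfile{0x168, 0x08}; // .llvm.call-graph-profile
constexpr SectionSize relBefore{0x5a0, 0x10}; // .rel.llvm.call-graph-profile
constexpr SectionSize relaAfter{0x870, 0x18}; // .rela.llvm.call-graph-profile

static_assert(entryCount(cgProfile) == 45, "45 profile entries");
static_assert(entryCount(relBefore) == 90, "90 REL entries");
static_assert(entryCount(relaAfter) == 90, "90 RELA entries");
// Each profile entry has a 'from' and a 'to' relocation.
static_assert(entryCount(relBefore) == 2 * entryCount(cgProfile),
              "two relocations per profile entry");
```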

Thanks for the fix. Please add a test case of what a stripped binary would look like (yaml2obj probably easiest way to do this) to make sure this works. Also are there llvm-readobj changes that need to be made?

lld/ELF/Driver.cpp
63	Needed?
858–871	I don't think you need this vector, the getIndex lambda can be generalized as a templated lambda/function based on ArrayRef
906	Add a warning + bailout if a binary ever has both for some weird reason?
lld/ELF/InputFiles.h
251–252	Fix up comments to reflect that there is "always" here now.

ayermolo added inline comments. Jun 30 2021, 11:43 AM

lld/ELF/Driver.cpp
858–871	I wanted to avoid doing a rel/rela check on each invocation of the helper function.

I feel uneasy with the additional complexity. I am not sure we want this just to make strip -S happy. Can you use llvm-strip -S?

MaskRay requested changes to this revision. Jun 30 2021, 11:48 AM

This revision now requires changes to proceed. Jun 30 2021, 11:48 AM

In D105217#2850944, @MaskRay wrote:

I feel uneasy with the additional complexity. I am not sure we want this just to make strip -S happy. Can you use llvm-strip -S?

Although llvm-strip does have the "correct/expected" behavior, I think requiring it is too strict for users. Specifically, I am thinking of large build systems with multiple projects. Right now they would hit this check:

if (obj->cgProfileRel.size() != obj->cgProfile.size() * 2)
   fatal("number of relocations doesn't match Weights");

In context, that is not a descriptive error message.
Project owners would need to dig into the LLD source code and change their workflow.

Also, strip and objcopy are standard GNU tools that I think are used interchangeably with the LLVM tools, and it's worthwhile for LLD to continue supporting them rather than forcing users onto the LLVM tools.

I understand that this adds complexity, but I think the benefit of supporting more tools and reducing work for LLD users outweighs it.

ayermolo added a reviewer: jhenderson. Jun 30 2021, 12:45 PM

Why is strip -S part of the step running on intermediate object files?

The usual practice is to strip only linked images. If you want to have smaller intermediate object files, you may use -gsplit-dwarf or -gsplit-dwarf=single.

Harbormaster completed remote builds in B111819: Diff 355643. Jun 30 2021, 1:08 PM

In D105217#2851133, @MaskRay wrote:

Why is strip -S part of the step running on intermediate object files?

The usual practice is to strip only linked images. If you want to have smaller intermediate object files, you may use -gsplit-dwarf or -gsplit-dwarf=single.

Good question. My understanding of the usage model, after talking to someone knowledgeable about the build system, is that there are two issues. First, we have a large set of pre-built libraries that various projects link against, and they are built with monolithic debug information. Second is how caching of object files works within the build system: it gives projects the flexibility to reuse objects with and without debug information. The second issue could be mitigated if the build system supported dwo files, but right now it doesn't.
There is another potential issue: a post-processing tool for object files/binaries may not support split DWARF. For example, until recently the BOLT project only supported monolithic debug information.

First, .llvm.call-graph-profile is only emitted by -fprofile-use= and -fprofile-sample-use=, instrumentation PGO and sample PGO.
The use case is very specific. For sample PGO/BOLT there are other requirements from an external tool converting linux-perf data to a profile format recognized by llvm-project.
I can imagine that BOLT may have more requirements on LLVM tooling.
So I don't see why requiring llvm-strip will be another hindrance.
Also keep in mind that GNU binutils doesn't recognize the section type SHT_LLVM_*. (Folks have added dumping support to llvm-readobj.)

Second is how caching of object files works within the build system. It allows flexibility for projects to re-use objects with and without debug information.

If linker input size is not a concern, you may use ld -S instead of strip -S on .o files.

I don't think the additional complexity is all that much, especially given there's an actual use-case for it. It'd be different if the case was some hypothetical situation, but it isn't. Not all systems have all of LLVM installed as their toolchain, so forcing people away from using GNU tools seems like the wrong approach to me.

In D105217#2851379, @MaskRay wrote:

First, .llvm.call-graph-profile is only emitted by -fprofile-use= and -fprofile-sample-use=, instrumentation PGO and sample PGO.
The use case is very specific. For sample PGO/BOLT there are other requirements from an external tool converting linux-perf data to a profile format recognized by llvm-project.
I can imagine that BOLT may have more requirements on LLVM tooling.
So I don't see why requiring llvm-strip will be another hindrance.
Also keep in mind that GNU binutils doesn't recognize the section type SHT_LLVM_*. (Folks have added dumping support to llvm-readobj.)

Second is how caching of object files works within the build system. It allows flexibility for projects to re-use objects with and without debug information.

If linker input size is not a concern, you may use ld -S instead of strip -S on .o files.

Linker input size is a concern, as is moving all the object files (gigabytes' worth) from the machines where they are cached to the ones where the final link step happens.
I brought up BOLT only in the context of debug fission, to show that it's not always possible to enable it, depending on the workflow; it was not meant as an argument for this patch. Sorry for the confusion.
Speaking of distributed build environments: as James pointed out, they might not have the full LLVM toolchain installed. I understand that this adds complexity, but it's very small, and it enables maximum compatibility of LLD with standard tools that are used in production.

I filed a binutils bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=28035 Perhaps you can chime in?
The support needs a comment that this is specifically for GNU strip.

That said, my main objection is that we now waste additional 16 bytes for InputFile. Can it be avoided?

lld/ELF/Driver.cpp
864	drop braces

In D105217#2851379, @MaskRay wrote:

First, .llvm.call-graph-profile is only emitted by -fprofile-use= and -fprofile-sample-use=, instrumentation PGO and sample PGO.
The use case is very specific.

Not sure if I follow this. If what you meant is we don't need to have good support for it since this call graph profile is not used by everyone... I'm afraid I don't agree: 1. We used this broadly for hundreds of workloads. 2. Toolchains have many features most of which aren't used by many, but as long as those who use them are willing to maintain and improve it, why would we want to block such work just because it's not a mainstream feature?

For sample PGO/BOLT there are other requirements from an external tool converting linux-perf data to a profile format recognized by llvm-project.
I can imagine that BOLT may have more requirements on LLVM tooling.
So I don't see why requiring llvm-strip will be another hindrance.

This is more of a build system issue - requiring all gnu tools to be replaced by llvm tools in any large organization is going to be very difficult. I think it's reasonable to support some level of compatibility if the complexity isn't high.

In fact, there are many precedents for that. In LLD, we have a list of silently ignored switches for compatibility. We could have insisted on build system changes for all of them instead of accommodating them in the toolchain, but we favored compatibility.

You mentioned extra complexity. I agree that, where possible, a simpler implementation and a tighter contract are better, but the trade-off here between very minor complexity and compatibility doesn't seem very different from those silently ignored switches.

Also keep in mind that GNU binutils doesn't recognize the section type SHT_LLVM_*. (Folks have added dumping support to llvm-readobj.)

Second is how caching of object files works within the build system. It allows flexibility for projects to re-use objects with and without debug information.

If linker input size is not a concern, you may use ld -S instead of strip -S on .o files.

In D105217#2852279, @jhenderson wrote:

I don't think the additional complexity is all that much, especially given there's an actual use-case for it. It'd be different if the case was some hypothetical situation, but it isn't. Not all systems have all of LLVM installed as their toolchain, so forcing people away from using GNU tools seems like the wrong approach to me.

Exactly. As mentioned above, we have precedent for supporting some level of compatibility. Insisting on simplicity and a closed toolchain would do more harm than good.

In D105217#2855739, @wenlei wrote:

In D105217#2851379, @MaskRay wrote:

First, .llvm.call-graph-profile is only emitted by -fprofile-use= and -fprofile-sample-use=, instrumentation PGO and sample PGO.
The use case is very specific.

Not sure if I follow this. If what you meant is we don't need to have good support for it since this call graph profile is not used by everyone...

No.

I'm afraid I don't agree: 1. We used this broadly for hundreds of workloads. 2. Toolchains have many features most of which aren't used by many, but as long as those who use them are willing to maintain and improve it, why would we want to block such work just because it's not a mainstream feature?

I meant: some features have some particular requirement. We don't want to add support for something nobody uses.
For this point, I saw this patch so I knew you are going to use it. If it didn't add 16-bytes to each InputFile I would be ok with it.
However, it has such overhead so I need to balance the needs with the costs and want to ensure you would not add the cost if you could fix the build system internally.

In D105217#2852279, @jhenderson wrote:

I don't think the additional complexity is all that much, especially given there's an actual use-case for it. It'd be different if the case was some hypothetical situation, but it isn't. Not all systems have all of LLVM installed as their toolchain, so forcing people away from using GNU tools seems like the wrong approach to me.

Exactly. As mentioned above, we have precedent for supporting some level of compatibility. Insisting on simplicity and a closed toolchain would do more harm than good.

See my previous comment. My main concern is the size overhead to InputFile. I am thinking we are paying too much for a relatively less useful feature.

The fatal issue can be addressed by, e.g. emitting a warning instead or ignoring the section completely.

In D105217#2855688, @MaskRay wrote:

I filed a binutils bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=28035 Perhaps you can chime in?
The support needs a comment that this is specifically for GNU strip.

That said, my main objection is that we now waste additional 16 bytes for InputFile. Can it be avoided?

We only waste 16 bytes if someone uses binutils rather than the LLVM tools. Some users might be OK with that waste compared with switching a large fleet of servers to LLVM.
Thank you for creating a bug report, but even if it's fixed, the issue remains that servers are using older versions of binutils.

To add, from the previous patch: with RELA it was 322KB. Each entry is 56 bytes (8 + 24 + 24), so that's 5750 entries. With REL (8 + 16 + 16), that would be 40 bytes per entry, which translates to 230KB.
The binary is ~113MB, so with RELA it's about 0.28%, and with REL it's 0.2%.
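The size estimate above can be reproduced with a short calculation. This is only a sketch under stated assumptions: KB and MB are taken as decimal units (1000 and 1,000,000 bytes), and all constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kProfileEntryBytes = 8; // one call graph profile entry
constexpr uint64_t kRelBytes = 16;         // one ELF64 REL entry
constexpr uint64_t kRelaBytes = 24;        // one ELF64 RELA entry

// Each profile entry carries two relocations ('from' and 'to').
constexpr uint64_t bytesPerEdge(uint64_t relocBytes) {
  return kProfileEntryBytes + 2 * relocBytes;
}

// 322KB measured with RELA, assumed to mean 322,000 bytes.
constexpr uint64_t kRelaTotal = 322000;
constexpr uint64_t kEdges = kRelaTotal / bytesPerEdge(kRelaBytes);
constexpr uint64_t kRelTotal = kEdges * bytesPerEdge(kRelBytes);

static_assert(bytesPerEdge(kRelaBytes) == 56, "8 + 24 + 24");
static_assert(bytesPerEdge(kRelBytes) == 40, "8 + 16 + 16");
static_assert(kEdges == 5750, "number of profile entries");
static_assert(kRelTotal == 230000, "230KB with REL");

// Share of the binary, in percent.
inline double percentOfBinary(uint64_t bytes, uint64_t binaryBytes) {
  return 100.0 * static_cast<double>(bytes) / static_cast<double>(binaryBytes);
}
```

With a binary of 113,000,000 bytes, percentOfBinary gives a little over 0.28 for the RELA total and a little over 0.20 for the REL total, matching the figures quoted in the comment.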

My main concern has always been: I am not sure ArrayRef<Elf_Rel> cgProfileRel; ArrayRef<Elf_Rela> cgProfileRela; is the correct design.

Previously I was ok because one ArrayRef just added 16 bytes. Now you are taking an extra 16 bytes from each InputFile. Given the tiny benefit of this call graph profile optimization, I am not sure this is a good tradeoff.

Can't you just not cache cgProfileRela?

Re-did lld part so that relocation sections are not cached.

In D105217#2860630, @MaskRay wrote:

My main concern has always been: I am not sure ArrayRef<Elf_Rel> cgProfileRel; ArrayRef<Elf_Rela> cgProfileRela; is the correct design.

Previously I was ok because one ArrayRef just added 16 bytes. Now you are taking an extra 16 bytes from each InputFile. Given the tiny benefit of this call graph profile optimization, I am not sure this is a good tradeoff.

Can't you just not cache cgProfileRela?

Oh. I see. Something like this? I guess downside is that linking time will go up slightly because we have to scan through all the sections again.

Harbormaster completed remote builds in B112690: Diff 356823. Jul 6 2021, 3:31 PM

You also need a yaml2obj test case. This is adding new functionality. Any such patch needs a test case.

Oh. I see. Something like this? I guess downside is that linking time will go up slightly because we have to scan through all the sections again.

Yes, this is close. See my inline comment. The downside is likely negligible if you benchmark it... sizeof(InputSection) is ~184 bytes. I would be unhappy if the relatively minor feature takes 15% of its size.

lld/ELF/Driver.cpp
864	Not addressed, but see below: with inlining you can ignore this.
876	section index
886	`<` => `!=`
891	just inline `getIndices` here
lld/ELF/InputFiles.cpp
579–580	Replace `cgProfile` with `cgProfileSectionIndex`. Let `processRelocationsCGSection` (which probably should be renamed to something else, e.g. `processCallGraphRelocations`) bail out if `cgProfileSectionIndex` is 0 (SHN_UNDEF). You can also delete one loop there.
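The approach discussed here (store only the profile section's index, then rescan the section headers at use time) can be sketched independently of LLD. Everything below is a simplified stand-in rather than LLD's code: SectionHeader, findProfileRelocSection, and the k-prefixed constants are hypothetical names, although the numeric values match the standard ELF definitions of SHN_UNDEF, SHT_RELA, and SHT_REL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kShnUndef = 0; // SHN_UNDEF: no such section
constexpr uint32_t kShtRela = 4;  // SHT_RELA
constexpr uint32_t kShtRel = 9;   // SHT_REL

// Only the two header fields the scan needs.
struct SectionHeader {
  uint32_t type; // sh_type
  uint32_t info; // sh_info: for relocation sections, the section they apply to
};

// Returns the index of the REL or RELA section that applies to the profile
// section, or kShnUndef if the file has no profile or no such section.
inline uint32_t findProfileRelocSection(const std::vector<SectionHeader> &secs,
                                        uint32_t profileIndex) {
  if (profileIndex == kShnUndef)
    return kShnUndef; // bail out early: nothing was recorded for this file
  for (std::size_t i = 0; i != secs.size(); ++i) {
    const SectionHeader &sec = secs[i];
    if (sec.info != profileIndex)
      continue;
    if (sec.type == kShtRela || sec.type == kShtRel)
      return static_cast<uint32_t>(i);
  }
  return kShnUndef;
}
```

The cost is one extra pass over the section headers per object file, in exchange for keeping only a 32-bit index per input file instead of two cached array references.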

In D105217#2860864, @MaskRay wrote:

You also need a yaml2obj test case. This is adding new functionality. Any such patch needs a test case.

Oh. I see. Something like this? I guess downside is that linking time will go up slightly because we have to scan through all the sections again.

Yes, this is close. See my inline comment. The downside is likely negligible if you benchmark it... sizeof(InputSection) is ~184 bytes. I would be unhappy if the relatively minor feature takes 15% of its size.

I wanted to make sure this is on the correct path before adding tests; ELFDumper also needs to be modified.

Updated ELFDumper and a test that uses yaml2obj, plus some other cleanup/feedback incorporation.

Herald added a subscriber: rupprecht. Jul 7 2021, 1:10 PM

MaskRay added inline comments. Jul 7 2021, 1:52 PM

lld/ELF/InputFiles.h
253	uint32_t

MaskRay added inline comments. Jul 7 2021, 1:52 PM

lld/ELF/Driver.cpp
901	This consumes too much stack space. Perhaps `SmallVector<uint32_t, 32>` lld already uses `SmallVector<unsigned, 32>` so using 32 will not increase binary size.
906	Delete the unneeded blank line
llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test
256	Add a test comment: `## GNU strip may convert SHT_REL to SHT_RELA. Test we can handle SHT_RELA.`
259	Delete elf-cg-profile RUN lines. The aliases are tested by dedicated tests, no need for duplicating.
289	Delete if unused or make it match the reality
296	The offsets should match the reality: they relocate the values (0x0, 0x8, 0x10, ...)
llvm/tools/llvm-readobj/ELFDumper.cpp
6727	Drop `else` since we are using the early return pattern.
6733	Add a comment: MC unconditionally produces SHT_REL but GNU strip may convert the format to SHT_RELA (https://sourceware.org/bugzilla/show_bug.cgi?id=28035) I don't expect they will fix this any time soon, but good to have a reference to justify additional complexity here.

Harbormaster completed remote builds in B112857: Diff 357062. Jul 7 2021, 2:16 PM

addressing comments

ayermolo edited the summary of this revision. Jul 7 2021, 3:31 PM

LG, @jhenderson may have more comments

llvm/tools/llvm-readobj/ELFDumper.cpp
6778	drop parens beside `!=`

This revision is now accepted and ready to land. Jul 7 2021, 3:33 PM

Looks like if we run strip -S it changes rel to rela. Changed so that LLD, and ELFDumper, supports both REL and RELA for CG Profile.

Looks like => Make this certain. https://sourceware.org/bugzilla/show_bug.cgi?id=28035

In D105217#2863339, @MaskRay wrote:

Looks like if we run strip -S it changes rel to rela. Changed so that LLD, and ELFDumper, supports both REL and RELA for CG Profile.

Looks like => Make this certain. https://sourceware.org/bugzilla/show_bug.cgi?id=28035

My bad. Just my "verbal tic" :)

ayermolo edited the summary of this revision. Jul 7 2021, 3:37 PM

Addressing () comment.

Harbormaster completed remote builds in B112886: Diff 357097. Jul 7 2021, 4:19 PM

Test case(s) for the LLD portion of this patch?

lld/ELF/Driver.cpp
858–903	A quick skim of the area shows `//` is used for function-level comments.
886
lld/ELF/InputFiles.h
24	Do you need to add this include? My understanding of the style guide is that you don't need to add it, unless the code doesn't compile without (it is likely pulled in by other headers).
llvm/tools/llvm-readobj/ELFDumper.cpp
6705–6706	Style is overwhelmingly `//` for functions here. These don't need doxygen comments, since they're not part of a public interface.
6733	clang-format
6739–6741	Test case needed for this path.

addressing comments

In D105217#2863821, @jhenderson wrote:

Test case(s) for the LLD portion of this patch?

There doesn't seem to be any tests for LLD, even for old implementation. What would be best way? feed llvm bit code in to llvm-mc, run lld on it and check final binary symbol table for ordering? Add something to LLD to output from internal data structure?

lld/ELF/InputFiles.h
24	ugh, sneaked in somehow.

missed an update

In D105217#2865533, @ayermolo wrote:

In D105217#2863821, @jhenderson wrote:

Test case(s) for the LLD portion of this patch?

There doesn't seem to be any tests for LLD, even for old implementation. What would be best way? feed llvm bit code in to llvm-mc, run lld on it and check final binary symbol table for ordering? Add something to LLD to output from internal data structure?

See test/ELF/cgprofile-obj.s for .cg_profile directives. The old implementation only tested RELA, not REL. The current state is only REL. Now that we need RELA for GNU objcopy/strip compatibility, we need yaml2obj tests because assemblers don't emit RELA.

Please mark resolved comments as "done" (press submit).

In D105217#2865544, @MaskRay wrote:

In D105217#2865533, @ayermolo wrote:

In D105217#2863821, @jhenderson wrote:

Test case(s) for the LLD portion of this patch?

There doesn't seem to be any tests for LLD, even for old implementation. What would be best way? feed llvm bit code in to llvm-mc, run lld on it and check final binary symbol table for ordering? Add something to LLD to output from internal data structure?

See test/ELF/cgprofile-obj.s for .cg_profile directives. The old implementation only tested RELA, not REL. The current state is only REL. Now that we need RELA for GNU objcopy/strip compatibility, we need yaml2obj tests because assemblers don't emit RELA.

Please mark resolved comments as "done" (press submit).

Ah I see it. Not sure how I missed it.

ayermolo marked 24 inline comments as done. Jul 8 2021, 2:25 PM

Harbormaster completed remote builds in B113087: Diff 357358. Jul 8 2021, 3:35 PM

Added LLD test.

Harbormaster completed remote builds in B113121: Diff 357396. Jul 8 2021, 6:52 PM

jhenderson added inline comments. Jul 9 2021, 12:34 AM

lld/ELF/Driver.cpp
858–903
886	Ping? This hasn't been addressed...
lld/test/ELF/cgprofile-rela-obj.test
1 ↗	(On Diff #357396)	Add a comment (using '##' for comment markers) to this test to explain the purpose of this test. I think you can also drop `-obj` from the test name.
4 ↗	(On Diff #357396)	You don't need to specify the entry point explicitly. If you really want to suppress the warning, I'd just rename one of the symbols to `_start`.
8 ↗	(On Diff #357396)	Nit: add a blank line between the RUN and YAML blocks. I also have a personal preference for the flow of the test to be: # RUN commands # CHECK directives YAML
llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test
209
307

Addressing comments

lld/ELF/Driver.cpp
886	Sorry missed it.
lld/test/ELF/cgprofile-rela-obj.test
4 ↗	(On Diff #357396)	Hmm. Even with that still got warning: cannot find entry symbol _start; defaulting to 0x201120 Was probably doing something wrong. Just left as Aa.

MaskRay added inline comments. Jul 9 2021, 12:49 PM

lld/test/ELF/cgprofile-rela-obj.test
4 ↗	(On Diff #357396)	`_start` needs to be non-local. You can find the pattern in other tests. .globl _start _start:

Added _start

@MaskRay Gave it another shot. Got it to work this time. I guess global symbol needs to be at the end of Symbols in yaml. Previously I was getting linker error.

MaskRay added inline comments. Jul 9 2021, 2:11 PM

lld/test/ELF/cgprofile-rela.test
33	Delete unused `AddressAlign: 0x10`
73	Add `Offset: 0x0`
121	Delete STT_SECTION symbols not referenced by relocations.

Harbormaster completed remote builds in B113272: Diff 357614. Jul 9 2021, 3:29 PM

cleaned up cgprofile-rela.test

Missed comment fix.

ayermolo marked an inline comment as done. Jul 9 2021, 4:40 PM

Harbormaster completed remote builds in B113320: Diff 357667. Jul 9 2021, 6:46 PM

jhenderson added inline comments. Jul 12 2021, 4:05 AM

lld/test/ELF/cgprofile-rela.test
2	I'd rewrite the comment like this: "Under some circumstances, GNU tools strip/objcopy change REL to RELA. Test that LLD can handle call graph profile data relocated with RELA relocations."
33	I assume your aim is for 2-byte sections? If so, you can replace these lines with `Size: 2`, which I think expresses the intent better.
64	I believe the `Link: .symtab` is implicit for relocation sections. You can remove it.
65	Similarly: I think the address align field has a default value for reloc sections. No need to explicitly specify it.
llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test
209	Two typos.
307

Addressing rela test comments.

typos

ayermolo marked 5 inline comments as done. Jul 12 2021, 10:54 AM

MaskRay added inline comments. Jul 12 2021, 11:00 AM

lld/test/ELF/cgprofile-rela.test
2	You can attach https://sourceware.org/bugzilla/show_bug.cgi?id=28035 after "change REL to RELA"

Harbormaster completed remote builds in B113532: Diff 357989. Jul 12 2021, 11:31 AM

Added link to bugzilla to the test.

Harbormaster completed remote builds in B113572: Diff 358042. Jul 12 2021, 1:49 PM

MaskRay accepted this revision. Jul 12 2021, 6:41 PM

LGTM, except for one remaining typo fix.

llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test
307	Ping this typo fix.

Missed a typo.

Thanks for review. Will commit later today.

In D105217#2874507, @ayermolo wrote:

Thanks for review. Will commit later today.

Please update the summary (and commit message) about the use scenario.

Harbormaster completed remote builds in B113778: Diff 358326. Jul 13 2021, 11:42 AM

ayermolo edited the summary of this revision. Jul 13 2021, 12:04 PM

Closed by commit rG24129fbc9aa0: [LLD] Adding support for RELA for CG Profile. (authored by ayermolo). Jul 13 2021, 1:56 PM

This revision was automatically updated to reflect the committed changes.

ayermolo added a commit: rG24129fbc9aa0: [LLD] Adding support for RELA for CG Profile..

Revision Contents

Path: Size

lld/ELF/Driver.cpp: 64 lines
lld/ELF/InputFiles.h: 6 lines
lld/ELF/InputFiles.cpp: 14 lines
lld/test/ELF/cgprofile-rela.test: 117 lines
llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test: 133 lines
llvm/tools/llvm-readobj/ELFDumper.cpp: 85 lines

Diff 358420

lld/ELF/Driver.cpp

[54 lines not shown]

  #include "llvm/Support/GlobPattern.h"
  #include "llvm/Support/LEB128.h"
  #include "llvm/Support/Parallel.h"
  #include "llvm/Support/Path.h"
  #include "llvm/Support/TarWriter.h"
  #include "llvm/Support/TargetSelect.h"
  #include "llvm/Support/TimeProfiler.h"
  #include "llvm/Support/raw_ostream.h"
  #include <cstdlib>

modimo (Done): Needed?

  #include <utility>

  using namespace llvm;
  using namespace llvm::ELF;
  using namespace llvm::object;
  using namespace llvm::sys;
  using namespace llvm::support;
  using namespace lld;

[778 lines not shown; enclosing context: for (StringRef line : args::getLines(mb)) {]

      }
      if (InputSectionBase *from = findSection(fields[0]))
        if (InputSectionBase *to = findSection(fields[1]))
          config->callGraphProfile[std::make_pair(from, to)] += count;
    }

- template <class ELFT> static void readCallGraphsFromObjectFiles() {
-   auto getIndex = [&](ObjFile<ELFT> *obj, uint32_t index) {
-     const Elf_Rel_Impl<ELFT, false> &rel = obj->cgProfileRel[index];
-     return rel.getSymbol(config->isMips64EL);
-   };
+ // If SHT_LLVM_CALL_GRAPH_PROFILE and its relocation section exist, returns
+ // true and populates cgProfile and symbolIndices.
+ template <class ELFT>
+ static bool
+ processCallGraphRelocations(SmallVector<uint32_t, 32> &symbolIndices,
+                             ArrayRef<typename ELFT::CGProfile> &cgProfile,
+                             ObjFile<ELFT> *inputObj) {

MaskRay (Done): drop braces

MaskRay (Done): Not addressed, but see below: with inlining you can ignore this.

+   symbolIndices.clear();
+   const ELFFile<ELFT> &obj = inputObj->getObj();
+   ArrayRef<Elf_Shdr_Impl<ELFT>> objSections =
+       CHECK(obj.sections(), "could not retrieve object sections");
+   if (inputObj->cgProfileSectionIndex == SHN_UNDEF)
+     return false;

modimo (Done): I don't think you need this vector, the getIndex lambda can be generalized as a templated lambda/function based on ArrayRef

ayermolo (Author, Done): I wanted to avoid doing a rel/rela check on each invocation of the helper function.

+   cgProfile =
+       check(obj.template getSectionContentsAsArray<typename ELFT::CGProfile>(
+           objSections[inputObj->cgProfileSectionIndex]));

MaskRay (Done): section index

+   for (size_t i = 0, e = objSections.size(); i < e; ++i) {
+     const Elf_Shdr_Impl<ELFT> &sec = objSections[i];
+     if (sec.sh_info == inputObj->cgProfileSectionIndex) {
+       if (sec.sh_type == SHT_RELA) {
+         ArrayRef<typename ELFT::Rela> relas =
+             CHECK(obj.relas(sec), "could not retrieve cg profile rela section");
+         for (const typename ELFT::Rela &rel : relas)
+           symbolIndices.push_back(rel.getSymbol(config->isMips64EL));
+         break;
+       }

MaskRay (Done): `<` => `!=`

jhenderson (Done), suggested edit:

          break;
-       } else if (sec.sh_type == SHT_REL) {
+       if (sec.sh_type == SHT_REL) {
          ArrayRef<typename ELFT::Rel> rels =

jhenderson (Done): Ping? This hasn't been addressed...

ayermolo (Author, Done): Sorry missed it.

+       if (sec.sh_type == SHT_REL) {
+         ArrayRef<typename ELFT::Rel> rels =
+             CHECK(obj.rels(sec), "could not retrieve cg profile rel section");
+         for (const typename ELFT::Rel &rel : rels)
+           symbolIndices.push_back(rel.getSymbol(config->isMips64EL));

MaskRay (Done): just inline `getIndices` here

+         break;
+       }
+     }
+   }
+   if (symbolIndices.empty())
+     warn("SHT_LLVM_CALL_GRAPH_PROFILE exists, but relocation section doesn't");
+   return !symbolIndices.empty();
+ }

+ template <class ELFT> static void readCallGraphsFromObjectFiles() {

MaskRay (Done): This consumes too much stack space. Perhaps `SmallVector<uint32_t, 32>`. lld already uses `SmallVector<unsigned, 32>` so using 32 will not increase binary size.

+   SmallVector<uint32_t, 32> symbolIndices;
+   ArrayRef<typename ELFT::CGProfile> cgProfile;

jhenderson (Done), suggested edit:

  }
- /// If SHT_LLVM_CALL_GRAPH_PROFILE and it's relocation section exists returns
- /// true, and populates cgProfile and symbolIndices.
+ // If SHT_LLVM_CALL_GRAPH_PROFILE and its relocation section exist, returns true
+ // and populates cgProfile and symbolIndices.
  template <class ELFT>

jhenderson: A quick skim of the area shows `//` is used for function-level comments.

jhenderson (Done), suggested edit:

  }
- // If SHT_LLVM_CALL_GRAPH_PROFILE and it's relocation section exists returns
- // true, and populates cgProfile and symbolIndices.
+ // If SHT_LLVM_CALL_GRAPH_PROFILE and its relocation section exist, returns
+ // true and populates cgProfile and symbolIndices.
  template <class ELFT>

for (auto file : objectFiles) { for (auto file : objectFiles) {

auto *obj = cast<ObjFile<ELFT>>(file); auto *obj = cast<ObjFile<ELFT>>(file);

if (obj->cgProfileRel.empty()) if (!processCallGraphRelocations(symbolIndices, cgProfile, obj))

modimoUnsubmitted

Done

modimo: Add a warning + bailout if a binary ever has both for some weird reason?

MaskRayUnsubmitted

Done

MaskRay: Delete the unneeded blank line

continue; continue;

if (obj->cgProfileRel.size() != obj->cgProfile.size() * 2)

if (symbolIndices.size() != cgProfile.size() * 2)

fatal("number of relocations doesn't match Weights"); fatal("number of relocations doesn't match Weights");

for (uint32_t i = 0, size = obj->cgProfile.size(); i < size; ++i) {

const Elf_CGProfile_Impl<ELFT> &cgpe = obj->cgProfile[i]; for (uint32_t i = 0, size = cgProfile.size(); i < size; ++i) {

uint32_t fromIndex = getIndex(obj, i * 2); const Elf_CGProfile_Impl<ELFT> &cgpe = cgProfile[i];

uint32_t toIndex = getIndex(obj, i * 2 + 1); uint32_t fromIndex = symbolIndices[i * 2];

uint32_t toIndex = symbolIndices[i * 2 + 1];

auto *fromSym = dyn_cast<Defined>(&obj->getSymbol(fromIndex)); auto *fromSym = dyn_cast<Defined>(&obj->getSymbol(fromIndex));

auto *toSym = dyn_cast<Defined>(&obj->getSymbol(toIndex)); auto *toSym = dyn_cast<Defined>(&obj->getSymbol(toIndex));

if (!fromSym || !toSym) if (!fromSym || !toSym)

continue; continue;

auto *from = dyn_cast_or_null<InputSectionBase>(fromSym->section); auto *from = dyn_cast_or_null<InputSectionBase>(fromSym->section);

auto *to = dyn_cast_or_null<InputSectionBase>(toSym->section); auto *to = dyn_cast_or_null<InputSectionBase>(toSym->section);

if (from && to) if (from && to)

▲ Show 20 Lines • Show All 1,560 Lines • Show Last 20 Lines
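
The loop above pairs each profile entry with two consecutive symbol indices (entry `i` uses indices `2*i` and `2*i + 1`) after checking that the counts agree. A minimal standalone sketch of that pairing, assuming plain `std::vector` inputs; `Edge` and `pairEdges` are hypothetical names for illustration, not LLD types or functions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one call graph edge; not an LLD type.
struct Edge {
  uint32_t from;   // symbol index of the caller
  uint32_t to;     // symbol index of the callee
  uint64_t weight; // weight taken from the profile entry
};

// Pairs weights[i] with symbolIndices[2*i] and symbolIndices[2*i + 1].
// Returns false and leaves `edges` untouched when the counts do not match,
// which corresponds to the "number of relocations doesn't match Weights"
// condition in the patch.
bool pairEdges(const std::vector<uint32_t> &symbolIndices,
               const std::vector<uint64_t> &weights,
               std::vector<Edge> &edges) {
  if (symbolIndices.size() != weights.size() * 2)
    return false;
  edges.reserve(edges.size() + weights.size());
  for (size_t i = 0, size = weights.size(); i != size; ++i)
    edges.push_back({symbolIndices[i * 2], symbolIndices[i * 2 + 1], weights[i]});
  return true;
}
```

The real code treats a mismatch as fatal and then resolves each index to a `Defined` symbol; the sketch stops at the index level.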

lld/ELF/InputFiles.h

Show All 15 Lines
#include "llvm/ADT/CachedHashString.h"		#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/Comdat.h"		#include "llvm/IR/Comdat.h"
#include "llvm/Object/Archive.h"		#include "llvm/Object/Archive.h"
#include "llvm/Object/ELF.h"		#include "llvm/Object/ELF.h"
#include "llvm/Object/IRObjectFile.h"		#include "llvm/Object/IRObjectFile.h"
#include "llvm/Support/Threading.h"		#include "llvm/Support/Threading.h"
#include <map>		#include <map>
		jhenderson [Done]: Do you need to add this include? My understanding of the style guide is that you don't need to add it unless the code doesn't compile without it (it is likely pulled in by other headers).
		ayermolo [Done]: Ugh, sneaked in somehow.

namespace llvm {		namespace llvm {
struct DILineInfo;		struct DILineInfo;
class TarWriter;		class TarWriter;
namespace lto {		namespace lto {
class InputFile;		class InputFile;
}		}
} // namespace llvm		} // namespace llvm
▲ Show 20 Lines • Show All 210 Lines • ▼ Show 20 Lines	public:
bool splitStack = false;		bool splitStack = false;

// True if the file defines functions compiled with -fsplit-stack,		// True if the file defines functions compiled with -fsplit-stack,
// but had one or more functions with the no_split_stack attribute.		// but had one or more functions with the no_split_stack attribute.
bool someNoSplitStack = false;		bool someNoSplitStack = false;

// Pointer to this input file's .llvm_addrsig section, if it has one.		// Pointer to this input file's .llvm_addrsig section, if it has one.
const Elf_Shdr *addrsigSec = nullptr;		const Elf_Shdr *addrsigSec = nullptr;

// SHT_LLVM_CALL_GRAPH_PROFILE table.		// SHT_LLVM_CALL_GRAPH_PROFILE section index.
		modimo [Done]: Fix up comments to reflect that there is "always" here now.
ArrayRef<Elf_CGProfile> cgProfile;		uint32_t cgProfileSectionIndex = 0;
		MaskRay [Done]: uint32_t
// SHT_LLVM_CALL_GRAPH_PROFILE relocations, always in the REL format.
ArrayRef<Elf_Rel> cgProfileRel;

// Get cached DWARF information.		// Get cached DWARF information.
DWARFCache *getDwarf();		DWARFCache *getDwarf();

private:		private:
void initializeSections(bool ignoreComdats);		void initializeSections(bool ignoreComdats);
void initializeSymbols();		void initializeSymbols();
void initializeJustSymbols();		void initializeJustSymbols();
▲ Show 20 Lines • Show All 162 Lines • Show Last 20 Lines

lld/ELF/InputFiles.cpp

Show First 20 Lines • Show All 565 Lines • ▼ Show 20 Lines	void ObjFile<ELFT>::initializeSections(bool ignoreComdats) {

ArrayRef<Elf_Shdr> objSections = CHECK(obj.sections(), this);		ArrayRef<Elf_Shdr> objSections = CHECK(obj.sections(), this);
uint64_t size = objSections.size();		uint64_t size = objSections.size();
this->sections.resize(size);		this->sections.resize(size);
this->sectionStringTable =		this->sectionStringTable =
CHECK(obj.getSectionStringTable(objSections), this);		CHECK(obj.getSectionStringTable(objSections), this);

std::vector<ArrayRef<Elf_Word>> selectedGroups;		std::vector<ArrayRef<Elf_Word>> selectedGroups;
// SHT_LLVM_CALL_GRAPH_PROFILE Section Index.
size_t cgProfileSectionIndex = 0;

for (size_t i = 0, e = objSections.size(); i < e; ++i) {		for (size_t i = 0, e = objSections.size(); i < e; ++i) {
if (this->sections[i] == &InputSection::discarded)		if (this->sections[i] == &InputSection::discarded)
continue;		continue;
const Elf_Shdr &sec = objSections[i];		const Elf_Shdr &sec = objSections[i];

if (sec.sh_type == ELF::SHT_LLVM_CALL_GRAPH_PROFILE) {		if (sec.sh_type == ELF::SHT_LLVM_CALL_GRAPH_PROFILE)
		MaskRay [Done]: Replace `cgProfile` with `cgProfileSectionIndex`. Let `processRelocationsCGSection` (which probably should be renamed to something else, e.g. `processCallGraphRelocations`) bail out if `cgProfileSectionIndex` is 0 (SHN_UNDEF). You can also delete one loop there.
cgProfile =
check(obj.template getSectionContentsAsArray<Elf_CGProfile>(sec));
cgProfileSectionIndex = i;		cgProfileSectionIndex = i;
}

// SHF_EXCLUDE'ed sections are discarded by the linker. However,		// SHF_EXCLUDE'ed sections are discarded by the linker. However,
// if -r is given, we'll let the final link discard such sections.		// if -r is given, we'll let the final link discard such sections.
// This is compatible with GNU.		// This is compatible with GNU.
if ((sec.sh_flags & SHF_EXCLUDE) && !config->relocatable) {		if ((sec.sh_flags & SHF_EXCLUDE) && !config->relocatable) {
if (sec.sh_type == SHT_LLVM_ADDRSIG) {		if (sec.sh_type == SHT_LLVM_ADDRSIG) {
// We ignore the address-significance table if we know that the object		// We ignore the address-significance table if we know that the object
// file was created by objcopy or ld -r. This is because these tools		// file was created by objcopy or ld -r. This is because these tools
▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	void ObjFile<ELFT>::initializeSections(bool ignoreComdats) {
// such cases, the relocation section would attempt to reference a target		// such cases, the relocation section would attempt to reference a target
// section that has not yet been created. For simplicity, delay creation of		// section that has not yet been created. For simplicity, delay creation of
// relocation sections until now.		// relocation sections until now.
for (size_t i = 0, e = objSections.size(); i < e; ++i) {		for (size_t i = 0, e = objSections.size(); i < e; ++i) {
if (this->sections[i] == &InputSection::discarded)		if (this->sections[i] == &InputSection::discarded)
continue;		continue;
const Elf_Shdr &sec = objSections[i];		const Elf_Shdr &sec = objSections[i];

if (sec.sh_type == SHT_REL || sec.sh_type == SHT_RELA) {		if (sec.sh_type == SHT_REL || sec.sh_type == SHT_RELA)
this->sections[i] = createInputSection(sec);		this->sections[i] = createInputSection(sec);
if (cgProfileSectionIndex && sec.sh_info == cgProfileSectionIndex) {
if (sec.sh_type == SHT_REL)
cgProfileRel = CHECK(getObj().rels(sec), this);
}
}

// A SHF_LINK_ORDER section with sh_link=0 is handled as if it did not have		// A SHF_LINK_ORDER section with sh_link=0 is handled as if it did not have
// the flag.		// the flag.
if (!(sec.sh_flags & SHF_LINK_ORDER) || !sec.sh_link)		if (!(sec.sh_flags & SHF_LINK_ORDER) || !sec.sh_link)
continue;		continue;

InputSectionBase *linkSec = nullptr;		InputSectionBase *linkSec = nullptr;
if (sec.sh_link < this->sections.size())		if (sec.sh_link < this->sections.size())
▲ Show 20 Lines • Show All 1,230 Lines • Show Last 20 Lines
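
The review suggestion for this file is to record only the index of the profile section and to find its relocation section later by comparing `sh_info` against that index, with 0 (SHN_UNDEF) meaning "no profile section". A rough sketch of that lookup, assuming a simplified section table; `Kind`, `Section`, and `findProfileRelocSection` are hypothetical names and do not exist in LLD:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a section header.
enum class Kind { Other, CallGraphProfile, Rel, Rela };

struct Section {
  Kind kind;
  uint32_t info; // for Rel/Rela: index of the section being relocated
};

// Returns the index of the REL or RELA section that targets the call graph
// profile section, or 0 when either section is missing. Index 0 is the null
// section in ELF, so it is safe to use as the "not found" value.
size_t findProfileRelocSection(const std::vector<Section> &sections) {
  size_t profileIndex = 0;
  for (size_t i = 0, e = sections.size(); i != e; ++i)
    if (sections[i].kind == Kind::CallGraphProfile)
      profileIndex = i;
  if (profileIndex == 0)
    return 0; // bail out: no profile section recorded

  for (size_t i = 0, e = sections.size(); i != e; ++i) {
    const Section &sec = sections[i];
    if ((sec.kind == Kind::Rel || sec.kind == Kind::Rela) &&
        sec.info == profileIndex)
      return i;
  }
  return 0;
}
```

Accepting both `Rel` and `Rela` in the second loop is the point of the patch: the earlier code only recognised the REL form.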

lld/test/ELF/cgprofile-rela.test

This file was added.

				## Under some circumstances, GNU tools strip/objcopy change REL to RELA. https://sourceware.org/bugzilla/show_bug.cgi?id=28035
				## Test that LLD can handle call graph profile data relocated with RELA relocations.
				jhenderson [Done]: I'd rewrite the comment like this: "Under some circumstances, GNU tools strip/objcopy change REL to RELA. Test that LLD can handle call graph profile data relocated with RELA relocations."
				MaskRay [Not Done]: You can attach https://sourceware.org/bugzilla/show_bug.cgi?id=28035 after "change REL to RELA"
				# REQUIRES: x86

				# RUN: yaml2obj %s -o %t.o
				# RUN: ld.lld %t.o -o %t
				# RUN: llvm-nm --no-sort %t | FileCheck %s
				# RUN: ld.lld --no-call-graph-profile-sort %t.o -o %t
				# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=NO-CG

				# CHECK: 0000000000201124 t D
				# CHECK: 0000000000201122 t C
				# CHECK: 0000000000201128 t B
				# CHECK: 0000000000201120 t A
				# CHECK: 0000000000201126 T _start

				# NO-CG: 0000000000201120 t D
				# NO-CG: 0000000000201122 t C
				# NO-CG: 0000000000201124 t B
				# NO-CG: 0000000000201126 t A
				# NO-CG: 0000000000201128 T _start

				--- !ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_REL
				Machine: EM_X86_64
				Sections:
				- Name: .text.D
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Size: 2
				MaskRay [Done]: Delete unused `AddressAlign: 0x10`
				jhenderson [Done]: I assume your aim is for 2-byte sections? If so, you can replace these lines with `Size: 2`, which I think expresses the intent better.
				- Name: .text.C
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Size: 2
				- Name: .text.B
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Size: 2
				- Name: .text.A
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Size: 2
				- Name: .text._start
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Size: 2
				- Name: .llvm.call-graph-profile
				Type: SHT_LLVM_CALL_GRAPH_PROFILE
				Flags: [ SHF_EXCLUDE ]
				Link: .symtab
				AddressAlign: 0x1
				Entries:
				- Weight: 10
				- Weight: 10
				- Weight: 80
				- Weight: 40
				- Weight: 30
				- Weight: 90
				- Name: .rela.llvm.call-graph-profile
				Type: SHT_RELA
				Info: .llvm.call-graph-profile
				jhenderson [Done]: I believe the `Link: .symtab` is implicit for relocation sections. You can remove it.
				Relocations:
				jhenderson [Done]: Similarly: I think the address align field has a default value for reloc sections. No need to explicitly specify it.
				- Offset: 0x0
				Symbol: A
				Type: R_X86_64_NONE
				- Offset: 0x0
				Symbol: B
				Type: R_X86_64_NONE
				- Offset: 0x8
				Symbol: A
				MaskRay [Done]: Add `Offset: 0x0`
				Type: R_X86_64_NONE
				- Offset: 0x8
				Symbol: B
				Type: R_X86_64_NONE
				- Offset: 0x10
				Symbol: _start
				Type: R_X86_64_NONE
				- Offset: 0x10
				Symbol: B
				Type: R_X86_64_NONE
				- Offset: 0x18
				Symbol: A
				Type: R_X86_64_NONE
				- Offset: 0x18
				Symbol: C
				Type: R_X86_64_NONE
				- Offset: 0x20
				Symbol: B
				Type: R_X86_64_NONE
				- Offset: 0x20
				Symbol: C
				Type: R_X86_64_NONE
				- Offset: 0x28
				Symbol: C
				Type: R_X86_64_NONE
				- Offset: 0x28
				Symbol: D
				Type: R_X86_64_NONE
				Symbols:
				- Name: D
				Type: STT_FUNC
				Section: .text.D
				- Name: C
				Type: STT_FUNC
				Section: .text.C
				- Name: B
				Type: STT_FUNC
				Section: .text.B
				- Name: A
				Type: STT_FUNC
				Section: .text.A
				- Name: _start
				Binding: STB_GLOBAL
				Section: .text._start
				MaskRay [Done]: Delete STT_SECTION symbols not referenced by relocations.

llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test

Show All 34 Lines Entries:

- Weight: 98 - Weight: 98

EntSize: [[ENTSIZE=<none>]] EntSize: [[ENTSIZE=<none>]]

- Name: .rel.llvm.call-graph-profile - Name: .rel.llvm.call-graph-profile

Type: SHT_REL Type: SHT_REL

Info: .llvm.call-graph-profile Info: .llvm.call-graph-profile

Relocations: Relocations:

- Symbol: foo - Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x1 - Offset: 0x0

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x2 - Offset: 0x8

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x3 - Offset: 0x8

Symbol: foo Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

Symbols: Symbols:

- Name: foo - Name: foo

- Name: bar - Name: bar

## Check we report a warning when unable to get the content of the SHT_LLVM_CALL_GRAPH_PROFILE section. ## Check we report a warning when unable to get the content of the SHT_LLVM_CALL_GRAPH_PROFILE section.

# RUN: yaml2obj %s -DENTSIZE=0xF -o %t2.o # RUN: yaml2obj %s -DENTSIZE=0xF -o %t2.o

▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines Entries:

- Weight: 20 - Weight: 20

- Name: .rel.llvm.call-graph-profile - Name: .rel.llvm.call-graph-profile

Type: SHT_REL Type: SHT_REL

Info: .llvm.call-graph-profile Info: .llvm.call-graph-profile

Relocations: Relocations:

- Symbol: 1 - Symbol: 1

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x1 - Offset: 0x0

Symbol: 2 Symbol: 2

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x2 - Offset: 0x8

Symbol: 2 Symbol: 2

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x3 - Offset: 0x8

Symbol: 3 Symbol: 3

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x4 - Offset: 0x10

Symbol: 0x0 ## Null symbol. Symbol: 0x0 ## Null symbol.

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x5 - Offset: 0x10

Symbol: 0x4 ## This index goes past the end of the symbol table. Symbol: 0x4 ## This index goes past the end of the symbol table.

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Name: .strtab - Name: .strtab

Type: SHT_STRTAB Type: SHT_STRTAB

Content: "0041004200" ## '\0', 'A', '\0', 'B', '\0' Content: "0041004200" ## '\0', 'A', '\0', 'B', '\0'

Symbols: Symbols:

- StName: 1 ## 'A' - StName: 1 ## 'A'

- StName: 0xFF ## An arbitrary currupted index in the string table. - StName: 0xFF ## An arbitrary currupted index in the string table.

Show All 20 Lines FileHeader:

Data: ELFDATA2LSB Data: ELFDATA2LSB

Type: ET_DYN Type: ET_DYN

Sections: Sections:

- Name: .llvm.call-graph-profile - Name: .llvm.call-graph-profile

Type: SHT_LLVM_CALL_GRAPH_PROFILE Type: SHT_LLVM_CALL_GRAPH_PROFILE

Entries: Entries:

- Weight: 89 - Weight: 89

- Weight: 98 - Weight: 98

EntSize: [[ENTSIZE=<none>]]

Symbols: Symbols:

- Name: foo - Name: foo

- Name: bar - Name: bar

## Check we report a warning when the number of relocation section entries does not match the number of call graph entries. ## Check we report a warning when the number of relocation section entries does not match the number of call graph entries.

# RUN: yaml2obj %s --docnum=4 -o %t5.o # RUN: yaml2obj %s --docnum=4 -o %t5.o

# RUN: llvm-readobj %t5.o --cg-profile 2>&1 | FileCheck %s -DFILE=%t5.o --check-prefix=LLVM-RELOC-GRAPH-NOT-MATCH # RUN: llvm-readobj %t5.o --cg-profile 2>&1 | FileCheck %s -DFILE=%t5.o --check-prefix=LLVM-RELOC-GRAPH-NOT-MATCH

# RUN: llvm-readobj %t5.o --elf-cg-profile 2>&1 | FileCheck %s -DFILE=%t5.o --check-prefix=LLVM-RELOC-GRAPH-NOT-MATCH # RUN: llvm-readobj %t5.o --elf-cg-profile 2>&1 | FileCheck %s -DFILE=%t5.o --check-prefix=LLVM-RELOC-GRAPH-NOT-MATCH

Show All 15 Lines FileHeader:

Type: ET_DYN Type: ET_DYN

Machine: EM_X86_64 Machine: EM_X86_64

Sections: Sections:

- Name: .llvm.call-graph-profile - Name: .llvm.call-graph-profile

Type: SHT_LLVM_CALL_GRAPH_PROFILE Type: SHT_LLVM_CALL_GRAPH_PROFILE

Entries: Entries:

- Weight: 89 - Weight: 89

- Weight: 98 - Weight: 98

EntSize: [[ENTSIZE=<none>]]

- Name: .rel.llvm.call-graph-profile - Name: .rel.llvm.call-graph-profile

Type: SHT_REL Type: SHT_REL

Info: .llvm.call-graph-profile Info: .llvm.call-graph-profile

Relocations: Relocations:

- Symbol: foo - Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x1 - Offset: 0x0

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x2 - Offset: 0x8

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x3 - Offset: 0x8

Symbol: foo Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x4 - Offset: 0x10

Symbol: foo Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

Symbols: Symbols:

- Name: foo - Name: foo

- Name: bar - Name: bar

## Check we report a warning when a relocation section cant't be loaded. ## Check we report a warning when a REL relocation section can't be loaded.

jhendersonUnsubmitted

Done

- Name: bar

- ## Check we report a warning when a relocation, REL, section cant't be loaded.

+ ## Check we report a warning when a REL relocation section cant't be loaded.

# RUN: yaml2obj %s --docnum=5 -o %t6.o

jhenderson:

jhendersonUnsubmitted

Done

- Name: bar

- ## Check we report a warning when a REl relocation section cant't be loaded.

+ ## Check we report a warning when a REL relocation section can't be loaded.

# RUN: yaml2obj %s --docnum=5 -o %t6.o

jhenderson: Two typos.

# RUN: yaml2obj %s --docnum=5 -o %t6.o # RUN: yaml2obj %s --docnum=5 -o %t6.o

# RUN: llvm-readobj %t6.o --cg-profile 2>&1 | FileCheck %s -DFILE=%t6.o --check-prefix=LLVM-RELOC-WRONG-SIZE # RUN: llvm-readobj %t6.o --cg-profile 2>&1 | FileCheck %s -DFILE=%t6.o --check-prefix=LLVM-RELOC-WRONG-SIZE

# RUN: llvm-readobj %t6.o --elf-cg-profile 2>&1 | FileCheck %s -DFILE=%t6.o --check-prefix=LLVM-RELOC-WRONG-SIZE # RUN: llvm-readobj %t6.o --elf-cg-profile 2>&1 | FileCheck %s -DFILE=%t6.o --check-prefix=LLVM-RELOC-WRONG-SIZE

# LLVM-RELOC-WRONG-SIZE: warning: '[[FILE]]': unable to load relocations for SHT_LLVM_CALL_GRAPH_PROFILE section: section [index 2] has invalid sh_entsize: expected 16, but got 24 # LLVM-RELOC-WRONG-SIZE: warning: '[[FILE]]': unable to load relocations for SHT_LLVM_CALL_GRAPH_PROFILE section: section [index 2] has invalid sh_entsize: expected 16, but got 24

# LLVM-RELOC-WRONG-SIZE-NEXT: CGProfile [ # LLVM-RELOC-WRONG-SIZE-NEXT: CGProfile [

# LLVM-RELOC-WRONG-SIZE-NEXT: CGProfileEntry { # LLVM-RELOC-WRONG-SIZE-NEXT: CGProfileEntry {

# LLVM-RELOC-WRONG-SIZE-NEXT: Weight: 89 # LLVM-RELOC-WRONG-SIZE-NEXT: Weight: 89

Show All 10 Lines FileHeader:

Type: ET_DYN Type: ET_DYN

Machine: EM_X86_64 Machine: EM_X86_64

Sections: Sections:

- Name: .llvm.call-graph-profile - Name: .llvm.call-graph-profile

Type: SHT_LLVM_CALL_GRAPH_PROFILE Type: SHT_LLVM_CALL_GRAPH_PROFILE

Entries: Entries:

- Weight: 89 - Weight: 89

- Weight: 98 - Weight: 98

EntSize: [[ENTSIZE=<none>]]

- Name: .rel.llvm.call-graph-profile - Name: .rel.llvm.call-graph-profile

Type: SHT_REL Type: SHT_REL

Info: .llvm.call-graph-profile Info: .llvm.call-graph-profile

Relocations: Relocations:

- Symbol: foo - Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x1 - Offset: 0x0

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x2 - Offset: 0x8

Symbol: bar Symbol: bar

Type: R_X86_64_NONE Type: R_X86_64_NONE

- Offset: 0x3 - Offset: 0x8

Symbol: foo Symbol: foo

Type: R_X86_64_NONE Type: R_X86_64_NONE

EntSize: 24 EntSize: 24

Symbols: Symbols:

- Name: foo - Name: foo

- Name: bar - Name: bar

## GNU strip may convert SHT_REL to SHT_RELA. Test we can handle SHT_RELA.

MaskRayUnsubmitted

Done

MaskRay: Add a test comment: `## GNU strip may convert SHT_REL to SHT_RELA. Test we can handle SHT_RELA.`

# RUN: yaml2obj %s --docnum=6 -o %t7.o

# RUN: llvm-readobj %t7.o --cg-profile | FileCheck %s --check-prefix=LLVM-RELA

# RUN: llvm-readelf %t7.o --cg-profile | FileCheck %s --check-prefix=GNU-RELA

MaskRayUnsubmitted

Done

MaskRay: Delete elf-cg-profile RUN lines. The aliases are tested by dedicated tests, no need for duplicating.

# LLVM-RELA: CGProfile [

# LLVM-RELA-NEXT: CGProfileEntry {

# LLVM-RELA-NEXT: From: foo (1)

# LLVM-RELA-NEXT: To: bar (2)

# LLVM-RELA-NEXT: Weight: 89

# LLVM-RELA-NEXT: }

# LLVM-RELA-NEXT: CGProfileEntry {

# LLVM-RELA-NEXT: From: bar (2)

# LLVM-RELA-NEXT: To: foo (1)

# LLVM-RELA-NEXT: Weight: 98

# LLVM-RELA-NEXT: }

# LLVM-RELA-NEXT: ]

# GNU-RELA: GNUStyle::printCGProfile not implemented

--- !ELF

FileHeader:

Class: ELFCLASS64

Data: ELFDATA2LSB

Type: ET_DYN

Machine: EM_X86_64

Sections:

- Name: .llvm.call-graph-profile

Type: SHT_LLVM_CALL_GRAPH_PROFILE

Entries:

- Weight: 89

- Weight: 98

- Name: .rela.llvm.call-graph-profile

Type: SHT_RELA

MaskRayUnsubmitted

Done

MaskRay: Delete if unused or make it match the reality

Info: .llvm.call-graph-profile

Relocations:

- Symbol: foo

Type: R_X86_64_NONE

- Offset: 0x0

Symbol: bar

Type: R_X86_64_NONE

MaskRayUnsubmitted

Done

MaskRay: The offsets should match the reality: they relocate the values (0x0, 0x8, 0x10, ...)
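
To make the offset pattern in that comment concrete: the test inputs on this page use two relocations per profile entry (From and To), both pointing at the entry they relocate, with entries 8 bytes apart (0x0, 0x8, 0x10, ...). A small sketch that generates the expected offsets, assuming an 8-byte entry; `kEntrySize` and `relocationOffsets` are hypothetical names used only here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: each profile entry occupies 8 bytes, as the 0x0/0x8/0x10
// offsets in the test inputs suggest.
constexpr uint64_t kEntrySize = 8;

// Returns the r_offset values for `numEntries` profile entries: two
// relocations (From, To) per entry, both at the start of that entry.
std::vector<uint64_t> relocationOffsets(size_t numEntries) {
  std::vector<uint64_t> offsets;
  offsets.reserve(numEntries * 2);
  for (size_t i = 0; i != numEntries; ++i) {
    offsets.push_back(i * kEntrySize); // From symbol
    offsets.push_back(i * kEntrySize); // To symbol
  }
  return offsets;
}
```

For three entries this yields 0x0, 0x0, 0x8, 0x8, 0x10, 0x10, which is the sequence the updated YAML uses in place of the old 0x1, 0x2, 0x3 values.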

- Offset: 0x8

Symbol: bar

Type: R_X86_64_NONE

- Offset: 0x8

Symbol: foo

Type: R_X86_64_NONE

Symbols:

- Name: foo

- Name: bar

## Check we report a warning when a RELA relocation section can't be loaded.

jhendersonUnsubmitted

Done

- Name: bar

- ## Check we report a warning when a relocation, RELA, section cant't be loaded.

+ ## Check we report a warning when a RELA relocation section cant't be loaded.

# RUN: yaml2obj %s --docnum=7 -o %t8.o

jhenderson:

jhendersonUnsubmitted

Done

- Name: bar

- ## Check we report a warning when a RELA relocation section cant't be loaded.

+ ## Check we report a warning when a RELA relocation section can't be loaded.

# RUN: yaml2obj %s --docnum=7 -o %t8.o

jhenderson:

jhendersonUnsubmitted

Done

jhenderson: Ping this typo fix.

# RUN: yaml2obj %s --docnum=7 -o %t8.o

# RUN: llvm-readobj %t8.o --cg-profile 2>&1 | FileCheck %s -DFILE=%t8.o --check-prefix=LLVM-RELOC-WRONG-SIZE-RELA

# RUN: llvm-readobj %t8.o --elf-cg-profile 2>&1 | FileCheck %s -DFILE=%t8.o --check-prefix=LLVM-RELOC-WRONG-SIZE-RELA

# LLVM-RELOC-WRONG-SIZE-RELA: warning: '[[FILE]]': unable to load relocations for SHT_LLVM_CALL_GRAPH_PROFILE section: section [index 2] has invalid sh_entsize: expected 24, but got 16

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: CGProfile [

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: CGProfileEntry {

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: Weight: 89

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: }

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: CGProfileEntry {

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: Weight: 98

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: }

# LLVM-RELOC-WRONG-SIZE-RELA-NEXT: ]
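
The two WRONG-SIZE checks rely on the ELF64 entry sizes: a REL entry is 16 bytes and a RELA entry is 24 bytes, so swapping the `EntSize` values triggers "expected 16, but got 24" and "expected 24, but got 16" respectively. A sketch of that validation, assuming ELF64 layouts; `Elf64Rel`, `Elf64Rela`, `expectedEntSize`, and `hasValidEntSize` are illustrative names, not the llvm-readobj implementation:

```cpp
#include <cassert>
#include <cstdint>

// ELF64 relocation entry layouts (illustrative local definitions).
struct Elf64Rel {
  uint64_t r_offset;
  uint64_t r_info;
};

struct Elf64Rela {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend; // the explicit addend is what distinguishes RELA
};

static_assert(sizeof(Elf64Rel) == 16, "ELF64 REL entries are 16 bytes");
static_assert(sizeof(Elf64Rela) == 24, "ELF64 RELA entries are 24 bytes");

// Entry size a well-formed relocation section of the given kind must declare.
uint64_t expectedEntSize(bool isRela) {
  return isRela ? sizeof(Elf64Rela) : sizeof(Elf64Rel);
}

// True when the section's declared sh_entsize matches its relocation kind.
bool hasValidEntSize(bool isRela, uint64_t shEntSize) {
  return shEntSize == expectedEntSize(isRela);
}
```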

--- !ELF

FileHeader:

Class: ELFCLASS64

Data: ELFDATA2LSB

Type: ET_DYN

Machine: EM_X86_64

Sections:

- Name: .llvm.call-graph-profile

Type: SHT_LLVM_CALL_GRAPH_PROFILE

Entries:

- Weight: 89

- Weight: 98

- Name: .rela.llvm.call-graph-profile

Type: SHT_RELA

Info: .llvm.call-graph-profile

Relocations:

- Symbol: foo

Type: R_X86_64_NONE

- Offset: 0x0

Symbol: bar

Type: R_X86_64_NONE

- Offset: 0x8

Symbol: bar

Type: R_X86_64_NONE

- Offset: 0x8

Symbol: foo

Type: R_X86_64_NONE

EntSize: 16

Symbols:

- Name: foo

- Name: bar

llvm/tools/llvm-readobj/ELFDumper.cpp


Show First 20 Lines • Show All 6,696 Lines • ▼ Show 20 Lines	for (const VerNeed &VN : *V) {
}		}
}		}
}		}

template <class ELFT> void LLVMELFDumper<ELFT>::printHashHistograms() {		template <class ELFT> void LLVMELFDumper<ELFT>::printHashHistograms() {
W.startLine() << "Hash Histogram not implemented!\n";		W.startLine() << "Hash Histogram not implemented!\n";
}		}

		// Returns true if rel/rela section exists, and populates SymbolIndices.
		// Otherwise returns false.
		jhenderson [Done]: Style is overwhelmingly `//` for functions here. These don't need doxygen comments, since they're not part of a public interface.
		template <class ELFT>
		static bool getSymbolIndices(const typename ELFT::Shdr *CGRelSection,
		const ELFFile<ELFT> &Obj,
		const LLVMELFDumper<ELFT> *Dumper,
		SmallVector<uint32_t, 128> &SymbolIndices) {
		if (!CGRelSection) {
		Dumper->reportUniqueWarning(
		"relocation section for a call graph section doesn't exist");
		return false;
		}

		if (CGRelSection->sh_type == SHT_REL) {
		typename ELFT::RelRange CGProfileRel;
		Expected<typename ELFT::RelRange> CGProfileRelOrError =
		Obj.rels(*CGRelSection);
		if (!CGProfileRelOrError) {
		Dumper->reportUniqueWarning("unable to load relocations for "
		"SHT_LLVM_CALL_GRAPH_PROFILE section: " +
		toString(CGProfileRelOrError.takeError()));
		return false;
		}
		MaskRay [Done]: Drop `else` since we are using the early return pattern.

		CGProfileRel = *CGProfileRelOrError;
		for (const typename ELFT::Rel &Rel : CGProfileRel)
		SymbolIndices.push_back(Rel.getSymbol(Obj.isMips64EL()));
		} else {
		// MC unconditionally produces SHT_REL, but GNU strip/objcopy may convert
		MaskRay [Done]: Add a comment: MC unconditionally produces SHT_REL but GNU strip may convert the format to SHT_RELA (https://sourceware.org/bugzilla/show_bug.cgi?id=28035). I don't expect they will fix this any time soon, but good to have a reference to justify additional complexity here.
		jhenderson [Done]: clang-format
		// the format to SHT_RELA
		// (https://sourceware.org/bugzilla/show_bug.cgi?id=28035)
		typename ELFT::RelaRange CGProfileRela;
		Expected<typename ELFT::RelaRange> CGProfileRelaOrError =
		Obj.relas(*CGRelSection);
		if (!CGProfileRelaOrError) {
		Dumper->reportUniqueWarning("unable to load relocations for "
		"SHT_LLVM_CALL_GRAPH_PROFILE section: " +
		jhenderson [Done]: Test case needed for this path.
		toString(CGProfileRelaOrError.takeError()));
		return false;
		}

		CGProfileRela = *CGProfileRelaOrError;
		for (const typename ELFT::Rela &Rela : CGProfileRela)
		SymbolIndices.push_back(Rela.getSymbol(Obj.isMips64EL()));
		}

		return true;
		}
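
The helper above does the same thing for both relocation formats: it reads the symbol index out of each entry and ignores everything else, including the RELA addend. A minimal sketch of that idea for ELF64, where the symbol index is the upper 32 bits of `r_info`; the struct and function names are hypothetical, and the MIPS64 little-endian special case handled by `getSymbol(isMips64EL)` is deliberately left out:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative ELF64 relocation entries; not the LLVM Object types.
struct Elf64Rel {
  uint64_t r_offset;
  uint64_t r_info;
};

struct Elf64Rela {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend;
};

// Appends the symbol index of every relocation to `out`. Works for both
// entry types because only r_info is read. Assumes the standard ELF64
// encoding (symbol index in the upper 32 bits of r_info).
template <class RelTy>
void appendSymbolIndices(const std::vector<RelTy> &relocs,
                         std::vector<uint32_t> &out) {
  for (const RelTy &rel : relocs)
    out.push_back(static_cast<uint32_t>(rel.r_info >> 32));
}
```

Because the indices end up in one flat vector either way, the caller no longer needs to know which format the input used.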

template <class ELFT> void LLVMELFDumper<ELFT>::printCGProfile() {		template <class ELFT> void LLVMELFDumper<ELFT>::printCGProfile() {
llvm::MapVector<const Elf_Shdr *, const Elf_Shdr *> SecToRelocMap;		llvm::MapVector<const Elf_Shdr *, const Elf_Shdr *> SecToRelocMap;

auto IsMatch = [](const Elf_Shdr &Sec) -> bool {		auto IsMatch = [](const Elf_Shdr &Sec) -> bool {
return Sec.sh_type == ELF::SHT_LLVM_CALL_GRAPH_PROFILE;		return Sec.sh_type == ELF::SHT_LLVM_CALL_GRAPH_PROFILE;
};		};
this->getSectionAndRelocations(IsMatch, SecToRelocMap);		this->getSectionAndRelocations(IsMatch, SecToRelocMap);

for (const auto &CGMapEntry : SecToRelocMap) {		for (const auto &CGMapEntry : SecToRelocMap) {
const Elf_Shdr *CGSection = CGMapEntry.first;		const Elf_Shdr *CGSection = CGMapEntry.first;
const Elf_Shdr *CGRelSection = CGMapEntry.second;		const Elf_Shdr *CGRelSection = CGMapEntry.second;

Expected<ArrayRef<Elf_CGProfile>> CGProfileOrErr =		Expected<ArrayRef<Elf_CGProfile>> CGProfileOrErr =
this->Obj.template getSectionContentsAsArray<Elf_CGProfile>(*CGSection);		this->Obj.template getSectionContentsAsArray<Elf_CGProfile>(*CGSection);
if (!CGProfileOrErr) {		if (!CGProfileOrErr) {
this->reportUniqueWarning(		this->reportUniqueWarning(
"unable to load the SHT_LLVM_CALL_GRAPH_PROFILE section: " +		"unable to load the SHT_LLVM_CALL_GRAPH_PROFILE section: " +
toString(CGProfileOrErr.takeError()));		toString(CGProfileOrErr.takeError()));
return;		return;
}		}

Elf_Rel_Range CGProfileRel;		SmallVector<uint32_t, 128> SymbolIndices;
bool UseReloc = (CGRelSection != nullptr);		bool UseReloc =
if (UseReloc) {		getSymbolIndices<ELFT>(CGRelSection, this->Obj, this, SymbolIndices);
Expected<Elf_Rel_Range> CGProfileRelaOrError =		if (UseReloc && SymbolIndices.size() != CGProfileOrErr->size() * 2) {
		MaskRay [Done]: drop parens beside `!=`
this->Obj.rels(*CGRelSection);
if (!CGProfileRelaOrError) {
this->reportUniqueWarning("unable to load relocations for "
"SHT_LLVM_CALL_GRAPH_PROFILE section: " +
toString(CGProfileRelaOrError.takeError()));
UseReloc = false;
} else
CGProfileRel = *CGProfileRelaOrError;

if (UseReloc && CGProfileRel.size() != (CGProfileOrErr->size() * 2)) {
this->reportUniqueWarning(		this->reportUniqueWarning(
"number of from/to pairs does not match number of frequencies");		"number of from/to pairs does not match number of frequencies");
UseReloc = false;		UseReloc = false;
}		}
} else
this->reportUniqueWarning(
"relocation section for a call graph section doesn't exist");

auto GetIndex = [&](uint32_t Index) {
const Elf_Rel_Impl<ELFT, false> &Rel = CGProfileRel[Index];
return Rel.getSymbol(this->Obj.isMips64EL());
};

ListScope L(W, "CGProfile");		ListScope L(W, "CGProfile");
for (uint32_t I = 0, Size = CGProfileOrErr->size(); I != Size; ++I) {		for (uint32_t I = 0, Size = CGProfileOrErr->size(); I != Size; ++I) {
const Elf_CGProfile &CGPE = (*CGProfileOrErr)[I];		const Elf_CGProfile &CGPE = (*CGProfileOrErr)[I];
DictScope D(W, "CGProfileEntry");		DictScope D(W, "CGProfileEntry");
if (UseReloc) {		if (UseReloc) {
uint32_t From = GetIndex(I * 2);		uint32_t From = SymbolIndices[I * 2];
uint32_t To = GetIndex(I * 2 + 1);		uint32_t To = SymbolIndices[I * 2 + 1];
W.printNumber("From", this->getStaticSymbolName(From), From);		W.printNumber("From", this->getStaticSymbolName(From), From);
W.printNumber("To", this->getStaticSymbolName(To), To);		W.printNumber("To", this->getStaticSymbolName(To), To);
}		}
W.printNumber("Weight", CGPE.cgp_weight);		W.printNumber("Weight", CGPE.cgp_weight);
}		}
}		}
}		}

▲ Show 20 Lines • Show All 391 Lines • Show Last 20 Lines

Diff 358420