This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Update dynamic relocations from section relocations
ClosedPublic

Authored by yota9 on Jan 18 2022, 2:34 PM.

Details

Summary

This patch changes patchELFAllocatableRelaSections from going through
old relocations sections and update the relocation offsets to emitting
the relocations stored in binary sections. This is needed in case we
would like to remove and add dynamic relocations during BOLT work and it
is used by golang support pass. Note: Currently we emit relocations in
the old sections, so the total number of them should be equal or less
of old number.

Testing: No special tests are neeeded, since this patch does not fix
anything or add new functionality (it only prepares to add). Every
PIC-compiled test binary will use this code and thus become a test.
But just in case the aarch64 dynamic relocations tests were added.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Diff Detail

Event Timeline

yota9 created this revision.Jan 18 2022, 2:34 PM
yota9 requested review of this revision.Jan 18 2022, 2:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 18 2022, 2:34 PM
yota9 updated this revision to Diff 401777.Jan 20 2022, 2:05 PM

Replace isJmpRel() with IsJmpRel flag in relocation struct. Since some of the linkers are inconsistent (e.g. the LD places R_AARCH64_TLSDESC in .rela.plt and lld places it in .rela.dyn) the flag usage looks like more robust solution, then on-type decisions.

yota9 updated this revision to Diff 401782.Jan 20 2022, 2:14 PM
This comment was removed by yota9.
yota9 updated this revision to Diff 402265.Jan 22 2022, 3:01 PM

Add aarch64 dynamic relocations tests

yota9 edited the summary of this revision. (Show Details)Jan 22 2022, 3:04 PM
maksfb added inline comments.Jan 24 2022, 12:47 AM
bolt/include/bolt/Core/Relocation.h
54

Instead of introducing this flag, could we change the rewriter to classify relocation types based on the input?

bolt/lib/Rewrite/RewriteInstance.cpp
4816
4821
bolt/test/runtime/AArch64/Inputs/runtime_relocs.c
3
11–12
bolt/test/runtime/AArch64/runtime_relocs.c
10

Is it necessary to run the binary?

49
51
53
yota9 marked 7 inline comments as done.Jan 24 2022, 9:01 AM
yota9 added inline comments.
bolt/include/bolt/Core/Relocation.h
54

Sorry I'm not quite understand. Do you mean to classify it based on the relocation type? I did it in the first iteration of the patch, but as I said above we have a problem here: for example LD places R_AARCH64_TLSDESC in .rela.plt and lld places it in .rela.dyn, so there is not stable way to determine where did the relocation came from. Or you mean something else?

bolt/test/runtime/AArch64/runtime_relocs.c
10

Well, no, it is just extra check the the binary works fine. Is there any reason to create non-runnable tests? AFAIU the X86 test bot won't run aarch64 tests even if they are not runnable, or I'm wrong?

yota9 added inline comments.Jan 24 2022, 9:07 AM
bolt/include/bolt/Core/Relocation.h
54

Or you mean to create the map of REL_TYPE->section while reading the relocations? Well it might work (of course if for some reason we won't met the same rel type in both sections). Plus we will need to handle this structure explicitly in case the new relocations are created during bolt work..

maksfb added inline comments.Jan 24 2022, 10:06 AM
bolt/include/bolt/Core/Relocation.h
54

Exactly. Typically, only specific relocation types are classified as "jump", but if the input contains more we can extend the class.

yota9 updated this revision to Diff 402596.Jan 24 2022, 11:08 AM

Replace IsJmpRel field with std::map and resolve other comments

yota9 marked 2 inline comments as done.Jan 24 2022, 11:09 AM

@maksfb Thank you for you review! I've resolved most of the comments with the latest patch update :)

maksfb added inline comments.Jan 24 2022, 2:08 PM
bolt/include/bolt/Core/Relocation.h
51–53

I feel uneasy about overloading the Value field. An alternative will be to keep MCSymbol -> index map for dynamic symbols. What do you think?

bolt/include/bolt/Rewrite/RewriteInstance.h
448
449

DenseMap is a better choice, maybe even IndexedMap.

bolt/lib/Rewrite/RewriteInstance.cpp
2129

Does it make sense to pre-populate the map with types that are known to be jump relocations?

yota9 added inline comments.Jan 24 2022, 2:26 PM
bolt/include/bolt/Core/Relocation.h
51–53

Well to be honest I love this solution more.
Static relocations -> Extracted value
Dynamic relocations -> Symbol value
It seems logical for me, I'm not sure that we need this overcomplications with extra maps to be hones...

bolt/lib/Rewrite/RewriteInstance.cpp
2129

Well, currently I just don't see the reason to do this over using the values that came from the binary

yota9 updated this revision to Diff 402962.Jan 25 2022, 10:31 AM
yota9 marked 2 inline comments as done.

Replace std::map with DenseMap

bolt/include/bolt/Rewrite/RewriteInstance.h
449

You're right, DenseMap is a bit of a better choice here. IndexedMap will consume more memory as I remember, since relocation type number might be quite big.

maksfb added inline comments.Jan 25 2022, 11:54 AM
bolt/include/bolt/Core/Relocation.h
51–53

It might be a bit more code, but less hacky. The index of the symbol in the input/output dynamic symbol table, is not a property of a relocation object. Note that you actually need the index in the output file. When we decide to update/rewrite the dynamic symbol table, the index may change. I suggest we add an interface to the rewriter that will return an index for a given MCSymbol (it can use the map for the current implementation).

yota9 added inline comments.Jan 25 2022, 12:08 PM
bolt/include/bolt/Core/Relocation.h
51–53

Well, in case the symbol table will be updated one day we will need to create such a map anyway, you're right. I will do this :)

yota9 updated this revision to Diff 405027.Feb 1 2022, 11:57 AM

Add SymbolIndex map

yota9 marked 2 inline comments as done.Feb 1 2022, 11:59 AM
yota9 added inline comments.
bolt/include/bolt/Core/Relocation.h
51–53

Hello @maksfb ! I've added SymbolIndex map as we've discussed :)

yota9 updated this revision to Diff 405284.Feb 2 2022, 7:59 AM

clang-format

maksfb added inline comments.Feb 8 2022, 9:54 PM
bolt/include/bolt/Rewrite/RewriteInstance.h
451

Could you put a more accurate description? I.e. "index in the dynamic symbol table". Also, not all symbols will be in the table, right? Only those referenced by relocations?

452

Perhaps DenseMap here as well?

bolt/lib/Rewrite/RewriteInstance.cpp
2107

nit: there's no need for the variable as there's only one usage.

2108

nit: use RType defined above. You can make it const.

4797–4799

I'd rather put the code under getOutputDynamicSymbolIndex(MCSymbol *).

yota9 updated this revision to Diff 408427.Feb 14 2022, 7:59 AM
yota9 marked 4 inline comments as done.

Resolve comments

bolt/include/bolt/Rewrite/RewriteInstance.h
451

Done. Yes, currently only relocation-related symbols will appear there. To be honest I just didn't find the way to get all the symbols in table order, anyway currently it is not needed, I will add a NOTE about this.

452

In my mind unordered_map is good choice here, I've forgot to fix it before upload. AFAIK DenseMap keeps the data in one allocation, I don't really think it is needed here, there might be quite a lot of symbols, so I feel like many small allocations is more winning here. Also the fata locality won't really change the overall performance. But I don't really insist, if you'd like I will replace it with DenseMap

bolt/lib/Rewrite/RewriteInstance.cpp
2108

My bad, thanks

4797–4799

I was thinking about this too, but rewriteinstance lacks of such wrappers so I've decided to add it in place. But I don't mind, so done :)

It looks good. Thanks for the fixes.

In the summary: " so we the total number" -> "so the total number".

bolt/lib/Rewrite/RewriteInstance.cpp
4756

Can you use Section->getInputFileOffset()?

4781
4800
4809
yota9 updated this revision to Diff 408836.Feb 15 2022, 5:37 AM
yota9 marked 4 inline comments as done.

Address comments

yota9 added a comment.Feb 15 2022, 5:37 AM

@maksfb Thank you for your review and valuable comments! :)

bolt/lib/Rewrite/RewriteInstance.cpp
4756

Thanks!

There's a small typo left in the summary.

Regarding the testing, we need to make sure the relocation sections remain the same. This could be done via binary comparison, but unfortunately the order of relocations becomes different.

yota9 edited the summary of this revision. (Show Details)Feb 15 2022, 9:13 AM
yota9 updated this revision to Diff 408917.Feb 15 2022, 9:17 AM

@maksfb Strange, fixed now :) + clang-formatted one line.

Regarding the order question, yes, the reorder is somewhat expected. Do you have ideas how to test it?

yota9 added a comment.Feb 15 2022, 9:59 AM

@maksfb The relocations are usually sorted by the offset in the file. But I didn't sort them, since I don't see any real reason to do this, it doesn't affect the runtime. Although theoretically it might help with the testing, but it is time & space consuming to find and reorder all the relocations and store them in needed order, so I don't really think we need to do this..

At a couple cases I looked at, the relocations were unsorted before and are sorted now with this change.

@maksfb Interesting, usually I see that relocations are sorted initially. Now they will be sorted in storage order and per section order, which does not guarantee the offset order.. Anyway if you have any ideas about this please let me know :)

I can do manual testing on internal binaries. More scalable approach is to alter the binary comparison to allow a set of modifications that preserve binary semantics. Ordering relocations would be an example of such transformation. Naturally we can sort them for the comparison purposes.

@maksfb Thanks! We can discuss such approach one day if you'd like (out of scope of this review, I think), since to be honest right now I don't have deep understanding how to make it properly..

maksfb accepted this revision.Feb 15 2022, 10:04 PM

One way to do the binary comparison in this case, is to dump relocations (e.g. via llvm-objdump) from mismatched sections and sort them. But you are right, it's beyond the scope of this diff.

This revision is now accepted and ready to land.Feb 15 2022, 10:04 PM
yota9 added a comment.Feb 16 2022, 2:14 AM

Oh you mean by external script. I think it is possible to make this, however it might be a bit challenging with no-symbol (e.g. relative) relocations. But we could always find the symbol by offset of course. I was also thinking about something like reorder-functions=user approach, but it is extra logic which I would not like to add to bolt it self.

This revision was automatically updated to reflect the committed changes.