This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Track fragment info for all split fragments
ClosedPublic

Authored by FPar on Aug 17 2022, 11:27 AM.

Details

Summary

To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Diff Detail

Event Timeline

FPar created this revision.Aug 17 2022, 11:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2022, 11:27 AM
Herald added a subscriber: ayermolo. · View Herald Transcript
FPar requested review of this revision.Aug 17 2022, 11:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2022, 11:27 AM

Thanks Fabian! Overall, very nice work, although I feel that the diff stack in general has been a bit light on tests that guarantee that the changes have the intended effect. I suggest one test in one of the comments below.

bolt/lib/Core/FunctionLayout.cpp
207–215

Can you explain why was that added a bit better?

bolt/lib/Rewrite/RewriteInstance.cpp
3728–3761

This will write fragments in this order, assuming we have two funcs with three fragments each:

Func1.main_frag Func2.main_frag Func1.frag2 Func1.frag3 Func2.frag2 Func2.frag3

I'm not sure this is what you want? Aren't you looking for grouping all fragments N together? (bear in mind that this code only runs for non-reloc mode, I think reloc mode is correct in grouping all fragments together)

4524–4538

Can we add a simple test to check that this is working as intended?

Mistakes in the symbol table update are nasty because the program will not crash and we may take a long time to detect (since it takes a human to read the symbol table and see that it doesn't make sense). For this case, a test is a good way of making sure this code won't break in the future.

4670–4675

This needs to be aligned with getBinaryFunctionContainingAddress. If this function returned that this address is part of this function, but we can't find it in any of its fragments, then it is an assertion. At this point in the stack, is getBinaryFunctionContainingAddress already considering fragments? (I would imagine so)

FPar updated this revision to Diff 454655.Aug 22 2022, 5:06 PM

Address @rafauler's comments

  • Clarify why main fragment cannot be cleared.
  • Assert not using multiple fragments in non-relocation for mapping code sections.
  • Conservatively checking address ranges when updating symbol section indices of secondoary entrypoints.
FPar added a comment.Aug 22 2022, 5:10 PM

Thanks for the feedback @rafauler!

bolt/lib/Core/FunctionLayout.cpp
207–215

Better?

bolt/lib/Rewrite/RewriteInstance.cpp
3728–3761

You are right. Splitting into more than 2 fragments in non-relocation mode requires more work beyond this. I would rather have that in another diff so for now this checks that the function is split only 2-ways.

4670–4675

getBinaryFunctionContainingAddress considers the original address range of the function, so it is not aware of fragments at all. This is fine though, because at this point Symbol.st_value still has the original value.

I agree about the assert. This checks now explicitly that the secondary entrypoint can be found in any of the function's fragments.

FPar added inline comments.Aug 22 2022, 5:28 PM
bolt/lib/Rewrite/RewriteInstance.cpp
4524–4538

I have added a test at bolt/test/X86/fragmented-symbols.s in D132052 (easier to test in this diff). Does that cover what you were thinking of?

rafauler accepted this revision.Aug 22 2022, 6:44 PM

LGTM

bolt/lib/Core/FunctionLayout.cpp
207–215

Yes, thanks

bolt/lib/Rewrite/RewriteInstance.cpp
4524–4538

Yes, that's great, thanks

This revision is now accepted and ready to land.Aug 22 2022, 6:44 PM
This revision was automatically updated to reflect the committed changes.