This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Fix handling of code references from unmodified code
ClosedPublic

Authored by maksfb on Jun 10 2023, 2:02 PM.

Details

Summary

In lite mode (default for X86), BOLT optimizes and relocates functions
with profile. The rest of the code is preserved, but if it references
relocated code such references have to be updated. The update is handled
by scanExternalRefs() function. Note that we cannot solely rely on
relocations written by the linker, as not all code references are
exposed to the linker. Additionally, the linker can modify certain
instructions and relocations will no longer match the code.

With this change, start using symbolic disassembler for scanning code
for references in scanExternalRefs(). Unlike the previous approach, the
symbolizer properly detects and creates references for instructions with
multiple/ambiguous symbolic operands and handles cases where a
relocation doesn't match any operand. See test cases for examples.

Diff Detail

Event Timeline

maksfb created this revision.Jun 10 2023, 2:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2023, 2:02 PM
Herald added a subscriber: pengfei. · View Herald Transcript
maksfb requested review of this revision.Jun 10 2023, 2:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2023, 2:02 PM
Amir accepted this revision.Jun 12 2023, 10:06 AM

The change looks reasonable to me. One concern is whether using SymbolicDisasm for all skipped functions has a noticeable runtime penalty - did nfc script report a significant increase?

This revision is now accepted and ready to land.Jun 12 2023, 10:06 AM

I looked at the output of the NFC script and didn't see any warning. Do you have an example of such a warning? I did a manual measurement on a large service and didn't notice any difference in runtime though.

Amir added a comment.Jun 12 2023, 10:26 AM

I looked at the output of the NFC script and didn't see any warning. Do you have an example of such a warning? I did a manual measurement on a large service and didn't notice any difference in runtime though.

By default the script produces timing.log in cwd with output from time -f %e %M, that is, seconds and max rss in Kb, for sides A and B.