This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] AArch64: Emit text objects
ClosedPublic

Authored by yota9 on Mar 20 2022, 5:34 AM.

Details

Summary

BOLT treats AArch64 objects located in text as empty functions with
constant islands. Emit them with at least 8-byte alignment to the new
text section.
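
As an illustration of the "at least 8-byte alignment" rule, here is a minimal C++ sketch. It is not taken from the patch: `alignUp` and `textObjectAlignment` are hypothetical helper names, and the sketch assumes every alignment is a power of two.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not BOLT's API): round Address up to the next
// multiple of Alignment. Alignment is assumed to be a power of two.
constexpr uint64_t alignUp(uint64_t Address, uint64_t Alignment) {
  return (Address + Alignment - 1) & ~(Alignment - 1);
}

// Hypothetical helper: pick the alignment used when emitting a text object.
// The object's original alignment is kept, but never drops below 8 bytes.
constexpr uint64_t textObjectAlignment(uint64_t OriginalAlignment) {
  return std::max<uint64_t>(OriginalAlignment, 8);
}
```

Under these assumptions, an object that originally asked for 4-byte alignment is still placed on an 8-byte boundary, while an object that asked for 16 bytes keeps its stricter alignment.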

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Event Timeline

yota9 created this revision. Mar 20 2022, 5:34 AM
Herald added a project: Restricted Project. Mar 20 2022, 5:34 AM
yota9 requested review of this revision. Mar 20 2022, 5:34 AM
yota9 updated this revision to Diff 418232. Mar 25 2022, 8:14 AM

Move test from runtime

rafauler accepted this revision. Mar 30 2022, 3:19 PM

Overall LGTM.

The model that BOLT tries to create in memory is one in which every byte in an executable section has an associated BinaryFunction. From the point of view of the processor, an executable section should contain code, and BOLT models all code as the contents of a function.

BOLT deals poorly with data in code: it marks the function containing the data and avoids processing it. Besides, on x86 it is often bad practice to put data in the code section: the processor has separate caches for instructions and data, so non-instruction bytes in the instruction stream needlessly pollute the i-cache.

In AArch64, since it's a RISC processor that absolutely needs to put stuff in the executable sections due to limited range access, I expanded BOLT's IR to consider that a function might have data in code that is easily identifiable via $d symbol markers in the ELF file.
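
To make the role of the mapping symbols concrete, here is a minimal C++ sketch of how data ranges could be recovered from them. It is not BOLT's implementation: `MappingSymbol`, `isDataMarker`, `isCodeMarker`, and `dataRanges` are hypothetical names, and the sketch assumes the symbols of one section are already sorted by offset. It relies only on the AArch64 ELF convention that `$d` (optionally `$d.<suffix>`) starts a data region and `$x` (optionally `$x.<suffix>`) starts a region of A64 instructions.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one mapping symbol inside a section.
struct MappingSymbol {
  uint64_t Offset;   // offset of the symbol within the section
  std::string Name;  // e.g. "$x", "$d", "$d.1"
};

// "$d" or "$d.<suffix>" marks the start of data.
bool isDataMarker(const std::string &Name) {
  return Name == "$d" || Name.rfind("$d.", 0) == 0;
}

// "$x" or "$x.<suffix>" marks the start of A64 instructions.
bool isCodeMarker(const std::string &Name) {
  return Name == "$x" || Name.rfind("$x.", 0) == 0;
}

// Returns half-open [start, end) ranges that hold data rather than code.
// Assumes Symbols is sorted by Offset.
std::vector<std::pair<uint64_t, uint64_t>>
dataRanges(const std::vector<MappingSymbol> &Symbols, uint64_t SectionSize) {
  std::vector<std::pair<uint64_t, uint64_t>> Ranges;
  bool InData = false;
  uint64_t Start = 0;
  for (const MappingSymbol &S : Symbols) {
    if (isDataMarker(S.Name)) {
      if (!InData) {  // open a new data range
        InData = true;
        Start = S.Offset;
      }
    } else if (isCodeMarker(S.Name) && InData) {
      Ranges.emplace_back(Start, S.Offset);  // code resumes, close the range
      InData = false;
    }
  }
  if (InData)  // data runs to the end of the section
    Ranges.emplace_back(Start, SectionSize);
  return Ranges;
}
```

In this sketch, a region that begins with `$d` and is never followed by `$x` extends to the end of the section, which is roughly the shape of a freestanding object in text.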

Given this context, an empty function with a constant island (freestanding data in code that is not part of any function) is a curious thing to see. Where is this happening?

bolt/lib/Passes/Aligner.cpp
177

Is this in sync with LongJmp?

bolt/test/AArch64/text_data.c
17

Just curious, how would this ever happen naturally (without forcing the compiler to do so)? If this is a synthetic test, it would be good to have a test that shows the real use case of freestanding data in code. But it's fine if it's not possible (e.g. if this is created as part of a non-C compiler implementing some specific runtime).

This revision is now accepted and ready to land. Mar 30 2022, 3:19 PM

nit: Could you rename text_data.c -> text-data.c ?

yota9 added inline comments. Mar 30 2022, 4:45 PM
bolt/lib/Passes/Aligner.cpp
177
bolt/test/AArch64/text_data.c
17

This is a rare case, and the compiler is not to blame. Such things can appear in hand-written assembly, and examples of such issues were reported in previous BOLT repos, like this one: https://github.com/facebookincubator/BOLT/issues/48 . As you said, thanks to mapping symbols we can now handle such cases a bit better for AArch64, by moving the whole object into the newly created text section :)

This revision was automatically updated to reflect the committed changes.