This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use "l" prefix for data sections under medium/large code model
ClosedPublic

Authored by aeubanks on Apr 20 2023, 1:16 PM.

Details

Summary

And also set the SHF_X86_64_LARGE section flag.

gcc only uses the "l" prefix and SHF_X86_64_LARGE in the medium code model for data larger than -mlarge-data-threshold. But it seems more consistent to use it in the large code model as well in case separate parts of the binary aren't compiled with the large code model and also have a .data/.bss/.rodata section.
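
As a rough illustration of the naming scheme described above, here is a minimal standalone sketch, not the patch itself. `DataKind`, `SectionChoice`, and `chooseDataSection` are hypothetical names invented for this example. The caller passes in the decision of whether data counts as "large", because the thread debates which code models should set it. The flag values are the standard ELF and x86-64 psABI constants.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a section kind; this is not LLVM's SectionKind.
enum class DataKind { Data, BSS, ReadOnly };

// ELF section flag bits (generic ELF flags plus the x86-64 specific one).
constexpr std::uint64_t SHF_WRITE = 0x1;
constexpr std::uint64_t SHF_ALLOC = 0x2;
constexpr std::uint64_t SHF_X86_64_LARGE = 0x10000000;

struct SectionChoice {
  std::string Name;
  std::uint64_t Flags;
};

// Picks a section name and flags. IsLarge is supplied by the caller
// (assumption: it is derived from the code model elsewhere).
inline SectionChoice chooseDataSection(DataKind Kind, bool IsLarge) {
  std::string Base;
  std::uint64_t Flags = SHF_ALLOC;
  switch (Kind) {
  case DataKind::Data:
    Base = "data";
    Flags |= SHF_WRITE;
    break;
  case DataKind::BSS:
    Base = "bss";
    Flags |= SHF_WRITE;
    break;
  case DataKind::ReadOnly:
    Base = "rodata";
    break;
  }
  if (IsLarge)
    Flags |= SHF_X86_64_LARGE; // mark the section as a large section
  // Large sections get the "l" prefix: .ldata, .lbss, .lrodata.
  return {(IsLarge ? ".l" : ".") + Base, Flags};
}
```

With this sketch, `chooseDataSection(DataKind::BSS, true)` yields the name `.lbss` with `SHF_X86_64_LARGE` set, while passing `false` yields plain `.bss`.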

Event Timeline

aeubanks created this revision. Apr 20 2023, 1:16 PM
Herald added a project: Restricted Project. Apr 20 2023, 1:16 PM
aeubanks requested review of this revision. Apr 20 2023, 1:16 PM
tkoeppe accepted this revision. Apr 20 2023, 1:20 PM
This revision is now accepted and ready to land. Apr 20 2023, 1:20 PM
  1. Largedata is currently x86-only, so an alternative would be to move some of this into llvm/lib/Target/X86/X86TargetObjectFile.cpp. There is very similar code in e.g. llvm/lib/Target/RISCV/RISCVTargetObjectFile.cpp to select the "small data/bss" sections (RISC-V "small data" is not really the same thing: it places some data within 12-bit offsets from _global_pointer in the main executable, so that the linker can delete the load of the upper bits by relying on GP already being loaded). However, I think large-data sections are a somewhat generic concept that could be used on other architectures in the same way (e.g. it seems like it could be implemented about the same way and be just as useful for aarch64). So implementing it generically is probably OK.
  2. -mcmodel=large on x86 will need more work to be interoperable in the same way that -mcmodel=medium is intended to be. Since it is supposed to allow arbitrarily large CODE as well, we would also need to place "large code" into large-code sections, so that sticking large code in between does not break relocations from small code to small data (ro or rw). As such, until that happens, which I think needs a discussion on ABI lists, I'd like to not change the behavior of -mcmodel=large yet.
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
631

This is fine as you have it in the compiler, but I'll just note that we cannot really do RELRO for .ldata.rel.ro sections in the linker, because today's program loaders only permit a single GNU_RELRO segment -- so .ldata.rel.ro has to be treated just like .ldata in the linker -- and thus stay as RW during program execution.

860

(Assuming we keep all this in the generic code) This should be abstracted so that it can work for !X86 in the future. Or, at least assert(getTargetTriple().getArch() == Triple::x86_64) in this block.

aeubanks updated this revision to Diff 524453. May 22 2023, 1:21 PM

address comments

For what we want to do with -mcmodel=large, I have posted https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU and CCed some gcc/binutils folks.

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
653–656

Use bool IsLarge = ... && ... or avoid the used-once variable.

858

Use &&

2177

The canonical spelling liked by both clang-format and clang-tidy is /*IsLarge=*/false (no space).
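
To make the spelling concrete, here is a small standalone sketch of the argument-comment style. `sectionPrefix` and `smallReadOnlyPrefix` are hypothetical helpers written for this example, not functions from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration only; not an LLVM API.
inline std::string sectionPrefix(bool IsReadOnly, bool IsLarge) {
  std::string Name = IsLarge ? ".l" : ".";
  Name += IsReadOnly ? "rodata" : "data";
  return Name;
}

inline std::string smallReadOnlyPrefix() {
  // Each bare boolean literal is labeled with the parameter name,
  // written as /*Name=*/value with no space after the comment.
  return sectionPrefix(/*IsReadOnly=*/true, /*IsLarge=*/false);
}
```

The comment names have to match the parameter names in the declaration for tooling to check them, which is the main reason to keep the spelling uniform.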

llvm/lib/Target/TargetMachine.cpp
47

return getCodeModel() == CodeModel::Medium

llvm/test/CodeGen/X86/code-model-elf-sections.ll
4

It will be useful to test fdata-sections

aeubanks updated this revision to Diff 524869. May 23 2023, 1:44 PM

address comments

aeubanks marked 2 inline comments as done. May 23 2023, 1:44 PM
aeubanks added inline comments.
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
653–656

the reason for this and the other style choices is that D149288 (which I'll submit alongside this patch) will expand upon these

MaskRay accepted this revision. May 24 2023, 7:46 PM

Looks great! According to https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU , -mcmodel=large sections should be able to use .ldata as well, even though GCC hasn't made the change yet.

llvm/test/CodeGen/X86/code-model-elf-sections.ll
15

Consider llvm-readelf -S. The tabular output is easier to read.

aeubanks updated this revision to Diff 525781. May 25 2023, 1:48 PM

use llvm-readelf instead of llvm-readobj

aeubanks updated this revision to Diff 525789. May 25 2023, 2:05 PM

update test