This is an archive of the discontinued LLVM Phabricator instance.

[MC,AArch64] Suppress local symbol to STT_SECTION conversion for GOT relocations
ClosedPublic

Authored by MaskRay on Aug 22 2023, 8:20 PM.

Details

Summary

Assemblers change certain relocations referencing a local symbol to
reference the section symbol instead. This conversion is disabled for
many conditions (shouldRelocateWithSymbol), e.g. TLS symbol, for most
targets (including AArch32, x86, PowerPC, and RISC-V) GOT-generating
relocations.

However, AArch64 encodes the GOT-generating intent in MCValue::RefKind
instead of MCSymbolRef::Kind (see commit
0999cbd0b9ed8aa893cce10d681dec6d54b200ad (2014)), therefore not affected
by the code case MCSymbolRefExpr::VK_GOT:. As GNU ld and ld.lld
create GOT entries based on the symbol, ignoring addend, the two ldr
instructions will share the same GOT entry, which is not expected:

ldr     x1, [x1, :got_lo12:x]  // converted to .data+0
ldr     x1, [x1, :got_lo12:y]  // converted to .data+4

.data
// .globl x, y  would suppress STT_SECTION conversion
x:
.zero 4
y:
.long 42

This patch changes AArch64 to suppress local symbol to STT_SECTION
conversion for GOT relocations, matching most other targets. x and y
will use different GOT entries, which IMO is the most sensable behavior.

With this change, the ABI decision on https://github.com/ARM-software/abi-aa/issues/217
will only affect relocations explicitly referencing STT_SECTION symbols, e.g.

ldr     x1, [x1, :got_lo12:(.data+0)]
ldr     x1, [x1, :got_lo12:(.data+4)]
// I consider this unreasonable uses

IMO all reasonable use cases are unaffected.

Link: https://github.com/llvm/llvm-project/issues/63418
GNU assembler PR: https://sourceware.org/bugzilla/show_bug.cgi?id=30788

Diff Detail

Event Timeline

MaskRay created this revision.Aug 22 2023, 8:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2023, 8:20 PM
MaskRay requested review of this revision.Aug 22 2023, 8:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2023, 8:20 PM
MaskRay added a comment.EditedAug 22 2023, 8:23 PM

Note:

Having both MCValue and MCSymbol parameters in

bool needsRelocateWithSymbol(const MCValue &Val, const MCSymbol &Sym,
                             unsigned Type) const override;

is unfortunate but unavoidable due to .weakref handling.

To not affect .weakref behavior (strange directive, see https://sourceware.org/pipermail/binutils/2020-June/111676.html) on Mips/PowerPC, we need the const MCSymbol &Sym parameter, even if in all the other cases Sym can be easily inferred from const MCValue &Val.

MaskRay added inline comments.Aug 22 2023, 8:26 PM
llvm/test/MC/AArch64/elf-reloc-got-local.s
1 ↗(On Diff #552584)

Sorry, this file is unneeded as the tests are subsumed by arm64-elf-relocs.s. I'll remove this file in a new revision.

MaskRay edited the summary of this revision. (Show Details)Aug 22 2023, 8:34 PM
MaskRay edited the summary of this revision. (Show Details)Aug 22 2023, 8:37 PM

Thanks for the patch. I'm in favour of making the change as this does avoid an awkward corner case in the specification with no obvious drawbacks. A couple of suggestions for extra tests. I think there could be an alternative implementation that doesn't need an extra parameter although if it is possible it does have its own drawbacks.

I'll be out of office till next Tuesday, happy for others to approve in my absence.

llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFObjectWriter.cpp
467

In theory this won't catch VK_GOTTPREL which would get used by something like:

adrp    x8, :gottprel:foo
ldr     x8, [x8, :gottprel_lo12:foo]

Empirically it looks like TLS relocations to local symbols via GOT expressions does not use a section symbol

        .text
        .globl  f  
        .p2align        2
        .type   f,@function
f:
        adrp    x8, :gottprel:foo
        ldr     x8, [x8, :gottprel_lo12:foo]
        mrs     x9, TPIDR_EL0
        ldr     w0, [x9, x8]
        ret

        .local foo
        .type   foo,@object
        .section        .tbss,"awT",@nobits
        .p2align        2, 0x0
foo:
        .word   0
        .size   foo, 4

generates:

Relocation section '.rela.text' at offset 0x1c0 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  000000040000021d R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21 0000000000000000 foo + 0
0000000000000004  000000040000021e R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC 0000000000000000 foo + 0

Could be worth a test case to make sure it doesn't get broken in the future.

llvm/lib/Target/VE/MCTargetDesc/VEELFObjectWriter.cpp
148

Assuming AArch64 behaves the same way, an alternative implementation that wouldn't need the additional parameter could use the AArch64 equivalent relocation types.

For example

case ELF::R_AARCH64_LD64_GOT_LO12_NC:
...
return true;

This would be a bit tedious to construct (ILP32 as well) wouldn't be as future proof for further GOT expressions, and we couldn't easily test it in full due to not being able to generate the relocations.

llvm/test/MC/AArch64/arm64-elf-relocs.s
319

Can we add a test case for :gotpage_lo15: as well.

        .text
        ldr x24, [x23, #:gotpage_lo15:sym]

        .bss
        .local sym
sym:
        .space 8

Will (without this change) use the section symbol + offset too:

Relocation section '.rela.text' at offset 0xc0 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000300000139 R_AARCH64_LD64_GOTPAGE_LO15 0000000000000000 .bss + 0

There are a handful more got generating operators but these are not currently supported by LLVM only GCC
https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#64assembler-language-addressing-mode-conventions (GOT operators)

MaskRay marked an inline comment as done.Aug 23 2023, 9:26 AM
MaskRay added inline comments.
llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFObjectWriter.cpp
467

TLS symbols are handled by if (Flags & ELF::SHF_TLS) and this patch is unrelated to it.

I've tried implementing lld in the past for STT_SECTION but later decided that it's not a good idea. I'll need to study GNU behavior more and rewrite that gold centric comment.

llvm/lib/Target/VE/MCTargetDesc/VEELFObjectWriter.cpp
148

Yes. VK_GOT is associated with many relocation types in AArch64ELFObjectWriter.cpp. Those and ILP32 make me think that we do need an extra parameter const MCValue &Val.

MaskRay marked an inline comment as done.Aug 26 2023, 4:56 PM
MaskRay updated this revision to Diff 553770.Aug 26 2023, 5:19 PM
MaskRay marked an inline comment as done.
peter.smith accepted this revision.Aug 29 2023, 2:48 AM

Thanks for adding the additional tests. LGTM.

This revision is now accepted and ready to land.Aug 29 2023, 2:48 AM
This revision was landed with ongoing or failed builds.Aug 29 2023, 11:07 AM
This revision was automatically updated to reflect the committed changes.