This is an archive of the discontinued LLVM Phabricator instance.

[MC][ELF] Emit separate unique sections for different flags
ClosedPublic

Authored by tmatheson on Apr 21 2021, 4:41 AM.

Details

Summary
Summary of patch

When multiple symbols are given the same section name, place them into unique sections with the same name but distinguished by both section flags and entity size.

Currently when multiple sections with different flags are implicitly created in IR (for example, a read-only and writable values in the same section) they are emitted in the same section with the flags of the first one seen. Instead, add flags to the ELFUniqueIDMap so that unique sections are emitted for each set of flags.

Ilustrative example

We have an issue with LTO where symbols from separate files are placed in the same section but have incompatible flags:

// file1.c
const volatile int var1 __attribute__((section("foo"))) = 1;

// file2.c
volatile int var2 __attribute__((section("foo"))) = 2;

// file3.c
extern volatile const int var1;
extern volatile int var2;

int main() {
  var2 = 1;
  return var1 + var2;
}

Compiling with clang++ -flto file1.c file2.c file3.c -S --target=arm-unknown-unknown gives the following IR (minimised):

; file1.ll
@var1 = dso_local constant i32 1, section "foo", align 4

; file2.ll
@var2 = dso_local global i32 2, section "foo", align 4

; file3.ll
@var2 = external dso_local global i32, align 4
@var1 = external dso_local constant i32, align 4
; Function Attrs: noinline norecurse nounwind optnone
define dso_local arm_aapcscc i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  store volatile i32 1, i32* @var2, align 4
  %2 = load volatile i32, i32* @var1, align 4
  %3 = load volatile i32, i32* @var2, align 4
  %4 = add nsw i32 %2, %3
  ret i32 %4
}

Both var1 and var2 are placed in the the same RO section (clang 12), resulting in a write to RO data:

...
        .type   var1,@object                    # @var1
        .section        foo,"a",@progbits
        .globl  var1
        .p2align        2
var1:
        .long   1                               # 0x1
        .size   var1, 4

        .type   var2,@object                    # @var2
        .globl  var2
        .p2align        2
var2:
        .long   2                               # 0x2
        .size   var2, 4
...
Previous work

This is a based on the work in D72194 which ensures that symbols with different entity sizes are not merged to the same section, by using the unique assembly feature.

We have some mitigation in the frontend for when symbols with incompatible flags are placed in the same section (D93102). However for LTO the conflicting flags come from different translation units and end up in the same LLVM module.

This patch comes out of previous discussion in D93948, where the available options were nicely summarised by @MaskRay. One of the takeaways from that discussion was that the rules for combining flags from different symbols into one set of section flags are unclear/complicated/target dependent, and that outputting separate sections would be a better approach.

Description of the patch

The logic for determining the unique ID had become quite difficult to reason about, so I have factored this out into a separate function. I could put this NFC refactoring up as a separate review if people wanted to see it in isolation.

  • I've kept the terminology of D72194: a "default" section is one created by default by MC, and a "generic" section is one that will be emitted without a unique ID
  • Remove the current behaviour of outputting all sections with a given name with the first seen set of flags
  • Symbols are now placed into separate sections based on the flags, entity size, and section name. This can result in multiple uniqued sections with the same name, but different flags and/or entity size.
  • To try to minimise the number of unique sections, the first section seen with a given name becomes the generic section for that name. It doesn't matter whether it is mergeable or not. Therefore the default sections become generic by virtue of being created first. If all symbols in a given section have the same flags and entity size, only one section will be emitted and no unique is required.
  • The property of not placing mergeable symbols in non-mergeable sections is preserved because they have different flags.

Diff Detail

Event Timeline

tmatheson created this revision.Apr 21 2021, 4:41 AM
tmatheson requested review of this revision.Apr 21 2021, 4:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2021, 4:41 AM

Hi. Can you add some background information to the description. I assume the discussion on https://reviews.llvm.org/D93948 is relevant. Is there an associated bug?

One of the goals when I put in a fix for https://bugs.llvm.org/show_bug.cgi?id=43457 was to try to minimise the circumstances under which MC would have to use a "unique" section. So, I think with this change some of the testcases in explicit-section-mergeable.ll will no longer be relevant e.g. the "Non-allocatable "default" sections can have allocatable mergeable symbols with compatible entry sizes assigned to them" case - and there are probably more. If this review progresses I will try to examine all of the testcases and see what can be simplified.

Hi @bd1976llvm, I was planning to update the description, add reviewers and close the other review once the CI was passing. I'll also update the patch taking your comments into consideration.

You are correct that the discussion on D93948 is what led to this approach. I'm not aware of an associated bug for this.

tmatheson updated this revision to Diff 339958.Apr 23 2021, 2:41 AM
  • [NFCI] Factor out ELF section unique ID calculation into function
  • [CodeGen][ELF] Create unique sections for different flags

I'll update the description next.

tmatheson edited the summary of this revision. (Show Details)Apr 23 2021, 4:42 AM
tmatheson added reviewers: MaskRay, bd1976llvm, psmith.
tmatheson added a subscriber: MaskRay.
tmatheson updated this revision to Diff 339984.Apr 23 2021, 4:45 AM
tmatheson edited the summary of this revision. (Show Details)

clang-format

tmatheson edited the summary of this revision. (Show Details)Apr 26 2021, 11:29 AM
tmatheson updated this revision to Diff 345444.May 14 2021, 8:11 AM

Rewrite the patch after the changes in D101311 went in.

Split refactored UniqueID calulation into a separate review D102336, and
removed all variable/function renaming. This should make the review
significantly easier to.

Also added more tests to elf-unique-sections-by-flags.ll, and made the
existing tests in explicit-section-mergeable.ll more strict.

Changed Orc/RTDyldObjectLinkingLayerTest.cpp to use a non-allocatable
section (.note.GNU-stack) for its check, rather than trying to place an
object in the .debug_str section, which now results in a new .debug_str
section with SHF_ALLOC.

tmatheson retitled this revision from [CodeGen][ELF] Emit separate unique sections for different flags to [MC][ELF] Emit separate unique sections for different flags.

Ping

tmatheson added a subscriber: lhames.

From the history it seems like @lhames might be best suited to comment on the test change

Thanks very much for working on this. I agree with the approach of putting out a separate section when the flags don't match as this matches the non-lto behaviour when the files are compiled separately.

I've left a few thoughts on the RTDyldObjectLinkingLayerTest.cpp test as I think that there may be some assertions that need updating. I think it is reasonable to use a different non-alloc section assuming the purpose of the test is to check that ProcessAllSections controls whether non alloc sections are considered.

@MaskRay do you have any thoughts?
@bd1976llvm did the changes to explicit-section-mergeable.ll match your expectations?

I checked what GCC (10.1.1) does for the LTO case. Looking at the output of --save-temps the GCC LTO code-generator/assembler has merged the section flags prior to passing to the linker. So the linker sees a RW foo with both var1 and var2. This is definitely better than a RO foo which is incorrect.

llvm/test/CodeGen/Generic/elf-unique-sections-by-flags.ll
85

A small suggestion to emphasise that .rodata has been given an unconventional flag.
; Explicitly placed in a .rodata with writeable set. The writable section should be uniqued, not the default ro section, even if it comes first.

105

Suggest
; Multiple .text sections are emitted for read-only and read-write sections using .text name.

llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33 ↗(On Diff #345447)

Worth a comment as to why .note.GNU-stack was chosen? Presumably an arbitrary non-alloc default section.

74 ↗(On Diff #345447)

Now that GV->setSection(".debug_str") isn't being called where is it going? Is it a necessary part of the test?

92 ↗(On Diff #345447)

I think this should be non alloc section instead of Debug. Possibly use .note.GNU-stack instead of non alloc.

94 ↗(On Diff #345447)

Same here, debug -> non alloc or even .note.GNU-stack

This looks good to me as well. The CHECK-LABEL: conversion is unnecessary. I don't think it will improve diagnostics when things go off.
Logically .section .data,"aw",@progbits,unique,1 delimiter the regions. FileCheck then find labels in the regions.

If you use -LABEL:, FileCheck will find labels first. I think the resulting diagnostic is worse than using CHECK:.

...
Changed Orc/RTDyldObjectLinkingLayerTest.cpp to use a non-allocatable
section (.note.GNU-stack) for its check, rather than trying to place an
object in the .debug_str section, which now results in a new .debug_str
section with SHF_ALLOC.

I agree with @peter.smith's suggested changes to the debug logging strings ("debug" -> "non-alloc"). Otherwise the test case change looks good to me.

lhames added inline comments.May 20 2021, 6:08 PM
llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33 ↗(On Diff #345447)

Yes -- that would be a helpful comment. I think your presumption is correct though.

74 ↗(On Diff #345447)

I guess that setSection(".debug_str") isn't meaningful any more. I was using it in this test because MC always treated ".debug_str" as non-alloc, but it looks like this patch changes that.

You might be able to remove StrConstant and GV entirely, but I don't know how MC treats empty modules -- do they still reliably generate .note.GNU-stack sections?

If removing StrConstant and GV works then that's great. If removing them causes problems for the test then they can be left as is (or you can substitute some other definition) with a comment that they're just there so that the module is non-empty.

92 ↗(On Diff #345447)

Agreed.

94 ↗(On Diff #345447)

Agreed.

Does this work with -function-sections as well? If yes, would you please augment the test?

tmatheson updated this revision to Diff 346968.May 21 2021, 2:53 AM
tmatheson marked an inline comment as done.

Thanks everyone for the comments.

  • RTDyldObjectLinkingLayerTest: fixed assert strings and added clarifying comments.
  • changed CHECK-LABEL -> CHECK in both tests
  • Confirmed this works with -function-sections and added a RUN line and an extra function to the test. .text is still emitted, and the relevant functions are placed in .section .text.XXXX without unique.
  • Clarified some comments as suggested
tmatheson marked 9 inline comments as done.May 21 2021, 2:57 AM
tmatheson added inline comments.
llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33 ↗(On Diff #345447)

I've added a comment. .not.GNU-stack is the only non-alloc section in the module, at least at the point where ProcessAllSections is applied.

74 ↗(On Diff #345447)

I've added a comment. I had tried to remove them but the test failed much earlier with an empty module.

peter.smith accepted this revision.May 24 2021, 2:21 AM

Thanks very much for the update. I'm happy with the changes. Please wait a couple of days to give the other reviewers a chance for further comments.

This revision is now accepted and ready to land.May 24 2021, 2:21 AM
This revision was automatically updated to reflect the committed changes.
tmatheson marked 2 inline comments as done.

Thanks very much for working on this. I agree with the approach of putting out a separate section when the flags don't match as this matches the non-lto behaviour when the files are compiled separately.

@bd1976llvm did the changes to explicit-section-mergeable.ll match your expectations?

Sorry for the late reply. I can confirm that the changes to explicit-section-mergeable.ll LGTM - I also confirmed that it passes my downstream testcases that motivated the original https://reviews.llvm.org/D72194. I think this approach should be fine although there *might* be some fallout e.g. from embedded systems that have very simple linkers/loaders (I have no evidence of this though it's just conjecture). As stated on https://reviews.llvm.org/D72194 I think that sometimes the only thing you can do is make a reasonable change and see what breaks - and this looks reasonable to me :)

We have found a problem whist testing this change. We have some internal code that was exploiting the old behaviour:

  1. Creates a non-ALLOC section: "inline asm: _asm_(".section debug_special,\"\",@progbits");"
  2. Assigns symbols to that section: _attribute((used, section("debug_special")))
  3. Uses a linkscript so that the symbols in the section are offsets from the start of the section and the section is not assigned to a phdr: debug_special 0 : { KEEP(*(debug_special)) } : NONE

With this change that no longer works. Instead, the linker creates an ALLOC debug_special section and then that section overlaps with another at address 0 and LLD errors: "lld: error: section debug_special virtual address range overlaps with ...".

I still think that this change is reasonable; but, I'm not sure how to support this code that was exploiting the old behaviour. It might be possible to find a work-around; or, perhaps, to make a fix in another tool. Maybe we could add a linkerscript feature to force the input sections to an output section to be considered non-ALLOC e.g. NOADDR (analogous to the existing NOLOAD).

Not sure I fully understand the issue, but it sounds like the final combined debug_special section the linker is trying to create is allocatable at address 0, whereas before it was non-allocatable at address zero. Therefore now it conflicts with other allocatable sections at the same address.

The solution looks like one of the following to me:

  • Use a linkerscript that enforces non-alloc for debug_special. I don't know if either ld or lld support setting flags explicitly.
  • Allow the user to specify flags through the frontend, or at least in IR.

Users can specify section flags in the attribute with GCC by exploiting string quoting, e.g. __attribute__((section("debug_special,\"\",@progbits#"))) in which the section name is actually given the desired assembly followed by a comment character #, effectively commenting out whatever assembly GCC was going to output for the section and using the user supplied string instead.

A better solution for llvm would be to expand the __attribute__((section(name))) to accept flags, and to expand the LLVM IR section to allow specifying flags, section type, etc. If the attribute can't be changed then being able to express this in IR would still enable clang to make more sensible choices.

Thanks for replying with some potential solutions. I'm sorry for not raising this earlier we have been slow to keep up with upstream changes recently so this change didn't get merged and tested until last week.

Not sure I fully understand the issue, but it sounds like the final combined debug_special section the linker is trying to create is allocatable at address 0, whereas before it was non-allocatable at address zero. Therefore now it conflicts with other allocatable sections at the same address.

Exactly.

The solution looks like one of the following to me:

  • Use a linkerscript that enforces non-alloc for debug_special. I don't know if either ld or lld support setting flags explicitly.
  • Allow the user to specify flags through the frontend, or at least in IR.

Users can specify section flags in the attribute with GCC by exploiting string quoting, e.g. __attribute__((section("debug_special,\"\",@progbits#"))) in which the section name is actually given the desired assembly followed by a comment character #, effectively commenting out whatever assembly GCC was going to output for the section and using the user supplied string instead.

Thanks - I didn't know about this. What a hack!!! Sadly (or maybe not sadly) it doesn't seem to work with Clang: https://godbolt.org/z/YhjGPn4qE.

A better solution for llvm would be to expand the __attribute__((section(name))) to accept flags, and to expand the LLVM IR section to allow specifying flags, section type, etc. If the attribute can't be changed then being able to express this in IR would still enable clang to make more sensible choices.

Right - much better than string trickery :)

Another potential solution I thought of is to bring back the behaviour for "default" sections (see the behaviour change in llvm/test/CodeGen/X86/explicit-section-mergeable.ll). "default" sections, created by the assembler upfront, tend to correspond to sections where the name implies the flags e.g. .debug_info/line etc... so in these cases it is reasonable to assign symbols with different flags to "default" sections rather than creating a new unique section. In our downstream toolchain I could then make debug_special a "default" non-ALLOC section.

Actually, here's the full list of possible fixes I have considered:

  • linkerscript feature to force the input sections to an output section to be considered non-ALLOC e.g. NOADDR (analogous to the existing NOLOAD).
  • Linker doesn't check for overlap if a section is assigned to the NONE phdr.
  • Modify https://reviews.llvm.org/D100944 not to use unique sections for assignments to "default" sections and then make debug_special section a "default" section for our toolchain.
  • Augment compiler section attribute to allow specifying the section flags as well as the section name.
  • Assembler feature to override the section flags for specified section.
  • Add linker feature to disable address overlap checks for specific sections (currently there is -no-check-sections but it disables checks on all sections).

@MaskRay what do you think about a solution in the linker?

Augment compiler section attribute to allow specifying the section flags as well as the section name.

Recent GNU as reports errors for changed section attributes, so gcc with such a GNU as will report an error

% cat a.c
asm(".section debug_special,\"\",@progbits");
__attribute__((used, section("debug_special"))) int var;

% gcc -c a.c
/tmp/ccrwXjUO.s: Assembler messages:
/tmp/ccrwXjUO.s:7: Error: changed section attributes for debug_special

So I think a compiler oriented solution is more appropriate than a linker oriented one.
A GNU attribute beside section looks good to me. Such a request needed to be raised on the GCC Bugzilla.

linkerscript feature to force the input sections to an output section to be considered non-ALLOC e.g. NOADDR (analogous to the existing NOLOAD).

If NOADDR (customizing the flags of an output section) is desired, it needs to be raised on binutils feature request.
It is a bit possible that such a thing already exists and more importantly, it is likely they have opinions on what this should be called.

Assembler feature to override the section flags for specified section.

I think this has a low probability that this may be accepted on GNU side.
Assemblers are in streamer styles, and a later directive overriding previous directives contradicts with GNU as's design.

When SHF_GNU_RETAIN was introduced, the author considered .retain as a directive to modify a section attribute and it was rejected.

Linker doesn't check for overlap if a section is assigned to the NONE phdr.

s/NONE/non-SHF_ALLOC/ ?

A foo without SHF_ALLOC and a foo with SHF_ALLOC are combined into an output foo with SHF_ALLOC.
So I don't see how such a special case can help.

Add linker feature to disable address overlap checks for specific sections (currently there is -no-check-sections but it disables checks on all sections).

Looks like you have a workaround --no-check-sections for now.
As a workaround (and considering the potential usage), I think adding such a capatibility to --no-check-sections is likely overengineering.
The similar check can be performed by a post-link tool which checks section addresses.
Such a tool is likely already in place if you need a verification tool (or watermark or the like).

Augment compiler section attribute to allow specifying the section flags as well as the section name.

Recent GNU as reports errors for changed section attributes, so gcc with such a GNU as will report an error

% cat a.c
asm(".section debug_special,\"\",@progbits");
__attribute__((used, section("debug_special"))) int var;

% gcc -c a.c
/tmp/ccrwXjUO.s: Assembler messages:
/tmp/ccrwXjUO.s:7: Error: changed section attributes for debug_special

So I think a compiler oriented solution is more appropriate than a linker oriented one.
A GNU attribute beside section looks good to me. Such a request needed to be raised on the GCC Bugzilla.

Thanks for the feedback, very much appreciated. I agree that a compiler based solution is more appropriate.

We appear to have the only codebase with this issue. Therefore, we are currently looking hard to see if there is any way to rewrite or remove the feature.

If this is not possible we will persue a new toolchain feature to support our use case.

Augment compiler section attribute to allow specifying the section flags as well as the section name.

Recent GNU as reports errors for changed section attributes, so gcc with such a GNU as will report an error

% cat a.c
asm(".section debug_special,\"\",@progbits");
__attribute__((used, section("debug_special"))) int var;

% gcc -c a.c
/tmp/ccrwXjUO.s: Assembler messages:
/tmp/ccrwXjUO.s:7: Error: changed section attributes for debug_special

So I think a compiler oriented solution is more appropriate than a linker oriented one.
A GNU attribute beside section looks good to me. Such a request needed to be raised on the GCC Bugzilla.

Thanks for the feedback, very much appreciated. I agree that a compiler based solution is more appropriate.

We appear to have the only codebase with this issue. Therefore, we are currently looking hard to see if there is any way to rewrite or remove the feature.

If this is not possible we will persue a new toolchain feature to support our use case.

If there is widespread usage, you may consider llvm-objcopy --set-section-flags=debug_special=readonly