Page MenuHomePhabricator

[MC][ELF] Emit separate unique sections for different flags
ClosedPublic

Authored by tmatheson on Apr 21 2021, 4:41 AM.

Details

Summary
Summary of patch

When multiple symbols are given the same section name, place them into unique sections with the same name but distinguished by both section flags and entity size.

Currently when multiple sections with different flags are implicitly created in IR (for example, a read-only and writable values in the same section) they are emitted in the same section with the flags of the first one seen. Instead, add flags to the ELFUniqueIDMap so that unique sections are emitted for each set of flags.

Ilustrative example

We have an issue with LTO where symbols from separate files are placed in the same section but have incompatible flags:

// file1.c
const volatile int var1 __attribute__((section("foo"))) = 1;

// file2.c
volatile int var2 __attribute__((section("foo"))) = 2;

// file3.c
extern volatile const int var1;
extern volatile int var2;

int main() {
  var2 = 1;
  return var1 + var2;
}

Compiling with clang++ -flto file1.c file2.c file3.c -S --target=arm-unknown-unknown gives the following IR (minimised):

; file1.ll
@var1 = dso_local constant i32 1, section "foo", align 4

; file2.ll
@var2 = dso_local global i32 2, section "foo", align 4

; file3.ll
@var2 = external dso_local global i32, align 4
@var1 = external dso_local constant i32, align 4
; Function Attrs: noinline norecurse nounwind optnone
define dso_local arm_aapcscc i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  store volatile i32 1, i32* @var2, align 4
  %2 = load volatile i32, i32* @var1, align 4
  %3 = load volatile i32, i32* @var2, align 4
  %4 = add nsw i32 %2, %3
  ret i32 %4
}

Both var1 and var2 are placed in the the same RO section (clang 12), resulting in a write to RO data:

...
        .type   var1,@object                    # @var1
        .section        foo,"a",@progbits
        .globl  var1
        .p2align        2
var1:
        .long   1                               # 0x1
        .size   var1, 4

        .type   var2,@object                    # @var2
        .globl  var2
        .p2align        2
var2:
        .long   2                               # 0x2
        .size   var2, 4
...
Previous work

This is a based on the work in D72194 which ensures that symbols with different entity sizes are not merged to the same section, by using the unique assembly feature.

We have some mitigation in the frontend for when symbols with incompatible flags are placed in the same section (D93102). However for LTO the conflicting flags come from different translation units and end up in the same LLVM module.

This patch comes out of previous discussion in D93948, where the available options were nicely summarised by @MaskRay. One of the takeaways from that discussion was that the rules for combining flags from different symbols into one set of section flags are unclear/complicated/target dependent, and that outputting separate sections would be a better approach.

Description of the patch

The logic for determining the unique ID had become quite difficult to reason about, so I have factored this out into a separate function. I could put this NFC refactoring up as a separate review if people wanted to see it in isolation.

  • I've kept the terminology of D72194: a "default" section is one created by default by MC, and a "generic" section is one that will be emitted without a unique ID
  • Remove the current behaviour of outputting all sections with a given name with the first seen set of flags
  • Symbols are now placed into separate sections based on the flags, entity size, and section name. This can result in multiple uniqued sections with the same name, but different flags and/or entity size.
  • To try to minimise the number of unique sections, the first section seen with a given name becomes the generic section for that name. It doesn't matter whether it is mergeable or not. Therefore the default sections become generic by virtue of being created first. If all symbols in a given section have the same flags and entity size, only one section will be emitted and no unique is required.
  • The property of not placing mergeable symbols in non-mergeable sections is preserved because they have different flags.

Diff Detail

Event Timeline

tmatheson created this revision.Apr 21 2021, 4:41 AM
tmatheson requested review of this revision.Apr 21 2021, 4:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2021, 4:41 AM

Hi. Can you add some background information to the description. I assume the discussion on https://reviews.llvm.org/D93948 is relevant. Is there an associated bug?

One of the goals when I put in a fix for https://bugs.llvm.org/show_bug.cgi?id=43457 was to try to minimise the circumstances under which MC would have to use a "unique" section. So, I think with this change some of the testcases in explicit-section-mergeable.ll will no longer be relevant e.g. the "Non-allocatable "default" sections can have allocatable mergeable symbols with compatible entry sizes assigned to them" case - and there are probably more. If this review progresses I will try to examine all of the testcases and see what can be simplified.

Hi @bd1976llvm, I was planning to update the description, add reviewers and close the other review once the CI was passing. I'll also update the patch taking your comments into consideration.

You are correct that the discussion on D93948 is what led to this approach. I'm not aware of an associated bug for this.

tmatheson updated this revision to Diff 339958.Apr 23 2021, 2:41 AM
  • [NFCI] Factor out ELF section unique ID calculation into function
  • [CodeGen][ELF] Create unique sections for different flags

I'll update the description next.

tmatheson edited the summary of this revision. (Show Details)Apr 23 2021, 4:42 AM
tmatheson added reviewers: MaskRay, bd1976llvm, psmith.
tmatheson added a subscriber: MaskRay.
tmatheson updated this revision to Diff 339984.Apr 23 2021, 4:45 AM
tmatheson edited the summary of this revision. (Show Details)

clang-format

tmatheson edited the summary of this revision. (Show Details)Apr 26 2021, 11:29 AM
tmatheson updated this revision to Diff 345444.May 14 2021, 8:11 AM

Rewrite the patch after the changes in D101311 went in.

Split refactored UniqueID calulation into a separate review D102336, and
removed all variable/function renaming. This should make the review
significantly easier to.

Also added more tests to elf-unique-sections-by-flags.ll, and made the
existing tests in explicit-section-mergeable.ll more strict.

Changed Orc/RTDyldObjectLinkingLayerTest.cpp to use a non-allocatable
section (.note.GNU-stack) for its check, rather than trying to place an
object in the .debug_str section, which now results in a new .debug_str
section with SHF_ALLOC.

tmatheson retitled this revision from [CodeGen][ELF] Emit separate unique sections for different flags to [MC][ELF] Emit separate unique sections for different flags.

Ping

tmatheson added a subscriber: lhames.

From the history it seems like @lhames might be best suited to comment on the test change

Thanks very much for working on this. I agree with the approach of putting out a separate section when the flags don't match as this matches the non-lto behaviour when the files are compiled separately.

I've left a few thoughts on the RTDyldObjectLinkingLayerTest.cpp test as I think that there may be some assertions that need updating. I think it is reasonable to use a different non-alloc section assuming the purpose of the test is to check that ProcessAllSections controls whether non alloc sections are considered.

@MaskRay do you have any thoughts?
@bd1976llvm did the changes to explicit-section-mergeable.ll match your expectations?

I checked what GCC (10.1.1) does for the LTO case. Looking at the output of --save-temps the GCC LTO code-generator/assembler has merged the section flags prior to passing to the linker. So the linker sees a RW foo with both var1 and var2. This is definitely better than a RO foo which is incorrect.

llvm/test/CodeGen/Generic/elf-unique-sections-by-flags.ll
85

A small suggestion to emphasise that .rodata has been given an unconventional flag.
; Explicitly placed in a .rodata with writeable set. The writable section should be uniqued, not the default ro section, even if it comes first.

105

Suggest
; Multiple .text sections are emitted for read-only and read-write sections using .text name.

llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33–37

Worth a comment as to why .note.GNU-stack was chosen? Presumably an arbitrary non-alloc default section.

80

Now that GV->setSection(".debug_str") isn't being called where is it going? Is it a necessary part of the test?

98

I think this should be non alloc section instead of Debug. Possibly use .note.GNU-stack instead of non alloc.

100

Same here, debug -> non alloc or even .note.GNU-stack

This looks good to me as well. The CHECK-LABEL: conversion is unnecessary. I don't think it will improve diagnostics when things go off.
Logically .section .data,"aw",@progbits,unique,1 delimiter the regions. FileCheck then find labels in the regions.

If you use -LABEL:, FileCheck will find labels first. I think the resulting diagnostic is worse than using CHECK:.

...
Changed Orc/RTDyldObjectLinkingLayerTest.cpp to use a non-allocatable
section (.note.GNU-stack) for its check, rather than trying to place an
object in the .debug_str section, which now results in a new .debug_str
section with SHF_ALLOC.

I agree with @peter.smith's suggested changes to the debug logging strings ("debug" -> "non-alloc"). Otherwise the test case change looks good to me.

lhames added inline comments.May 20 2021, 6:08 PM
llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33–37

Yes -- that would be a helpful comment. I think your presumption is correct though.

80

I guess that setSection(".debug_str") isn't meaningful any more. I was using it in this test because MC always treated ".debug_str" as non-alloc, but it looks like this patch changes that.

You might be able to remove StrConstant and GV entirely, but I don't know how MC treats empty modules -- do they still reliably generate .note.GNU-stack sections?

If removing StrConstant and GV works then that's great. If removing them causes problems for the test then they can be left as is (or you can substitute some other definition) with a comment that they're just there so that the module is non-empty.

98

Agreed.

100

Agreed.

Does this work with -function-sections as well? If yes, would you please augment the test?

tmatheson updated this revision to Diff 346968.Fri, May 21, 2:53 AM
tmatheson marked an inline comment as done.

Thanks everyone for the comments.

  • RTDyldObjectLinkingLayerTest: fixed assert strings and added clarifying comments.
  • changed CHECK-LABEL -> CHECK in both tests
  • Confirmed this works with -function-sections and added a RUN line and an extra function to the test. .text is still emitted, and the relevant functions are placed in .section .text.XXXX without unique.
  • Clarified some comments as suggested
tmatheson marked 9 inline comments as done.Fri, May 21, 2:57 AM
tmatheson added inline comments.
llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp
33–37

I've added a comment. .not.GNU-stack is the only non-alloc section in the module, at least at the point where ProcessAllSections is applied.

80

I've added a comment. I had tried to remove them but the test failed much earlier with an empty module.

peter.smith accepted this revision.Mon, May 24, 2:21 AM

Thanks very much for the update. I'm happy with the changes. Please wait a couple of days to give the other reviewers a chance for further comments.

This revision is now accepted and ready to land.Mon, May 24, 2:21 AM
This revision was automatically updated to reflect the committed changes.
tmatheson marked 2 inline comments as done.

Thanks very much for working on this. I agree with the approach of putting out a separate section when the flags don't match as this matches the non-lto behaviour when the files are compiled separately.

@bd1976llvm did the changes to explicit-section-mergeable.ll match your expectations?

Sorry for the late reply. I can confirm that the changes to explicit-section-mergeable.ll LGTM - I also confirmed that it passes my downstream testcases that motivated the original https://reviews.llvm.org/D72194. I think this approach should be fine although there *might* be some fallout e.g. from embedded systems that have very simple linkers/loaders (I have no evidence of this though it's just conjecture). As stated on https://reviews.llvm.org/D72194 I think that sometimes the only thing you can do is make a reasonable change and see what breaks - and this looks reasonable to me :)