This is an archive of the discontinued LLVM Phabricator instance.

[MC][ELF] Prevent globals with an explicit section from being mergeable
AbandonedPublic

Authored by bd1976llvm on Sep 26 2019, 1:07 PM.

Diff Detail

Event Timeline

bd1976llvm created this revision.Sep 26 2019, 1:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 26 2019, 1:07 PM

This feels like it could cause a pretty serious regression. This essentially disables global merging with -fdata-sections, which I know at least one linker relies upon for code size.

It'd be good to get comments from the ARM folk, I've also added Sjoerd who I know has background in this.

Hello James! Thanks for informing us. If this does what you say, it essentially disables global merging with -fdata-sections, then indeed we would get a little bit unhappy about that.
I could run some numbers with this patch, but I can guess what the outcome will be..... I haven't looked at this patch or the PR, but does this need to be default behaviour, or can this e.g. be done under an option?

This feels like it could cause a pretty serious regression. This essentially disables global merging with -fdata-sections, which I know at least one linker relies upon for code size.

My understanding is that -fdata-sections is just an option that affects the section selection for a global. My intent was that this change would only affect globals that have been assigned to a specific section via an attribute or the section property... to put it another way there appears to ab no effect on the IR from adding -fdata-sections to clang so I don't understand how this change could effect this... no?

Hello James! Thanks for informing us. If this does what you say, it essentially disables global merging with -fdata-sections, then indeed we would get a little bit unhappy about that.
I could run some numbers with this patch, but I can guess what the outcome will be..... I haven't looked at this patch or the PR, but does this need to be default behaviour, or can this e.g. be done under an option?

I can see that this could cause performance problems if you were relying on the merging. I feel strongly that the behaviour in this patch (which matches GCC) should be the default, however I don't really have an objection to adding an option to re-enable merging. Please have a look at https://reviews.llvm.org/D68147 where I have attempted to do that.

I am getting up to speed with this... and created this reproducer:

static int A[4] __attribute__ ((section ("HELLO"))) = {1,0,0,1} ;
static long long B[9]  __attribute__ ((section ("HELLO"))) = {0,0,0,0,0,0,0,0,1};
int foo (int i) {
  return A[i] + B[i];
}

The corresponding ELF object contains this section:

Name        : HELLO
Type        : SHT_PROGBITS (0x00000001)
Flags       : SHF_ALLOC + SHF_MERGE (0x00000012)
Addr        : 0x00000000
File Offset : 96 (0x60)
Size        : 88 bytes (0x58)
Link        : SHN_UNDEF
Info        : 0
Alignment   : 8
Entry Size  : 16

You're unhappy about the Entry Size : 16, as this sections contains entries of words and double-words. The ELF spec says this about this entry:

The size of each element is specified in the section header's sh_entsize field.

So yes, I guess that means the entry size needs to be uniform.

Now my question about this patch: why are we trying to patch things up when we can avoid putting incompatible symbols in the same section in the first place? In this case, the merging is achieved with a section attribute, but from the PR I understood the same can be achieved with a pragma, so from that point of view we perhaps got what we deserved? Would be nice though to warn about this, and/or the pass that does the merging should not do this?

Hi Sjoerd,

Nice reproducer.

In this case, the merging is achieved with a section attribute, but from the PR I understood the same can be achieved with a pragma,

The frontend attribute/pragmas that I am aware of are:

#pragma clang section ... - https://clang.llvm.org/docs/LanguageExtensions.html#specifying-section-names-for-global-objects-pragma-clang-section
and
_attribute_ ((section ("section-name"))) - https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate

Now my question about this patch: why are we trying to patch things up when we can avoid putting incompatible symbols in the same section in the first place? In this case, the merging is achieved with a section attribute, but from the PR I understood the same can be achieved with a pragma, so from that point of view we perhaps got what we deserved? Would be nice though to warn about this, and/or the pass that does the merging should not do this?

This and https://reviews.llvm.org/D68147 are attempts at minimal patches to address just the mergeable globals issue. The larger issue is how should incompatible symbols and attribute ((section ("<section name>")) interact? I have outlined some possible fixes on the bugzilla; but, the problem is that I am not sure of the intent of the attribute. It seems reasonable that in the case of your example the user expected a single HELLO section in the output containing both symbols. With this interpretation our choices are to try to create such a section or fail if we can't. However, perhaps the user would be happy with two sections called HELLO in the output.. in which case it we could create multiple output sections with the name HELLO each containing a set of compatible symbols. So basically to proceed we need to decide on the correct semantics for this attribute. We also need to evaluate real world use cases for this attributes.. e.g. I suspect this is often combined with inline asm... in these cases users might be relying on implementation specific behaviour and fixing this bug might break their code. Also the design of MC makes some fixes difficult to implement, e.g. it is single pass so I can't fix up the section attributes when I encounter an incompatible global. I think a full fix for this issue is difficult which is why I have only put up a fix for the mergable part of the bugzilla.

As to why I am attempting to patch things up rather than do something early... in this case there is no optimizer pass that does this merging optimization. It is an MC optimization. AFAIK the fix needs to be in MC. I think that this patch or https://reviews.llvm.org/D68147 are reasonable attempts at a fix since they get us closer to the GCC behaviour which seems to be:

Create a single section if possible: https://godbolt.org/z/yKVNUd
Error if it is not possible: https://godbolt.org/z/rn7o7a

p.s. GCC's error messages are better than what are produced by https://reviews.llvm.org/D68147; but, I don't seem to have access to enough information in MC to produce a better message.

Hi Ben, this is not really my area of expertise, but it all starts to make some sense to me. I was expecting this transformation to happen earlier, but this is where the magic happens, and this probably where it belongs.
Just to make sure I haven't missed anything, I would like to take this patch and run some numbers, for which I need a little bit of time. If in the mean time someone with some more experience in this area here has a look too, that would be great....

bd1976llvm added a subscriber: mcgrathr.EditedOct 1 2019, 10:15 AM

Hi Ben, this is not really my area of expertise, but it all starts to make some sense to me. I was expecting this transformation to happen earlier, but this is where the magic happens, and this probably where it belongs.
Just to make sure I haven't missed anything, I would like to take this patch and run some numbers, for which I need a little bit of time. If in the mean time someone with some more experience in this area here has a look too, that would be great....

Thanks. I am not sure who can review this. I added people form the "#pragma clang section" reviews. I will add @rjmccall as this is a frontend interacting with linking issue.

bd1976llvm added a subscriber: cfe-commits.

Just to make sure I haven't missed anything, I would like to take this patch and run some numbers, for which I need a little bit of time.

Hi @SjoerdMeijer. Did you have a chance to run the numbers? I am keen to get a fix in for at least the SHF_MERGE issue as we have customers who have hit this case. If you haven't had time I will work on a lower risk patch (lower risk of performance impact). Thanks.

Let me try to restate what's happening here so that I can see if I understand it.

There are two concepts of "section" in play here:

  • a unit in an object file with a particular section name, which need not be unique within the object file, and which can have interesting per-unit attributes like mergeability; and
  • the high-level layout of the final linked image, where all of the units with the same section name (within a single image) are supposed to be contiguous, and where the image will contain metadata describing the location and size of the section.

I'll call the first a "section unit" and the second an "image section" just for clarity; my apologies if there's more standard jargon.

Marking a section unit as mergeable in ELF opts in to a link-time optimization where the linker can avoid unnecessary duplication in the image section by combining identical data that would have ended up in it. Essentially, the section unit is broken up into entries according to an entry size that's part of the mergeable attribute, and if the linker sees that two entries will be identical (whether they come from different section units or not), it can simply remove the redundant entry from the final image section. (Presumably there's some rule about the order of entries, but it doesn't really matter for this analysis.) This is done as a relatively late pass; any sort of mandatory "merging" from e.g. COMDAT, weak, and common symbols will already have been applied, so we don't need to worry about this interfering with language-mandated symbol coalescing.

When LLVM is emitting ELF, it will try to place an object in a mergeable section unit if the object is unnamed_addr. It will also generally emit objects into the same section unit if they share the same section name. I assume this takes attributes into account to at least some degree, or else we might be lumping non-unnamed_addr into a mergeable section just because the first object we processed with that section name was unnamed_addr. But it must not take entry size into account, because PR 43457 shows us clearly emitting a single section unit for objects of different sizes.

Given all that, this patch seems far too aggressive. While mergeable sections can be useful for optimizing arbitrary code that might not use a section, they are also extremely useful for optimizing the sorts of global tables that programmers frequently use explicit sections for. It seems to me that the right fix is to find the place that ensures that we don't put mergeable and non-mergeable objects in the same section unit (or at least conservatively makes the section unit non-mergeable) and fix it to consider entry size as well. That should be straightforward unless that place doesn't exist, in which case we have very serious problems.

Given all that, this patch seems far too aggressive. While mergeable sections can be useful for optimizing arbitrary code that might not use a section, they are also extremely useful for optimizing the sorts of global tables that programmers frequently use explicit sections for. It seems to me that the right fix is to find the place that ensures that we don't put mergeable and non-mergeable objects in the same section unit (or at least conservatively makes the section unit non-mergeable) and fix it to consider entry size as well. That should be straightforward unless that place doesn't exist, in which case we have very serious problems.

Disagree (but I spent all day thinking about this). Going back to https://bugs.llvm.org/show_bug.cgi?id=31828, I think I solved the problem there incorrectly; we should be not marking the explicit section merge-able.

Consider: https://godbolt.org/z/6sF_kc (GCC will put a and c in a non-mergeable section).

https://www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html mentions:

Each element in the section is compared against other elements in sections with the same name, type and flags. Elements that would have identical values at program run-time may be merged. Relocations referencing elements of such sections must be resolved to the merged locations of the referenced values. Note that any relocatable values, including values that would result in run-time relocations, must be analyzed to determine whether the run-time values would actually be identical.

So I don't think we should even be trying to mark sections as mergeable unless we walk all elements in the section and make sure it's even safe to apply these.

This feels like it could cause a pretty serious regression. This essentially disables global merging with -fdata-sections, which I know at least one linker relies upon for code size.

Does it? Surely there's a test that would fail with this patch applied?

llvm/test/CodeGen/X86/explict-section-mergeable.ll
21

These CHECK-NOTs should be more explicit that the the section is not merge-able (M).

22

Any new tests should be added to llvm/test/CodeGen/Generic/section_mergeable_size.ll IMO. I assume this test changes that test's behavior?

So I don't think we should even be trying to mark sections as mergeable unless we walk all elements in the section and make sure it's even safe to apply these.

That's the conservative fix, yes. You can either:

  1. emit global objects with the same section name into a single section unit and then set the mergeable flag and entry size for that unit only if all the objects were mergeable and the same size, or
  2. emit global objects with the same section name into different section units based on their mergeability and entry size.

If the code isn't already set up to emit multiple section units with a name — which implies that there may also be an existing bug based on having mergeable and non-mergeable symbols with the same section name — then the former is probably much simpler. But if it's already set up to split into two section units, I can't imagine it'd be that difficult to split into more than two.

Given all that, this patch seems far too aggressive. While mergeable sections can be useful for optimizing arbitrary code that might not use a section, they are also extremely useful for optimizing the sorts of global tables that programmers frequently use explicit sections for. It seems to me that the right fix is to find the place that ensures that we don't put mergeable and non-mergeable objects in the same section unit (or at least conservatively makes the section unit non-mergeable) and fix it to consider entry size as well. That should be straightforward unless that place doesn't exist, in which case we have very serious problems.

Disagree (but I spent all day thinking about this). Going back to https://bugs.llvm.org/show_bug.cgi?id=31828, I think I solved the problem there incorrectly; we should be not marking the explicit section merge-able.

Thanks for the comments!

I am convinced that the current patch is too pessimistic, size optimizations can be functional requirements in memory constrained areas.

There are two frontend attributes involved:

pragma clang section ... - https://clang.llvm.org/docs/LanguageExtensions.html#specifying-section-names-for-global-objects-pragma-clang-section
_attribute_ ((section ("section-name"))) - https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate

At the IR level you can query for explicit sections assignment with code like:

static bool inExplicitSection(const GlobalObject *GO, SectionKind Kind) {
  if (GO->hasSection())
    return true;

  if (auto *GVar = dyn_cast<GlobalVariable>(GO)) {
    auto Attrs = GVar->getAttributes();
    if ((Attrs.hasAttribute("bss-section") && Kind.isBSS()) ||
        (Attrs.hasAttribute("data-section") && Kind.isData()) ||
        (Attrs.hasAttribute("rodata-section") && Kind.isReadOnly())) {
      return true;
    }
  }

  if (auto *F = dyn_cast<Function>(GO)) {
    if (F->hasFnAttribute("implicit-section-name"))
      return true;
  }

  return false;
}

The section of a global is used for _attribute_ ((section ("section-name"))) but it is also set in LLVM e.g. by optimization passes.
Attributes "bss-section", "data-section", "rodata-section" and "implicit-section-name" are only used for "clang section" pragmas.

As I commented upthread I would like input from the authors of the "clang section" pragma to understand whether the semantics of that pragma would allow for generating multiple output sections (with the same name). Assuming that this pragma implies a single output section I have an new proposal for a less pessimistic fix:

  1. Sections generated for "clang section" pragmas are never made mergeable.
  2. It is an error if incompatible globals are assigned to a section with the same name.

This works as the user can easily fix any "_attribute_ ((section ("section-name")))" in their source code and internal uses in LLVM code can also make sure that incompatible globals are assigned to differently named sections. Thoughts? I'll update the patch shortly unless there are objections.

I can't speak for the pragma authors, but I strongly doubt the pragma is intended to force all affected globals to go into a single section unit, since the division into section units is purely an object-file representation issue.

Looks to me like MCContext::getELFSection should be including the entry size and flags as part of the uniquing key.

I can't speak for the pragma authors, but I strongly doubt the pragma is intended to force all affected globals to go into a single section unit, since the division into section units is purely an object-file representation issue.

I'm thinking of embedded platforms where there is no or only very primitive linkers. There is also the advantage of producing similar output to GCC. Creating multiple output sections is not without risk - for example, it means that symbols can be re-ordered (relative to their original order in source files) which could cause a change in behaviour.

Perhaps I am being too cautious - we have had bugs and reviews open for long enough for anyone interested to comment. The way forward now, I think, is to make a reasonable fix and see what breaks.

Looks to me like MCContext::getELFSection should be including the entry size and flags as part of the uniquing key.

Agreed. I will update the patch to do that.

I can't speak for the pragma authors, but I strongly doubt the pragma is intended to force all affected globals to go into a single section unit, since the division into section units is purely an object-file representation issue.

I'm thinking of embedded platforms where there is no or only very primitive linkers. There is also the advantage of producing similar output to GCC. Creating multiple output sections is not without risk - for example, it means that symbols can be re-ordered (relative to their original order in source files) which could cause a change in behaviour.

Yes, that's a very fair point. Of course, mergeability in general can cause this — but the fact that it can doesn't mean it does in practice.

The way forward now, I think, is to make a reasonable fix and see what breaks.

I completely agree.

John.

Just some more thoughts after being able to sleep on it.

The merge-able sections is generally handy for sections which I'm not specific about (in the added test case, the "implicit" sections). For example, if I have global const data that I don't take the address of and has internal linkage, then rather it be in .rodata, it's handy for LLVM to suffix it into separate .rodata.cst4, .rodata.cst8, etc. sections that ARE merge-able and have uniform entity sizes.

But, if I explicitly state that I want something (code or data) in a specific section, it would be surprising to me if the compiler changed the section I explicitly specified to put that in anything other than what I specified, and that included adding suffixes (like .cst4, .cst8, etc). Especially if I was manually performing relocations manually via libelf, or like the Linux kernel does. If I put data in .foo, and try to look it up at runtime explicitly, I might be surprised if it was moved to .foo.cst8, for example.

If I'm explicitly putting code or data in explicit sections and actually care about those sections being merge-able, than I'd better also be proficient enough in linker scripts to explicitly merge those sections together myself. Otherwise, I don't see GCC putting explicitly-sectioned data in merge-able sections, even with -fmerge-all-constants (which is already a dangerous flag to use) (GCC will put implicitly-sectioned data in merge-able if you specify -fmerge-all-constants, but that's not relevant for this patch which is about explicitly-sectioned code/data).

We're not talking about putting objects in a different section *in the image*. But ELF allows object files to contain an arbitrary number of what I've been calling "section units" that will be assembled into a single section in the image. In theory, we could even emit every object into a unique section unit (and in fact, we have to do that when the object is part of a COMDAT or has an associated symbol); it's just that doing so bloats the object file a bit (costing us compile time).

But ELF allows object files to contain an arbitrary number of what I've been calling "section units" that will be assembled into a single section in the image.

More precisely, in assembler, you can specify sections dis-jointly, but they will be rejoined when assembled into an ELF object, as ELF section names are unique and cannot be discontinuous.

But ELF allows object files to contain an arbitrary number of what I've been calling "section units" that will be assembled into a single section in the image.

More precisely, in assembler, you can specify sections dis-jointly, but they will be rejoined when assembled into an ELF object, as ELF section names are unique and cannot be discontinuous.

The ELF spec is explicit that files can contain multiple sections with the same name. This is necessary when working with groups but isn't restricted to that; for example, LLVM will also emit multiple sections for a single name+group pair when it has an associated section. The linker may join these in the image after it's done all the necessary processing, but I don't think it's actually required to.

As for what ELF assemblers actually support, that's of course a separate story. They presumably produce unique sections by at least name+group pairs, or else COMDATs will be totally broken. I don't know how the handling for associated sections works when LLVM emits to assembly vs. just producing the object file directly.

It's not abstractly unreasonable to also emit different sections based on mergeability and entry size. However, if doing so breaks current tools, that's not a reasonable option. The next best option would be to emit a single section per name+group, but only flag it mergeable if all the objects are identically mergeable. Unfortunately, I think MC isn't architected for that; the assembly writer wants to process globals one-by-one and can't retroactively change flags. So the final option is to stop trying to use ELF mergability for unnamed_addr globals entirely, which I think we all agree is undesirable.

I don't think any sort of frontend-based solution is reasonable. IR should be able to freely mark globals with sections and unnamed_addr, and it's the backend's responsibility to emit the best code it can for what's been written. If the current state of the rest of the toolchain means we can't pursue some optimization, so be it.

Sorry for the delay in updating this.

I hoped to post an updated patch today for this but I realized that I first need to investigate whether section relative references might interact badly with multiple sections with the same name.. I will be able to update the patch tomorrow.

Below is the code comment from the new patch explaining the new approach, please take a look and see if you have any questions/comments:

// If two globals with differing sizes end up in the same mergeable
// section that section can be assigned an incorrect entry size. Normally,
// the assembler avoids this by putting incompatible globals into
// differently named sections. However, globals can be explicitly assigned
// to a section by specifying the section name. In this case, if unique
// section names are available (-unique-section-names in LLVM) then we
// bin compatible globals into different mergeable sections with the same name.
// Otherwise, if incompatible globals have been explicitly assigned to section by a
// fine-grained/per-symbol mechanism (e.g. via  _attribute_((section(“myname”)))) then
// we issue an error and the user can then change the section assignment. If the
// section assignment is not via a fine-grained means (e.g. via pragma clang section)
// then we simply do not make the resulting section mergeable as there is nothing
// that the user can easily change to fix the resulting problem.

The solution described in that comment is not acceptable. We are not going to tell users that they cannot assign symbols explicitly to sections because LLVM is unable to promise not to miscompile if they do. It is LLVM's responsibility to correctly compile valid code; enabling mergeability for a section containing unnamed_addr symbols is an optimization, and if it is not safe, it needs to be disabled until we can figure out a way to make it safe.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Did you investigate these?

The solution described in that comment is not acceptable. We are not going to tell users that they cannot assign symbols explicitly to sections because LLVM is unable to promise not to miscompile if they do. It is LLVM's responsibility to correctly compile valid code; enabling mergeability for a section containing unnamed_addr symbols is an optimization, and if it is not safe, it needs to be disabled until we can figure out a way to make it safe.

Thanks for the feedback. Apologies that you had to repeat yourself.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Did you investigate these?

I looked into these today. I think we can do the first of these. I have put a WIP patch up here: https://reviews.llvm.org/D72194. Could you comment on the approach taken?

I looked into these today. I think we can do the first of these. I have put a WIP patch up here: https://reviews.llvm.org/D72194. Could you comment on the approach taken?

Thank you. Commented.

Below is the code comment from the new patch explaining the new approach, please take a look and see if you have any questions/comments:

// If two globals with differing sizes end up in the same mergeable
// section that section can be assigned an incorrect entry size. Normally,
// the assembler avoids this by putting incompatible globals into
// differently named sections. However, globals can be explicitly assigned
// to a section by specifying the section name. In this case, if unique
// section names are available (-unique-section-names in LLVM) then we
// bin compatible globals into different mergeable sections with the same name.

Looks good up to here.

// Otherwise, if incompatible globals have been explicitly assigned to section by a
// fine-grained/per-symbol mechanism (e.g. via  _attribute_((section(“myname”)))) then
// we issue an error and the user can then change the section assignment. If the

No bueno. The Linux kernel has code where there are 2D global arrays of different entity sizes explicitly placed in different sections (together). GCC does not error for this, (it doesn't mark the section merge-able), neither should we.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Of the above, I really think #2 is the only complete solution.

Below is the code comment from the new patch explaining the new approach, please take a look and see if you have any questions/comments:

// If two globals with differing sizes end up in the same mergeable
// section that section can be assigned an incorrect entry size. Normally,
// the assembler avoids this by putting incompatible globals into
// differently named sections. However, globals can be explicitly assigned
// to a section by specifying the section name. In this case, if unique
// section names are available (-unique-section-names in LLVM) then we
// bin compatible globals into different mergeable sections with the same name.

Looks good up to here.

// Otherwise, if incompatible globals have been explicitly assigned to section by a
// fine-grained/per-symbol mechanism (e.g. via  _attribute_((section(“myname”)))) then
// we issue an error and the user can then change the section assignment. If the

No bueno. The Linux kernel has code where there are 2D global arrays of different entity sizes explicitly placed in different sections (together). GCC does not error for this, (it doesn't mark the section merge-able), neither should we.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Of the above, I really think #2 is the only complete solution.

Can you explain why you think #1 is not "complete"? All three seem to establish correctness; I can see how giving up on the optimization (#3) is not a great solution, but #1 doesn't have that problem and actually preserves it in more places. To be clear, this is just taking advantage of the ability to have multiple sections with the same name in an ELF object file; they will still be assembled into a single section in the linked image.

My understanding is that changing the flags on an MCSection retroactively is pretty counter to the architecture, so I'm not sure how we'd actually achieve #2.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Of the above, I really think #2 is the only complete solution.

Can you explain why you think #1 is not "complete"? All three seem to establish correctness; I can see how giving up on the optimization (#3) is not a great solution, but #1 doesn't have that problem and actually preserves it in more places. To be clear, this is just taking advantage of the ability to have multiple sections with the same name in an ELF object file; they will still be assembled into a single section in the linked image.

My understanding is that changing the flags on an MCSection retroactively is pretty counter to the architecture, so I'm not sure how we'd actually achieve #2.

Ah, ok, I reread https://reviews.llvm.org/D72194 and see that it's creating non-contiguous sections (option 1), with unique entity sizes. That should be fine. Dissenting opinion retracted. We should prefer https://reviews.llvm.org/D72194 with the commit message updated.

Thanks for the feedback. I'm sorry that I am being slow on this... v.busy at the moment. Will try to finish the patch in the next few days.

I laid out a series of three options before:

  • Emit different object-file sections for different mergeability settings.
  • Only mark an object-file section as mergeable if all the symbols in it would have the same mergeability settings.
  • Stop implicitly using mergeability for "ordinary" sections (i.e. sections other than the string section).

Of the above, I really think #2 is the only complete solution.

Can you explain why you think #1 is not "complete"? All three seem to establish correctness; I can see how giving up on the optimization (#3) is not a great solution, but #1 doesn't have that problem and actually preserves it in more places. To be clear, this is just taking advantage of the ability to have multiple sections with the same name in an ELF object file; they will still be assembled into a single section in the linked image.

My understanding is that changing the flags on an MCSection retroactively is pretty counter to the architecture, so I'm not sure how we'd actually achieve #2.

Ah, ok, I reread https://reviews.llvm.org/D72194 and see that it's creating non-contiguous sections (option 1), with unique entity sizes. That should be fine. Dissenting opinion retracted. We should prefer https://reviews.llvm.org/D72194 with the commit message updated.

I have updated the commit message and patch on https://reviews.llvm.org/D72194.

I am abandoning this in favor of https://reviews.llvm.org/D72194.

bd1976llvm abandoned this revision.Jan 13 2020, 5:33 PM
llvm/test/CodeGen/X86/explict-section-mergeable.ll