This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Simplify SHT_NOBITS -> SHT_PROGBITS promotion
ClosedPublic

Authored by MaskRay on Apr 3 2019, 3:05 AM.

Details

Summary

GNU objcopy uses bfd_elf_get_default_section_type to decide the candidate section type,
which roughly translates to our [a] (I assume SEC_COMMON implies SHF_ALLOC):

(!(Sec.Flags & ELF::SHF_ALLOC) || Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))

Then, it updates the section type in bfd/elf.c:elf_fake_sections if:

if (this_hdr->sh_type == SHT_NULL)
  this_hdr->sh_type = sh_type; // common case
else if (this_hdr->sh_type == SHT_NOBITS
         && sh_type == SHT_PROGBITS
         && (asect->flags & SEC_ALLOC) != 0)  // uncommon case
  ...
  this_hdr->sh_type = sh_type;

If the following condition is met the uncommon branch is executed:

  if (elf_section_type (osec) == SHT_NULL
      && (osec->flags == isec->flags
	  || (final_link
	      && ((osec->flags ^ isec->flags)
		  & ~(SEC_LINK_ONCE | SEC_LINK_DUPLICATES | SEC_RELOC)) == 0)))

I suggest we just ignore this clause and follow the common case
behavior, which is done in this patch. Rationales to do so:

If --set-section-flags is a no-op (osec->flags == isec->flags)
(corresponds to the "readonly" test in set-section-flags.test), GNU
objcopy will require (Sec.Flags & ELF::SHF_ALLOC). [a] is essentially:

Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)

This special case is not really useful. Non-SHF_ALLOC SHT_NOBITS
sections do not make much sense and it doesn't matter if they are
SHT_NOBITS or SHT_PROGBITS.

For all other RUN lines in set-section-flags.test, the new behavior
matches GNU objcopy, i.e. this patch improves compatibility.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Apr 3 2019, 3:05 AM
MaskRay updated this revision to Diff 193462.Apr 3 2019, 3:13 AM

Update comment

I feel somewhat ambivalent about this change. The rationale makes sense (great patch description!), but these section flags provide GNU objcopy compatibility, so I'm not sure making it do things the "expected" way is actually helpful -- we should just provide separate flags if we want to do things the right way. (e.g. expose "sht_progbits" as a command line flag).

About the rule used in this patch: (!(Sec.Flags & ELF::SHF_ALLOC) || Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))

!(Sec.Flags & ELF::SHF_ALLOC)

The rationale is in the description.

SectionFlag::SecContents

In binutils-gdb/bfd/sections.c, the comment of SEC_HAS_CONTENTS is ".The section has contents - a data section could be . <<SEC_ALLOC>> | <<SEC_HAS_CONTENTS>>; a debug section could be . <<SEC_HAS_CONTENTS>>". This bit corresponds to our SecContents. If the user puts some stuff to the section, it can no longer be SHT_NOBITS (gABI says "A section of type SHT_NOBITS may have a non-zero size, but it occupies no space in the file.") This bit makes much sense to promote SHT_NOBITS -> SHT_PROGBITS.

SectionFlag::SecLoad

The comment of SEC_LOAD is "Tells the OS to load the section from the file when loading. This is clear for a .bss section." .bss is of SHT_NOBITS. I feel this rule still makes sense, but not much as SEC_HAS_CONTENTS does.

The other bits SectionFlag::SecCode | SectionFlag::SecData | ... make less sense thus they are remove from this patch.

Since llvm-objcopy contributors are adding more and more GNU objcopy options to llvm-objcopy, I feel people care about its compatibility with GNU objcopy. My preference is actually to keep just SectionFlag::SecContents and drop the other two, if I'm allowed to do so. This would certainly harm compatibility with GNU objcopy. What do people think?

Since llvm-objcopy contributors are adding more and more GNU objcopy options to llvm-objcopy, I feel people care about its compatibility with GNU objcopy. My preference is actually to keep just SectionFlag::SecContents and drop the other two, if I'm allowed to do so. This would certainly harm compatibility with GNU objcopy. What do people think?

Do you mean removing for e.g. --set-section-flags=data or whatever? I'm against dropping support for any flags that were added to be compatible with GNU, as most of them have been added with good reason, or are being relied on by end users to do what is expected. Certainly, I've run into some users of these switches, though I don't remember exactly which values.

MaskRay added a comment.EditedApr 4 2019, 1:54 AM

Since llvm-objcopy contributors are adding more and more GNU objcopy options to llvm-objcopy, I feel people care about its compatibility with GNU objcopy. My preference is actually to keep just SectionFlag::SecContents and drop the other two, if I'm allowed to do so. This would certainly harm compatibility with GNU objcopy. What do people think?

Do you mean removing for e.g. --set-section-flags=data or whatever? I'm against dropping support for any flags that were added to be compatible with GNU, as most of them have been added with good reason, or are being relied on by end users to do what is expected. Certainly, I've run into some users of these switches, though I don't remember exactly which values.

No. I mean

if (Sec.Type == SHT_NOBITS && Flags & SectionFlag::SecContents)
  Sec.Type = SHT_PROGBITS;

The current patch uses

if (Sec.Type == SHT_NOBITS &&
    (!(Sec.Flags & ELF::SHF_ALLOC) ||
     Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))
  Sec.Type = SHT_PROGBITS;

which has better compatibility with GNU objcopy than the existing code.

I don't feel strongly about this change either way. I think it is correct in the sense that it does what it sets out to achieve, but I have no opinion on whether we should aim for compatibility here. I do agree that non-alloc nobits sections makes no sense, but could there be a crazy build system where somebody modifies the flags in different runs, relying on the later run to add the SHF_ALLOC section?

tools/llvm-objcopy/ELF/ELFObjcopy.cpp
104 ↗(On Diff #193462)

I'd put "sections" between "SHT_NOBITS" and "do" here.

MaskRay updated this revision to Diff 193729.Apr 4 2019, 8:41 AM

put "sections" between "SHT_NOBITS" and "do"

MaskRay marked an inline comment as done.Apr 4 2019, 8:41 AM

I haven't bread or considered this patch. I just want to clarify when I think we should make changes of this type.

We should do things "the right way" at first and then comply more closely to GNU objcopy as real world cases come up. We should make minimal comparability attempts. Effectively think of it as "premature comparability isn't he root of all evil". We can optimize for compatibility when the case comes up.

As for working backwards from where we may have over optimized in the past and doing something more sensible now, that is kind of harder. To make such a change we need to know that we're not going to break anyone. Without evidence for that we have to reject such changes.

If Jordan can build all of Google's code with such a change then that is strong evidence that this is ok. Any other large or complicated code base would also be valid I think. These kinds of checks are very expensive however some should only request such evidence if the use case is very compelling.

I haven't bread or considered this patch. I just want to clarify when I think we should make changes of this type.

We should do things "the right way" at first and then comply more closely to GNU objcopy as real world cases come up. We should make minimal comparability attempts. Effectively think of it as "premature comparability isn't he root of all evil". We can optimize for compatibility when the case comes up.

As for working backwards from where we may have over optimized in the past and doing something more sensible now, that is kind of harder. To make such a change we need to know that we're not going to break anyone. Without evidence for that we have to reject such changes.

If Jordan can build all of Google's code with such a change then that is strong evidence that this is ok. Any other large or complicated code base would also be valid I think. These kinds of checks are very expensive however some should only request such evidence if the use case is very compelling.

FWIW, my original change to promote from NOBITS to PROGBITS for this flag wasn't caught by any internal code -- I think we are only using these flags in situations where the section is already PROGBITS anyway. It was caught by running the gnu binutils test suite against llvm-objcopy. My latest round (which has been a while) didn't show anything related to this area.

This changes seems pretty reasonable to me, but this gets into the ugly area of how close to GNU compatibility do we need to get. If someone is relying on GNU objcopy behavior to promote a NOBITS section to PROGBITS section only by setting the noload flag, then this change will break them, but that also seems like an extremely obscure/broken use of this tool that we shouldn't support. But I haven't done much investigation, hence I'm hoping someone else has an opinion :) if not I can take a more thorough look later

This changes seems pretty reasonable to me, but this gets into the ugly area of how close to GNU compatibility do we need to get.

That was exactly why I asked in a previous comment about your preference :)

Current patch:

if (Sec.Type == SHT_NOBITS &&
    (!(Sec.Flags & ELF::SHF_ALLOC) ||
     Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))
  Sec.Type = SHT_PROGBITS;

Alternative:

if (Sec.Type == SHT_NOBITS && Flags & SectionFlag::SecContents)
  Sec.Type = SHT_PROGBITS;

I thought about the rationale of the GNU objcopy behavior and put that in https://reviews.llvm.org/D60189#1454342

I'm trying to think where a NOBITS/PROGBITS distinction actually matters. Some relevant thoughts come to my mind:

  1. Where sections are in the middle of segments, they should always be PROGBITS, not NOBITS, otherwise addresses will get messed up (there is a special exception for .tbss and nested PT_TLS segments).
  2. If a section is PROGBITS when it doesn't need to be, then it will take up file space. Assuming it is filled with zeroes, this is harmless in terms of how the ELF behaves, and just impacts things like file size and probably load times.
  3. Converting a PROGBITS section to NOBITS can be harmful if the contents are important, as those contents will be lost.
  4. Non-alloc NOBITS sections are meaningless. Promoting them to PROGBITS will bloat the file size but otherwise have no effect.
  5. If there is a GNU binutils test for it, there's probably some good reason for any behaviour that I haven't thought of, so we should match GNU's behaviour, unless it is clearly dodgy.

Points 1-4 indicate to me that promoting NOBITS to PROGBITS under more cases is probably fine, as long as point 5 is observed.

MaskRay added a comment.EditedApr 17 2019, 7:19 PM

Then I'll stick with the current form (it is what the patch does):

if (Sec.Type == SHT_NOBITS &&
    (!(Sec.Flags & ELF::SHF_ALLOC) ||
     Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))

It is simpler and provides better compatibility with GNU objcopy than the current rule.

jhenderson accepted this revision.Apr 30 2019, 9:24 AM

LGTM, but you should get one of the others to sign off too.

This revision is now accepted and ready to land.Apr 30 2019, 9:24 AM
rupprecht accepted this revision.Apr 30 2019, 12:09 PM

I think this is a reasonable change -- it isn't strictly matching GNU objcopy in my tests, but we can update it if we find out this edge case is important.

This revision was automatically updated to reflect the committed changes.