This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Error if the linked-to section of a SHF_LINK_ORDER section is discarded
ClosedPublic

Authored by MaskRay on Sep 19 2019, 5:49 AM.

Details

Summary

If st_link(A)=B, and A has the SHF_LINK_ORDER flag, we may dereference
a null pointer if B is garbage collected (PR43147):

  1. In Wrter.cpp:compareByFilePosition, aOut->sectionIndex or bOut->sectionIndex
  2. In OutputSections::finalize, d->getParent()->sectionIndex

Simply error and bail out to avoid null pointer dereferences. ld.bfd has
a similar error:

sh_link of section `.bar' points to discarded section `.foo0' of `a.o'

ld.bfd is more permissive in that it just checks whether the linked-to
section of the first input section is discarded. This is likely because
it sets sh_link of the output section according to the first input
section.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Sep 19 2019, 5:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 19 2019, 5:49 AM

Thanks for the update. I'm happy with this approach. Will be good to hear other opinions as well.

grimar added inline comments.Sep 19 2019, 6:58 AM
ELF/Writer.cpp
1527 ↗(On Diff #220846)

Should we report this much earlier, on GC stage, in MarkLive.cpp for example?

I do not have a chance to check it today unfortunately, but looks it could be done somewhere here:

// Report garbage-collected sections.
if (config->printGcSections)
  for (InputSectionBase *sec : inputSections)
    if (!sec->isLive())
      message("removing unused section " + toString(sec));

?

MaskRay marked an inline comment as done.Sep 19 2019, 8:15 AM
MaskRay added inline comments.
ELF/Writer.cpp
1527 ↗(On Diff #220846)

markLive() is called before processSectionCommands(). Before processSectionCommands(), output sections are not allocated, and the link->getParent() check does not work.

We can change link->getParent() to link->isLive() if we really want to perform the check in markLive(). However, we have another rule: /DISCARD/ discards an input section as well as its dependentSections. If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list. If we discard b, a will be discarded as well (see discard-section-metadata.s). Having the check in markLive() would issue a premature error.

For the reasons above, we cannot perform the check in markLive().

Actually, I'm thinking if we can/should drop the rule:

If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list

I need to check for tests and previous commits on this.

If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list

I need to check for tests and previous commits on this.

If we drop the rule, we'll have to tune the SHF_LINK_ORDER rule a bit in MarkLive.cpp. As a plus(?), we'll improve compatibility with ld.bfd.

For linkerscript/discard-section-metadata.s (4 sections formed as an rooted tree via sh_link), ld.bfd errors:

% ld.bfd -T a.script a -o b
ld.bfd: b: sh_link of section `.bar' points to discarded section `.foo' of `a'

In the SHF_LINK_ORDER area, we seem to have some extensions that haven't been discussed with GNU developers. I am happy with the extensions but I think GNU developers should be informed.

  1. non-alloc SHF_LINK_ORDER https://sourceware.org/bugzilla/show_bug.cgi?id=25021
  2. If A has the SHF_LINK_ORDER flag && sh_link(A)=B, retained B implies retained A. The converse is not clear. https://sourceware.org/bugzilla/show_bug.cgi?id=24526 posted in May but no comments yet.

If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list

I need to check for tests and previous commits on this.

If we drop the rule, we'll have to tune the SHF_LINK_ORDER rule a bit in MarkLive.cpp. As a plus(?), we'll improve compatibility with ld.bfd.

For linkerscript/discard-section-metadata.s (4 sections formed as an rooted tree via sh_link), ld.bfd errors:

% ld.bfd -T a.script a -o b
ld.bfd: b: sh_link of section `.bar' points to discarded section `.foo' of `a'

Switching to ld.bfd behavior looks OK to me. It is a bit more strict and would simplify this patch it seems.
Interesting what other people think.

If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list

I need to check for tests and previous commits on this.

If we drop the rule, we'll have to tune the SHF_LINK_ORDER rule a bit in MarkLive.cpp. As a plus(?), we'll improve compatibility with ld.bfd.

For linkerscript/discard-section-metadata.s (4 sections formed as an rooted tree via sh_link), ld.bfd errors:

% ld.bfd -T a.script a -o b
ld.bfd: b: sh_link of section `.bar' points to discarded section `.foo' of `a'

I think that LLD is being more helpful in this case. If we don't discard the dependent sections when the dependee is discarded the user is guaranteed to get an error message, effectively telling them to add another explicit discard. The only benefit I can think of for the error message is that the user might think again about discarding the section. On the whole I don't think it matters that much though as this doesn't seem to come up a lot in practice with ld.bfd. Probably due to the use of /DISCARD/ tending to be used in environments like embedded systems where metadata tends not to be used as much.

In the SHF_LINK_ORDER area, we seem to have some extensions that haven't been discussed with GNU developers. I am happy with the extensions but I think GNU developers should be informed.

  1. non-alloc SHF_LINK_ORDER https://sourceware.org/bugzilla/show_bug.cgi?id=25021
  2. If A has the SHF_LINK_ORDER flag && sh_link(A)=B, retained B implies retained A. The converse is not clear. https://sourceware.org/bugzilla/show_bug.cgi?id=24526 posted in May but no comments yet.

I think 2 is required for SHT_ARM_EXIDX sections, it looks like ld.bfd and ld.gold special case this, whereas LLD has generalised this to all SHF_LINK_ORDER sections. The ELF spec isn't particularly clear in this area so I'm comfortable with the LLD behaviour in both cases.

If sh_link(a)=b && a has SHF_LINK_ORDER, we currently place a in b's dependentSections list

I need to check for tests and previous commits on this.

If we drop the rule, we'll have to tune the SHF_LINK_ORDER rule a bit in MarkLive.cpp. As a plus(?), we'll improve compatibility with ld.bfd.

For linkerscript/discard-section-metadata.s (4 sections formed as an rooted tree via sh_link), ld.bfd errors:

% ld.bfd -T a.script a -o b
ld.bfd: b: sh_link of section `.bar' points to discarded section `.foo' of `a'

I think that LLD is being more helpful in this case. If we don't discard the dependent sections when the dependee is discarded the user is guaranteed to get an error message, effectively telling them to add another explicit discard. The only benefit I can think of for the error message is that the user might think again about discarding the section. On the whole I don't think it matters that much though as this doesn't seem to come up a lot in practice with ld.bfd. Probably due to the use of /DISCARD/ tending to be used in environments like embedded systems where metadata tends not to be used as much.

Thanks for sharing your opinion. I take this as a convenience method for users who write /DISCARD/: dropping a section discards all SHF_LINK_ORDER sections that link to it. Handling SHF_LINK_ORDER the same way as SHT_REL[A] seem consistent. I'll file another feature request to ld.bfd on this.

In the SHF_LINK_ORDER area, we seem to have some extensions that haven't been discussed with GNU developers. I am happy with the extensions but I think GNU developers should be informed.

  1. non-alloc SHF_LINK_ORDER https://sourceware.org/bugzilla/show_bug.cgi?id=25021
  2. If A has the SHF_LINK_ORDER flag && sh_link(A)=B, retained B implies retained A. The converse is not clear. https://sourceware.org/bugzilla/show_bug.cgi?id=24526 posted in May but no comments yet.

I think 2 is required for SHT_ARM_EXIDX sections, it looks like ld.bfd and ld.gold special case this, whereas LLD has generalised this to all SHF_LINK_ORDER sections. The ELF spec isn't particularly clear in this area so I'm comfortable with the LLD behaviour in both cases.

+1

Ok. A suggestion about how to simplify it a little bit is below.

ELF/Writer.cpp
1527 ↗(On Diff #220846)

What about the following reordering of the code here then?

// The ARM.exidx section use SHF_LINK_ORDER, but we have consolidated
// this processing inside the ARMExidxsyntheticsection::finalizeContents().
if (!config->relocatable && config->emachine == EM_ARM &&
    sec->type == SHT_ARM_EXIDX)
  continue;

// Link order may be distributed across several InputSectionDescriptions
// but sort must consider them all at once.
std::vector<InputSection **> scriptSections;
std::vector<InputSection *> sections;
for (BaseCommand *base : sec->sectionCommands) {
  if (auto *isd = dyn_cast<InputSectionDescription>(base)) {
    for (InputSection *&isec : isd->sections) {
      scriptSections.push_back(&isec);
      sections.push_back(isec);

      InputSection *link = isec->getLinkOrderDep();
      if (!link->getParent())
        error(toString(isec) + ": sh_link points to discarded section " +
              toString(link));
    }
  }
}

if (errorCount())
  continue;
MaskRay updated this revision to Diff 221025.Sep 20 2019, 7:06 AM
MaskRay marked 2 inline comments as done.

feedback

grimar accepted this revision.Sep 20 2019, 7:09 AM

LGTM

This revision is now accepted and ready to land.Sep 20 2019, 7:09 AM
MaskRay updated this revision to Diff 221031.Sep 20 2019, 7:58 AM
MaskRay edited the summary of this revision. (Show Details)

Update description

This revision was automatically updated to reflect the committed changes.