This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Fix early overflow check in finalizeAddressDependentContent
ClosedPublic

Authored by andcarminati on Jun 5 2023, 8:15 AM.

Details

Summary

LLD terminates with errors when it detects overflows in the
finalizeAddressDependentContent calculation. Although, sometimes, those errors
are not really errors, but an intermediate result of an ongoing address
calculation. If we continue the fixed-point algorithm we can converge to the
correct result.

This patch

  • Removes the verification inside the fixed point algorithm.
  • Calls checkMemoryRegions at the end.

Diff Detail

Event Timeline

andcarminati created this revision.Jun 5 2023, 8:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2023, 8:15 AM
andcarminati requested review of this revision.Jun 5 2023, 8:15 AM

May I suggest an alternative approach? Instead of checking the memory regions when assigning addresses, move the check to its own function, something like verifyMemoryRegions() this could be called once after finalizeAddressDependent Content. It would require fewer changes to the code and would be working on final addresses.

One possible disadvantage of this approach, but I think that they are likely shared with suppressing the errors on the first pass, is that the contents of a memory region could spill into the next one. This could lead to overlapping addresses which may cause a different, more difficult to understand, error message. I don't think that this will be the case today, but is a possibility.

lld/ELF/LinkerScript.h
328

boolean values threaded through the code can make the code difficult to understand when looking at the callsite. True or false implies some change in behaviour but not what it is.

If this has to be threaded through I suggest using an enum with a descriptive name.

lld/ELF/Writer.cpp
1603

This is likely sufficient for most cases, but could there be others that require more than one pass to converge?

lld/test/ELF/linkerscript/end-overflow-check.test
64

Are all of these symbol assignments required to trigger the error? If not, could this be cut down to a minimal set?

andcarminati edited the summary of this revision. (Show Details)

Comments were addressed.

May I suggest an alternative approach? Instead of checking the memory regions when assigning addresses, move the check to its own function, something like verifyMemoryRegions() this could be called once after finalizeAddressDependent Content. It would require fewer changes to the code and would be working on final addresses.

One possible disadvantage of this approach, but I think that they are likely shared with suppressing the errors on the first pass, is that the contents of a memory region could spill into the next one. This could lead to overlapping addresses which may cause a different, more difficult to understand, error message. I don't think that this will be the case today, but is a possibility.

Hi Peter, thank you for the comments, I followed your suggestion of one single function at the end.

andcarminati added inline comments.Jun 6 2023, 4:59 AM
lld/test/ELF/linkerscript/end-overflow-check.test
64

Hi, until this moment I couldn't reduce more than it (it was originally an arm application). The root cause is:

REGION1__SR_SHIFT = REGION1__REGION_SHIFT_WITH_SR - 3;
`

Thanks for the update. The code changes look reasonable to me. I'd ideally like to give a suggestion on how to simplify the test but I'm a bit short on time at the moment. Will try to come back soon. With luck one of the other reviewers will have a suggestion.

lld/test/ELF/linkerscript/end-overflow-check.test
64

That's unfortunate, at a cursory glance it looked like some of these wouldn't be needed.

It looks like the key symbols are REGION1PRE_ALIGNMENT and REGION1POST_ALIGNMENT is it possible to make an example that still tests the change in behaviour but is much simpler, even if the example is contrived?

The example as it stands is difficult to follow and I'm not sure what the answer is supposed to be without tracing it through to make sure I understand it.

*The code was improved.
*The test was simplified.

Thanks for the update. The code changes look reasonable to me. I'd ideally like to give a suggestion on how to simplify the test but I'm a bit short on time at the moment. Will try to come back soon. With luck one of the other reviewers will have a suggestion.

Hi @peter.smith, you are right. I reduced by half the expressions by removing everything that converged in the first round. What do you think?

Thanks for the updates. I've made a small suggestion to simplify the code.

lld/ELF/LinkerScript.cpp
1454

A suggestion that simplifies the code a little bit:

static void verifyMemoryRegion(const MemoryRegion *memoryRegion,
                               const OutputSection *osec, uint64_t addr) {
  uint64_t osecEnd = addr + osec->size;
  uint64_t regionEnd = memoryRegion->getOrigin() + memoryRegion->getLength();

  if (osecEnd > regionEnd) {
    error("section '" + osec->name + "' will not fit in region '" +
          memoryRegion->name + "': overflowed by " +
          Twine(osecEnd - regionEnd) + " bytes");
  }
}

void LinkerScript::verifyMemoryRegions() const {
  for (const OutputSection *sec : outputSections) {
    if (const MemoryRegion *memoryRegion = sec->memRegion)
      verifyMemoryRegion(memoryRegion, sec, sec->addr);
    if (const MemoryRegion *lmaRegion = sec->lmaRegion)
      verifyMemoryRegion(lmaRegion, sec, sec->getLMA());
  }
}

All comments were addressed.

Thanks for the updates. I've made a small suggestion to simplify the code.

Hi @peter.smith, thank you for this suggestion, it looks better now.

peter.smith accepted this revision.Jun 8 2023, 10:21 AM

Looks good to me, thanks for the update. Will be worth waiting to see if MaskRay has any comments.

This revision is now accepted and ready to land.Jun 8 2023, 10:21 AM
andcarminati marked 5 inline comments as done.Jun 9 2023, 7:34 AM
MaskRay requested changes to this revision.Jun 9 2023, 1:38 PM

Thanks. The code looks good but the test needs simplification.

lld/ELF/LinkerScript.cpp
1456

delete the blank line after function argument list.

1459

delete this blank line as well.

lld/test/ELF/linkerscript/end-overflow-check.test
4

%ts is uncommon. rm -rf %t && split-file %s %t with optional && cd %t.

12

Omit ld.lld: for error messages.

48

Seems unused?

69

delete this blank line

85

Assigning to . with . = in an output section description has different semantics in ld.lld (and IMO GNU ld's semantics are inconsistent in some cases). Better to avoid the case. Use _abss ALIGN(...) :

This revision now requires changes to proceed.Jun 9 2023, 1:38 PM

All MaskRay's comments were addressed.

andcarminati marked 6 inline comments as done.Jun 12 2023, 1:34 AM
MaskRay accepted this revision.Jun 12 2023, 5:47 PM

Some nits

lld/ELF/LinkerScript.cpp
1454

Just name the parameter region.

lld/ELF/LinkerScript.h
342

I think this is "Verify" instead of "Verify for". Actually, checkMemoryRegions is better as we have checkSections.

lld/test/ELF/linkerscript/end-overflow-check.test
7
75

I want to request further cleanup, but having some moderately complex tests in the testsuite seems fine as well to ensure that we can parse these constructs.

This revision is now accepted and ready to land.Jun 12 2023, 5:47 PM

All MaskRay's comments were addressed.

andcarminati marked 5 inline comments as done.Jun 13 2023, 1:54 AM
andcarminati added a comment.EditedJun 13 2023, 1:59 AM

If there are no more comments, can you kindly commit this change, when you have time @MaskRay ?

Author's detail:

git commit --amend --author='Andreu Carminati <andreu.carminati@hightec-rt.com>'

Thank you!

run git-clang-format

MaskRay updated this revision to Diff 531543.Jun 14 2023, 3:25 PM
MaskRay edited the summary of this revision. (Show Details)

format summary

Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 3:25 PM
This revision was landed with ongoing or failed builds.Jun 14 2023, 3:26 PM
This revision was automatically updated to reflect the committed changes.

Looks like this or D151802 broke the ubsan ARM bot: https://lab.llvm.org/buildbot/#/builders/238/builds/3938. Could you please revert or fix?

Thanks!

Looks like this or D151802 broke the ubsan ARM bot: https://lab.llvm.org/buildbot/#/builders/238/builds/3938. Could you please revert or fix?

Thanks!

It would be this one. I suspect that it is the test rather than the code change:

REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32);

Where REGION1_PADDED_SR_SIZE isn't known till the second pass.

I think in the short term the test could be modified to clamp REGION1__PADDED_SR_SHIFT to an in range value. More thought would need to be given to whether LLD can protect itself against out of range shifts coming from scripts. I expect this could get fixed within a day assuming the author or MaskRay is around, it is the end of the day here and I need to leave in 5 minutes so I'm not in a good place to take action myself.

Looks like this or D151802 broke the ubsan ARM bot: https://lab.llvm.org/buildbot/#/builders/238/builds/3938. Could you please revert or fix?

Thanks!

It would be this one. I suspect that it is the test rather than the code change:

REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32);

Where REGION1_PADDED_SR_SIZE isn't known till the second pass.

I think in the short term the test could be modified to clamp REGION1__PADDED_SR_SHIFT to an in range value. More thought would need to be given to whether LLD can protect itself against out of range shifts coming from scripts. I expect this could get fixed within a day assuming the author or MaskRay is around, it is the end of the day here and I need to leave in 5 minutes so I'm not in a good place to take action myself.

Thanks to both for the report and analysis. I pushed daba24ee7bd3d80f085e2e5f057af6b873a78a24 to fix the test.

Looks like this or D151802 broke the ubsan ARM bot: https://lab.llvm.org/buildbot/#/builders/238/builds/3938. Could you please revert or fix?

Thanks!

It would be this one. I suspect that it is the test rather than the code change:

REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32);

Where REGION1_PADDED_SR_SIZE isn't known till the second pass.

I think in the short term the test could be modified to clamp REGION1__PADDED_SR_SHIFT to an in range value. More thought would need to be given to whether LLD can protect itself against out of range shifts coming from scripts. I expect this could get fixed within a day assuming the author or MaskRay is around, it is the end of the day here and I need to leave in 5 minutes so I'm not in a good place to take action myself.

Thank you for the analysis @peter.smith.

Build is green now. Thanks everyone!