This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Do not remove empty sections referenced in LOADADDR/ADDR commands.
ClosedPublic

Authored by grimar on Nov 16 2018, 2:28 AM.

Details

Summary

This is https://bugs.llvm.org//show_bug.cgi?id=38750.

If script references empty sections in LOADADDR/ADDR commands

.empty  : { *(.empty ) }
.text   : AT(LOADADDR (.empty) + SIZEOF (.empty)) { *(.text) }

then an empty section will be removed and LOADADDR/ADDR will evaluate to null.
It is not that user may expect from using of the generic script, what is a common case.

There is a trivial fix for that, so I would do that.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

grimar created this revision.Nov 16 2018, 2:28 AM
ruiu added a comment.Nov 29 2018, 11:12 AM

I really don't like to add more code to the linker script support to make it more and more compatible with GNU, as there is not official language spec and there will always be an incompatibility. How much important is this patch? I think you should discuss that more before start trying to fix. We should not try to "fix" all the reported "issues".

grimar added a reviewer: psmith.EditedNov 30 2018, 1:07 AM

+Peter, interesting to know his opinion about how much useful this one might be (or not) for embedded soft.

We already don't remove an empty section if its expression contains symbols. It seems consistent step to keep the referenced sections as well ten.
We do produce a broken output here now. Personally, I am not a user of tricky linker script features, but I think I am observing such constructions like
AT(ADDR (.section_name) + 0xABC) often in scripts in the wild. There is a difference in how do we handle the empty sections in LLD with what bfd do.
Some differences probably will never be fixed, but some can be fixed easily. I do not understand why we would want to leave the situation (fixed by this patch) "as is" when we:

  1. Already spent some time on fixing the others different cases relative to empty sections to improve consistency with the bfd.
  2. Silently produce a broken output now and reporting the error would be more complicated than actual fix perhaps.
  3. Have a user who faced and reported the issue (and an unknown number of people who did not yet) and a patch that is trivial and consistent with the overall LLD story about empty sections.

I think that this type of script is common in embedded systems. A company will have a product family represented by a common core + variable feature sets, they often have a single linker script that is shared between the projects. This often results in some Output Sections being empty for some of the products. On larger projects it can be difficult to know when this will occur, for example all it takes is one developer to remove the use of a library to silently break a working program.

For this particular example, I don't think it is necessary to keep the .empty section in the ELF file; however I think that it is important for the calculation of the addresses of .text and .data to be either correct or we give an error message. It may be that forcibly keeping .empty is the easiest way of achieving that. An error message is not particularly friendly but it would at least alert the user that they would need to add zero sized sections or symbols to make sure the section was kept.

For this particular example, I don't think it is necessary to keep the .empty section in the ELF file; however I think that it is important for the calculation of the addresses of .text and .data to be either correct or we give an error message. It may be that forcibly keeping .empty is the easiest way of achieving that. An error message is not particularly friendly but it would at least alert the user that they would need to add zero sized sections or symbols to make sure the section was kept.

Thanks, Peter! I suddenly noticed that seems it would be not hard to report an error instead. I'll try to prepare the alternative patch doing that to check what is needed.

This could also occur with ALIGNOF and SIZEOF. It looks like Script->getOrCreateSectionName() is only called when a part of a script references another, whereas SECTIONS uses Script->createSectionName(). Might it be better to set ForceKeepIfEmpty in that function.

This could also occur with ALIGNOF and SIZEOF. It looks like Script->getOrCreateSectionName() is only called when a part of a script references another, whereas SECTIONS uses Script->createSectionName(). Might it be better to set ForceKeepIfEmpty in that function.

Note, I have SIZEOF in the test.

I tried to isolate and minimize the effect, so to keep the empty section only when/if it is really needed.
ALIGNOF and SIZEOF would return 0 for an empty removed section, that seems to be OK just to ignore them.

I think that this change is worth making as it is a reasonable to expect ADDR and LOADADDR to return a valid result even when empty. I note that with wildcards the author may not know ahead of time whether the OutputSection will be empty or not so this doesn't just affect people intentionally using empty OutputSections.

I've take another look and have made some suggestions to make it easier to understand.

ELF/LinkerScript.cpp
827–828

I think it might be best to keep Sec.ExpressionsUseSymbols and introduce a new bool variable called something like UsedInExpression and set that whenever ADDR and LOADADDR uses the name. Even though the effect (keep section) is the same, I think the reason to keep it is different and it gets lost in the comment. Perhaps something like:

// We do not want to remove OutputSections with expressions that reference symbols even if the OutputSection is empty. We want to ensure that the expressions can be evaluated and report an error if they cannot.
if (Sec.ExpressionsUseSymbols)
  return false;

// OutputSections may be referenced by name in ADDR and LOADADDR expressions, as an empty Section can has a valid VMA and LMA we keep the OutputSection to maintain the integrity of the other Expression.
if (Sec.UsedInExpression)
  return false;
grimar updated this revision to Diff 196607.Apr 25 2019, 3:25 AM

Thanks, Peter! I applied your suggestion and improved the test case.

ruiu accepted this revision.Apr 25 2019, 4:33 AM

LGTM

This revision is now accepted and ready to land.Apr 25 2019, 4:33 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2019, 11:57 PM