This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Stop setting output section size early
ClosedPublic

Authored by jhenderson on Sep 28 2017, 7:45 AM.

Details

Summary

This is an alternative to D38321. The motivation, as described in that patch, is the same:

The size of an OutputSection is calculated early, to aid handling of compressed debug sections. However, subsequent to this point, unused synthetic sections are removed. In the event that an OutputSection, from which such an InputSection is removed, is still required (e.g. because it has a symbol assignment), and no longer has any InputSections, dot assignments, or BYTE()-family directives, the size member is never updated when processing the commands. If the removed InputSection had a non-zero size (such as a .got.plt section), the section ends up with the wrong size in the output.

In this version, I have stopped estimating the value of the OutputSection Size and InputSection OutSecOff members early. There were a couple of issues I had to resolve, as part of this change:

  1. Compressed debug sections still need the Size set early, because they are not processed in assignAddresses. The original behaviour of setting the size early is maintained for such sections.
  2. The order of input sections is important for SHF_LINK_ORDER sections. This was previously relying on OutSecOff being an approximation for determining the final order. By making OutSecOff and OutputSection Size clearly not be the offset and size respectively, and instead making them represent indexes and section count, it should be much harder to misuse them.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

jhenderson created this revision.Sep 28 2017, 7:45 AM
ruiu edited edge metadata.Oct 5 2017, 8:26 PM

This is much more complicated than https://reviews.llvm.org/D38321, but I wonder if it has to be. I feel like this part of code is becoming out of control.

Speaking of this patch, why don't you assign offsets in maybeCompress?

ELF/SyntheticSections.cpp
844–848 ↗(On Diff #116981)

What happens when a user starts modifying the output using the linker script?

jhenderson updated this revision to Diff 118016.Oct 6 2017, 9:25 AM

Moved offset and size calculation of compressed debug sections into maybeCompress as suggested. I have not got a setup that is able to run the corresponding tests, because it requires zlib (I tried installing it on my linux copy of LLVM and it still didn't work). If anybody is able to run these tests, I'd much appreciate it.

In D38361#890146, @ruiu wrote:

This is much more complicated than https://reviews.llvm.org/D38321, but I wonder if it has to be. I feel like this part of code is becoming out of control.

Are you referring to the MipsGot stuff, or the addSection code? I feel like this change makes the addSection code more correct, if anything, as it stops setting a "false" value for OutputSection::Size.

ELF/SyntheticSections.cpp
844–848 ↗(On Diff #116981)

The output section sizes will be different to what is estimated here, meaning that the number of GOT entries may not be entirely correct. I think that the only way around this is if we properly loop over assignAddresses until the sizes are all stable. This is an existing issue with the current behaviour anyway, and I see that @peter.smith already mentioned this in D29983#681632:

In the general case we will need to assign addresses multiple times as it is possible to write linker scripts where adding thunks causes address expressions in the script to change in ways that breaks the address calculations.

The size calculation for the GOT relies on the size calculated by assignAddresses, but that is not final until after thunk creation. As things stand, I think the code is slightly broken anyway, but I think it might be fixable by calling assignAddresses again immediately before the updateAllocSize call. I will look a that shortly.

Apologies for the drive by comment, I need to leave the office fairly soon. In this specific case might it be simpler to only early calculate the size of non SHF_ALLOC content such as debug sections as the OutputSection doesn't get an address, just an offset and doesn't affect the addresses of any other SHF_ALLOC content, it is therefore safe to calculate the OutputSection size early.

In the general case I think the late addition and removal of content such as Thunks, Mips GOT entries, errata fixes and I'm guessing at RiscV relaxation are going to require some kind of iteration to a fixed point. I usually find iterating to a fixed point better than trying to predict addresses/sizes as it is easy to miss hard to predict corner cases. I'm wondering if it is possible to make this kind of iteration clearer by having a kind of Relaxation() stage that we can add functions that may need to update addresses and sizes to, this should hopefully make such code easier to understand. Probably a separate series of refactorings though.

I recommend splitting out the SHF_LINK_ORDER change into a separate patch if it is unrelated.

Apologies for the drive by comment, I need to leave the office fairly soon. In this specific case might it be simpler to only early calculate the size of non SHF_ALLOC content such as debug sections as the OutputSection doesn't get an address, just an offset and doesn't affect the addresses of any other SHF_ALLOC content, it is therefore safe to calculate the OutputSection size early.

Unfortunately, it isn't safe: non-alloc sections can still be manipulated in linker scripts. For example, you could do something like .foobar : { *(.foo); . += 16; *(.bar) } in your script, which leaves a 16-byte gap in the output section, whose size is not accounted for in the early calculation, and is only later fixed by assignAddresses. This isn't an issue for compressed debug sections, because we explicitly ignore custom layout in scripts for such sections.

In the general case I think the late addition and removal of content such as Thunks, Mips GOT entries, errata fixes and I'm guessing at RiscV relaxation are going to require some kind of iteration to a fixed point. I usually find iterating to a fixed point better than trying to predict addresses/sizes as it is easy to miss hard to predict corner cases. I'm wondering if it is possible to make this kind of iteration clearer by having a kind of Relaxation() stage that we can add functions that may need to update addresses and sizes to, this should hopefully make such code easier to understand. Probably a separate series of refactorings though.

I completely agree with this general idea, although "Relaxation" is another term that is getting a bit too over-loaded for my taste (like "finalize"), so I'd call it something else.

I recommend splitting out the SHF_LINK_ORDER change into a separate patch if it is unrelated.

It's not entirely unrelated, but it could be a separate prerequisite patch. I'll look at splitting it off.

I recommend splitting out the SHF_LINK_ORDER change into a separate patch if it is unrelated.

It's not entirely unrelated, but it could be a separate prerequisite patch. I'll look at splitting it off.

I've created D38687 for this. It could be a standalone change, but the motivation is less important in that situation. @ruiu, if you prefer that I do it as part of this change directly, I'm happy to do so.

I'm going to leave the corresponding changes in this diff, unless the other review is committed.

jhenderson added inline comments.Oct 9 2017, 8:28 AM
ELF/SyntheticSections.cpp
844–848 ↗(On Diff #116981)

Whilst attempting to avoid this estimating by calling assignAddresses again, I found another issue, which is also theoretically present prior to my changes: the DT_MIPS_LOCAL_GOTNO depends on the number of local got entries, but these are dependent on the size calculated in updateAllocSize(). Updating this size, e.g. due to additional thunks being created, might render the the dynamic tag value invalid.

Whilst attempting to avoid this estimating by calling assignAddresses again, I found another issue, which is also theoretically present prior to my changes: the DT_MIPS_LOCAL_GOTNO depends on the number of local got entries, but these are dependent on the size calculated in updateAllocSize(). Updating this size, e.g. due to additional thunks being created, might render the the dynamic tag value invalid.

My apologies for missing that at the time, unfortunately I'm not too familiar with the Mips ABI. It looks like it could be fixed by implementing DynamicSection::postThunkContents() and updating the value of DT_MIPS_LOCAL_GOTNO. Luckily this won't change the size of the .dynamic section so it can be delayed.

jhenderson updated this revision to Diff 118217.Oct 9 2017, 9:31 AM
  1. Reworked when assignAddresses and applySynthetic are called with regards to the MIPS GOT section, so that it has access to the right variables. This allows me to get rid of the estimating of sizes, so is more correct.
  2. Delayed setting the DT_MIPS_LOCAL_GOTNO tag until a new postThunkContents for DynamicSection (thanks for the suggestion @peter.smith).
  3. Removed unnecessary references to string table sections when calling postThunkContents, since they do not implement this function.
  4. Removed the finalizeContents implementation for MIPS GOT since it does not do anything useful any more (previously it only called updateAllocSize, but too early to be useful).
  5. Updated the SHF_LINK_ORDER-related changes that are in the latest version of D38687.
  6. Fixed the line breaks in a comment in the area I was changing.
jhenderson edited the summary of this revision. (Show Details)Oct 9 2017, 9:33 AM
ruiu added inline comments.Oct 9 2017, 5:40 PM
ELF/SyntheticSections.cpp
1162–1164 ↗(On Diff #118217)

Add blank lines before and after this so that it is clear which code this comment is talking about.

1180 ↗(On Diff #118217)

Is there any way other than backfilling DT_MIPS_LOCAL_GOTNO?

For example, .eh_frame_hdr depends on .eh_frame, and that dependency is handled simply by calling .eh_frame's writeTo before .eh_frame_hdr's writeTo. Unless there is a circular dependency, you don't need to backfill a value.

jhenderson added inline comments.Oct 10 2017, 4:45 AM
ELF/SyntheticSections.cpp
1180 ↗(On Diff #118217)

You're right, there isn't actually a circular dependency here. We can set the value during writeTo. There is an alternative option - move the adding of the tag (and the DT_NULL tag) into postThunkContents (or optionally all tags), since I don't think there's any dependency on the dynamic section size in thunk creation or the MIPS GOT creation.

jhenderson marked an inline comment as done.

Rebased and addressed review comments:

  1. Add blank lines around comment/code block.
  2. Revert addition of postThunkContents for dynamic sections and instead assign the value for DT_MIPS_LOCAL_GOTNO during DynamicSection::writeTo.
jhenderson edited the summary of this revision. (Show Details)

Rebased following landing of range-based thunks. This allowed me to remove the additional passes I had added to get the addresses right in updateAllocSize for the MIPS GOT section.

jhenderson edited the summary of this revision. (Show Details)
  • Removed changes to do with DT_MIPS_LOCAL_GOTNO. These are present in D39493, which must land first (otherwise this change breaks the MIPS GOT tests).
  • Removed removal of MipsGot->finalizeContents(). This function is unnecessary, but harmless, so to make this diff simpler, I'm removing it for now. A later change could remove it if desired.

I was re-verifying this patch, prior to a ping, but I discovered that when applied, it highlighted a problem with rLLD318924, resulting in test failures with this patch. The problem is that that commit is another example of code relying on the section Size before it is finalized (and getting it wrong for some linker scripts cases). I have filed https://bugs.llvm.org/show_bug.cgi?id=35478 for it.

This made me wonder whether changing Size and OutSecOff to llvm::Optional type would bring some benefits. It would ensure that people cannot use the fields before they have been set for the first time (which with this patch would be in the first call to assignAddresses). The downside is the need to dereference it everywhere where it is used, so could cause quite a bit of code churn. Also, presumably tests would pick up any misuse after this point, so it might not be all that useful. Thoughts?

My apologies for PR35478, I'll take a look this afternoon. The main intention there was to guard against 0 sized SyntheticSections as the user can't always be held responsible for them. Personally I'd be in favour of making those fields optional as I clearly didn't have enough tests to catch the problem there. I also think that our tests, for good reasons, tend to look at one feature at a time and we miss out on cases where combinations of features cause problems.

I think that this should now be unblocked after the reverting the change that caused PR35478. I'd very much like to see this land as I'd like to implement ARM.exidx table compression and that will depend on the OutSecPos changes here.

@rafael wrote:
I don't like the idea of adding a member for the section position. At no point we need the position and the offset simultaneously and having both variables means they can get inconsistent.

As far as I'm aware there is no case where the two could get inconsistent, once set - that would require the ability to reorder the sections in the output section.

One way to look at the current issue is that the early estimate of the offset is correct in the common case, which makes it really easy to use it incorrectly before the final value is computed.

I agree that using a value that is very different is likely to be spotted, although it's not guaranteed to be (especially if working with only a single InputSection in an output section).

If we make the value almost always wrong, that problem goes away. What do you think of the attached patch? We could rename OutSecOff to OutSecOffsetOrIndex, but that doesn't seem worth it.

Given that both you and @ruiu seem reluctant to have new members, I've tentatively adopted it with a few changes and additions to the comments. I don't like using a member for two different purposes still, even if those two purposes don't overlap in time, but I'd rather get this change in in at least some form. Could you let me know what you think of the diff with the comment changes, please. In particular, I've moved the comments relating to the Size and OutSecOff members to make their dual purpose clear at their point of definition, and I've not added the suggested additional comments to maybeCompress - they're unnecessary as we already have a comment in assignOffsets saying that we don't support custom layout for compressed debug sections.

I'm wondering if it is feasible to check and disable compression if a non input section description command is detected in the debug section being compressed?

It should be possible to at least produce an error message.

I'd support something like that (it should be easy to add to the loop I'm adding in maybeCompress), but not as part of this change.

jhenderson edited the summary of this revision. (Show Details)Dec 11 2017, 5:54 AM

No objections from me. I've successfully rebased D40964 onto this change.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 12 2017, 3:51 AM
This revision was automatically updated to reflect the committed changes.