This is an archive of the discontinued LLVM Phabricator instance.

[ELF] report section sizes when output file too large
ClosedPublic

Authored by inglorion on Jan 12 2021, 3:34 PM.

Details

Summary

Fixes PR48523. When the linker errors with "output file too large",
one question that comes to mind is how the section sizes differ from
what they were previously. Unfortunately, this information is lost
when the linker exits without writing the output file. This change
makes it so that the error message includes the sizes of the largest
sections.

Diff Detail

Event Timeline

inglorion created this revision.Jan 12 2021, 3:34 PM
inglorion requested review of this revision.Jan 12 2021, 3:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2021, 3:34 PM

My feeling is similar to D94141: use either -M or --noinhibit-exec to get an output.

My feeling is similar to D94141: use either -M or --noinhibit-exec to get an output.

Thank you for reviewing. Quoting from D94141:

We accept new features very cautiously. Adding an option has maintenance costs and recognition cost if some options have overlapping features. You can dig up the history for previous options. They need to show values to the community. For this one, I don't think it provides any additional benefit and can just confuse users as it duplicates --print-map/-M. If you want the output, add --noinhibit-exec and inspect the output with readelf -S.

I do want to point out that the intent of the present change is to aid LLD maintenance. Note that it does not add any additional command-line options; it just provides a more informative error message in a specific failure scenario. In that sense, it's similar in spirit to printing backtraces on crashes or providing "referenced by" information for undefined symbols, both of which we do. It helps us in cases where something in LLVM changes and pushes us over the size limit, by saving the requirement to reproduce the failure just to get information that the linker already had when it failed.

Does that change the calculus?

As for the specific alternatives you suggested:

--noinhibit-exec does not actually write an output file when we hit the "output file too large" error. So that doesn't help us here. I'm not sure what such an output file would look like or if it would be possible to parse it with readelf even if we did write it, because the limit comes from things like offsets and section sizes larger than 4GB not being representable in ELF32.

-M may help, but in order to support the use case of figuring out how section sizes changed between a good run and a bad run, we would need to either reproduce the bad run, or run with -M even in the vast majority of cases where we don't need it. Since we are talking about huge outputs here, this presumably adds appreciable cost to every run.

I did consider an alternative where we add an option to print the map only if the link failed (which I think we don't already have), but that actually seems more of a future commitment than not adding an option and just making the error message more informative. If you do want to go that way, I would be happy to implement it, though.

grimar added a comment.EditedJan 14 2021, 12:05 AM

FWIW, I think we could support this, but I'd implement it in a much more trivial way.
E.g. I think it can be just a 2 lines loop which would iterate over all output sections and print their names ans sizes.
No need to sort them or to calculate fields width.

Something like:

if (fileSize != size_t(fileSize) || maxSize < fileSize) {
  std::string msg = "output file too large: " + Twine(fileSize) + " bytes\n"
    + "Output sections:\n";
  for (OutputSection *s : outputSections)
     msg += ...;
  error(s.str());
  return;
}

I don't feel strong about this suggestion though.

I wonder whether the issue could be better solved by adding something (probably output section sizes as @grimar suggests) to the verbose output? That would allow us to keep the error message more concise, whilst still providing a developer the ability to answer the question of "where does the data go?"

Thanks for the comments!

E.g. I think it can be just a 2 lines loop which would iterate over all output sections and print their names ans sizes.
No need to sort them or to calculate fields width.

The reason I do the sorting is because there can potentially be very many sections. Imagine a large binary with -fdata-sections -ffunction-settings. Similar to how we have an error limit to avoid spewing a bazillion error messages, I thought it would be a good idea to show the largest few sections, but not every handful of bytes datum or function in the program. That comes at the cost of adding a call to std::partial_sort_copy(), which doesn't seem too much of a maintenance headache.

As for formatting, I would be happy to remove that if we feel that would be better. I figured some sense of consistency between output formats would be good, but it's not actually the same as objdump -h anyway, so maybe it's not worth the code? Let me know what you think.

I wonder whether the issue could be better solved by adding something (probably output section sizes as @grimar suggests) to the verbose output?

Not sure I understand what precisely you are suggesting. Are you saying to only show this information when --verbose is passed?

grimar added a comment.EditedJan 15 2021, 12:29 AM

Thanks for the comments!

E.g. I think it can be just a 2 lines loop which would iterate over all output sections and print their names ans sizes.
No need to sort them or to calculate fields width.

The reason I do the sorting is because there can potentially be very many sections. Imagine a large binary with -fdata-sections -ffunction-settings.

I might be missing something, but -fdata-sections -ffunction-settings affects on number of input sections, but here we iterate over output sections.
I believe that the number of output sections is usually small. E.g. I'd not expect to see more than 10-50 in most of the common cases?

I wonder whether the issue could be better solved by adding something (probably output section sizes as @grimar suggests) to the verbose output?

Not sure I understand what precisely you are suggesting. Are you saying to only show this information when --verbose is passed?

I think this is what @jhenderson meant, yes.

An alternative to this could probably be changing the Writer<ELFT>::assignFileOffsets()
It has the following loop:

for (OutputSection *sec : outputSections) {
  if (sec->type == SHT_NOBITS)
    continue;
  if ((sec->offset > fileSize) || (sec->offset + sec->size > fileSize))
    error("unable to place section " + sec->name + " at file offset " +
          rangeToString(sec->offset, sec->size) +
          "; check your linker script for overflows");
}

Perhaps it is possible to check for overflowing maxSize too right there?

I wonder whether the issue could be better solved by adding something (probably output section sizes as @grimar suggests) to the verbose output?

Not sure I understand what precisely you are suggesting. Are you saying to only show this information when --verbose is passed?

I think this is what @jhenderson meant, yes.

Yes, that is what I meant. We could limit it to only print the information on overflow and --verbose is used, on stdout like other verbose output.

inglorion updated this revision to Diff 317055.Jan 15 2021, 1:05 PM

Indeed, there are not that many output sections. I simplified the code
by removing the sorting and formatting.

inglorion updated this revision to Diff 317073.Jan 15 2021, 2:09 PM

removed now unused <algorithm>

inglorion updated this revision to Diff 317074.Jan 15 2021, 2:11 PM

also removed unused Format.h include

You may add tests to test/ELF/linkerscript/output-too-large.s. You'll need at least another output section to demonstrate the effects.

Technically printing everything can result in large amount of output in -r mode.. but perhaps the combination of output-size-overflow and (-r || linker scripts with underspecified input section descriptions) is too rare..

MaskRay added inline comments.Jan 15 2021, 2:23 PM
lld/ELF/Writer.cpp
2878
2880

(Nit: " " can be ' ' (slightly smaller output))

inglorion updated this revision to Diff 317106.Jan 15 2021, 4:16 PM

implemented MaskRay's suggestions, added tests

MaskRay added inline comments.Jan 15 2021, 4:22 PM
lld/test/ELF/linkerscript/output-too-large.s
32 ↗(On Diff #317106)

.text preceding .data is fairly stable.

You can just use -NEXT: to strengthen the test.

(DAG: does not check there is no interleaving line)

Thanks again.

I chose DAG on purpose because I think it more accurately expresses what we're really asserting.

First, we since we removed the sorting, we're not guaranteeing that these will be in any particular order. Given that, I would rather not make the test depend on the order.

As for interleaving lines: we're not really guaranteeing there won't be any other sections, either. In fact, if I remove the parts that cause linking to fail, there is also a .comment section in the output.

So we want this to report at least the .text and .data section which are in the input, with the correct sizes, but we're not making any promises about these being the only sections or in which order the sections are listed. Which is exactly what DAG gives us.

Can we keep this as-is?

grimar added a comment.EditedJan 18 2021, 12:22 AM

Thanks again.

I chose DAG on purpose because I think it more accurately expresses what we're really asserting.

First, we since we removed the sorting, we're not guaranteeing that these will be in any particular order. Given that, I would rather not make the test depend on the order.

Actually we give a guarantee: the order of output sections must be stable. When using -NEXT we verify that we print section sizes in the same order as they present in the file.
Using -DAG relaxes this check. Also, the current version of the test case looks like we have a sorting, though we don't.

So, personally I also think that using -NEXT is would be better here. Also wonder what @jhenderson thinks.

Thanks again.

I chose DAG on purpose because I think it more accurately expresses what we're really asserting.

First, we since we removed the sorting, we're not guaranteeing that these will be in any particular order. Given that, I would rather not make the test depend on the order.

Actually we give a guarantee: the order of output sections must be stable. When using -NEXT we verify that we print section sizes in the same order as they present in the file.
Using -DAG relaxes this check. Also, the current version of the test case looks like we have a sorting, though we don't.

So, personally I also think that using -NEXT is would be better here. Also wonder what @jhenderson thinks.

I think there are arguments for both sides, and I'm on the fence as to which is better in this case. If I were writing this myself, I'd probably have gone with -NEXT on the basis that we want to test the output is exactly what we expected to produce (no more, no less).

inglorion updated this revision to Diff 318070.Jan 20 2021, 5:16 PM

Thanks again for the comments. I've modified the test to use -NEXT as
requested.

grimar accepted this revision.Jan 20 2021, 11:29 PM

This LGTM, but please wait for others.

This revision is now accepted and ready to land.Jan 20 2021, 11:29 PM
jhenderson accepted this revision.Jan 21 2021, 12:08 AM

LGTM too.

MaskRay accepted this revision.Jan 21 2021, 10:19 AM
This revision was automatically updated to reflect the committed changes.

Thanks, everyone!