This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Add --dwarf32-before-dwarf64 to sort DWARF32 input sections before DWARF64
Needs Review · Public

Authored by ikudrin on Feb 5 2021, 7:31 AM.

Details

Summary

DWARF64 debug info is intended for cases where a debug section can grow beyond 4 GiB, so that DWARF32 data cannot reference its upper part. An application is usually linked against third-party libraries, and, while users can control the debug info format of their own files, it can be difficult to enforce the 64-bit format for all inputs. Furthermore, because library members are typically added to the link after the inputs that reference them, their debug information also tends to be placed at higher offsets, which increases the probability of hitting this issue.

The patch adds a switch that can be used to reorder debug info in output sections so that DWARF64 data is placed after DWARF32.
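The reordering can be pictured with a minimal sketch. It assumes a hypothetical `DebugInputSection` record and hypothetical helper names; none of this is lld's actual data model or the code from the patch. It only shows why a stable partition by format keeps DWARF32 contributions within reach of 32-bit offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a debug input section; not lld's actual type.
struct DebugInputSection {
  std::string file; // name of the owning input file
  uint64_t size;    // contribution size in bytes
  bool isDwarf64;   // format assumed for the whole input file
};

// Move DWARF32 contributions to the front. std::stable_partition keeps the
// relative order inside each group, so the layout stays deterministic.
void placeDwarf32First(std::vector<DebugInputSection> &sections) {
  auto isDwarf32 = [](const DebugInputSection &s) { return !s.isDwarf64; };
  std::stable_partition(sections.begin(), sections.end(), isDwarf32);
  assert(std::is_partitioned(sections.begin(), sections.end(), isDwarf32));
}

// Return true if every DWARF32 contribution ends within the first 4 GiB of
// the output section, i.e. stays reachable through 32-bit offsets.
bool dwarf32FitsIn32Bits(const std::vector<DebugInputSection> &sections) {
  constexpr uint64_t limit = uint64_t{1} << 32; // 4 GiB
  uint64_t offset = 0;
  for (const DebugInputSection &s : sections) {
    offset += s.size; // end offset of this contribution
    if (!s.isDwarf64 && offset > limit)
      return false;
  }
  return true;
}
```

With a 3 GiB DWARF64 contribution followed by a 2 GiB DWARF32 one, the check fails in the original order and passes after the partition, because only the DWARF64 data then extends past the 4 GiB boundary.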

To simplify the implementation, the patch relies on the heuristic that all debug info in a single input file is stored in the same format. The first relocation in a .debug_info section points to a record in .debug_abbrev, and the type of that relocation depends on the format, so it can be used to infer the format of the file the section belongs to.
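A rough sketch of this heuristic follows. It assumes x86-64, where a 4-byte absolute reference is normally emitted as R_X86_64_32 and an 8-byte one as R_X86_64_64; the `Reloc` record and the function names are hypothetical and simplified, and other targets would need their own relocation types. It is an illustration of the idea, not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Relocation type values from the x86-64 psABI.
constexpr uint32_t R_X86_64_64 = 1;  // 64-bit absolute reference
constexpr uint32_t R_X86_64_32 = 10; // 32-bit absolute reference

enum class DwarfFormat { Unknown, Dwarf32, Dwarf64 };

// Hypothetical, simplified relocation record; not lld's actual type.
struct Reloc {
  uint64_t offset; // offset of the relocated field within .debug_info
  uint32_t type;   // target-specific relocation type
};

static bool byOffset(const Reloc &a, const Reloc &b) {
  return a.offset < b.offset;
}

// Guess the format of a whole input file from the first relocation of its
// .debug_info section. That relocation patches debug_abbrev_offset in the
// first unit header: a 4-byte field in DWARF32, an 8-byte field in DWARF64.
DwarfFormat guessFileFormat(const std::vector<Reloc> &debugInfoRelocs) {
  if (debugInfoRelocs.empty())
    return DwarfFormat::Unknown;
  // Assumption: relocations are sorted by offset, so front() is the first.
  assert(std::is_sorted(debugInfoRelocs.begin(), debugInfoRelocs.end(),
                        byOffset));
  switch (debugInfoRelocs.front().type) {
  case R_X86_64_32:
    return DwarfFormat::Dwarf32;
  case R_X86_64_64:
    return DwarfFormat::Dwarf64;
  default:
    return DwarfFormat::Unknown; // unexpected type: make no guess
  }
}
```

Only the first relocation is inspected, so a DWARF32 section may still contain 64-bit relocations later on (for example for addresses) without affecting the guess.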

The patch is based on D91404, which attempted to detect the format per section. Unfortunately, that approach does not work for some debug sections.

There were discussions on various mailing lists about the best way to tackle the issue, but no final decision was reached.

The main advantage of the proposed solution is that it is short, simple, and compatible with existing standards and tools. It does not degrade the performance of the linker for anyone who does not need it, yet resolves the issue in practice for those who come across it.

Event Timeline

ikudrin created this revision. Feb 5 2021, 7:31 AM
ikudrin requested review of this revision. Feb 5 2021, 7:31 AM
MaskRay added a comment. Edited Feb 5 2021, 9:48 AM

The reason I did not proceed with D91404 was that some replies in the generic-abi thread missed context, and we have not entirely lost the possibility of using a section type to distinguish DWARF32/DWARF64. Personally, I think how to make DWARF v4/v5 suitable could be a nice discussion for the standard, as v6 is still years away.

(
I was frustrated to see that we just settled on a linker option without actively seeking a proper binary format fix, even if Solaris folks could be against a section type/flag.
In practice, a lot of parties other than HP-UX and Solaris have converged on similar toolchains and we could proceed with a GNU flag, e.g.
https://sourceware.org/pipermail/binutils/2020-November/114191.html
)

There were so many ideas in the discussion that it looked like they were not going to converge. The idea of waiting for v6 does not resolve the issue for existing standards; it just postpones a possible solution for years, with a yet unknown result. Adding new section names, flags, types, etc., might look promising at first glance but requires updating lots of tools, including tools in different toolchains, which makes achieving the result even harder. That is why I feel it is necessary to illustrate my proposal with a complete patch, which aims to show that a simple, efficient, and standard-compliant solution is possible.

I've re-read a fair amount of the previous llvm-dev thread.
One goal of doing a patch like this was to collect some performance data; do we have that? In particular, performance data comparing link times with the option default-on versus default-off, so we can determine whether the difference is small enough that we should just do this processing unconditionally.

I'm also wondering whether it is reasonable to collect the is-32/64 characteristic of a section "along the way" during some other processing, rather than iterating over all input sections as a separate pass. It would have to be during or after the phase that attaches the reloc section to the input section of course.

lld/ELF/Driver.cpp, line 2417

Would a TimeTraceScope be helpful here? At least while gathering the initial performance data.

From what I remember, the discussion went back and forth with no real conclusion. I might be misremembering, so please correct me if I am wrong.
This patch can be a short/medium-term bridge to a more comprehensive solution. If I understand it correctly, it also deals with the problem of sections like .debug_loc.

I've re-read a fair amount of the previous llvm-dev thread.
One goal of doing a patch like this was to collect some performance data; do we have that? In particular, performance data comparing link times with the option default-on versus default-off, so we can determine whether the difference is small enough that we should just do this processing unconditionally.

As a model example, I took linking clang, where all LLVM libraries were in the DWARF32 format and clang libraries were in DWARF64. The partitioning took about 5ms out of 8,950ms total link time. Such a ratio was expected; there are no time-greedy parts in the partitioning code. Are these the numbers you are asking for?

I'm also wondering whether it is reasonable to collect the is-32/64 characteristic of a section "along the way" during some other processing, rather than iterating over all input sections as a separate pass. It would have to be during or after the phase that attaches the reloc section to the input section of course.

That would require way more string comparisons to find .debug_info input sections in each input file, right? In contrast, this patch does just a few such comparisons over the output sections, assuming that the associated input sections are all of the expected kind.

Ping. What should be done to proceed with this?