In D58047 (Synthesise missing .ARM.exidx sections), Rui made the following comment:
We used to handle .ARM.exidx sections as regular sections with a sentinel section that is synthesized and added to the end. We later added code to merge contiguous .ARM.exidx sections, and now this patch is adding sentinels at various points. At this point, maybe it is easier to handle .ARM.exidx as a single synthetic section just like MergeInputSection? That synthetic section removes all input section whose name is .ARM.exidx from the input section list and add itself to the input section. That design gives you more flexibility than the current design, as the intermediate data representation doesn't have to be an InputSection.
This is an implementation of that idea: .ARM.exidx processing is handled inside a single SyntheticSection. It incorporates the changes in D58047 and reuses the test modifications. There is one additional test change to arm-data-prel.s to account for the single SyntheticSection. I've listed what I think are the advantages and disadvantages of the approach below. So far I've tested it on the LLD test suite. I'm going to do some more testing on real-world programs to make sure I haven't made any mistakes, but I think the approach can be made to work.
Assumptions:
- We can't get rid of SHF_LINK_ORDER as this is used by the sanitizers.
- There is at most one .ARM.exidx OutputSection per link unit. This assumption may be broken by pcc's proposal to split an executable into shared libraries, but I think that the design can be expanded to cover that case.
Requirements:
- .ARM.exidx InputSections form a single contiguous table with each entry a pair (Offset to function, Unwind instructions for function).
- The table must be ordered in the OutputSection according to the ascending order of function address.
- The address range of a table entry is [Address of function, Address of next function in table).
- The linker should synthesise table entries for functions that don't have them in order to terminate their address range.
- The linker should synthesise a terminating sentinel entry as some unwinders require it.
- The __exidx_start and __exidx_end symbols, if referenced, must be defined at the start and end of the table respectively.
- A PT_ARM_EXIDX program header is created to describe the location of the contiguous range of .ARM.exidx sections.
- All .ARM.exidx InputSections, including any synthetic sections, can be DISCARDED.
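To illustrate why the ordering, range termination and sentinel requirements matter, here is a minimal sketch of how an unwinder might look up an entry for a given PC. It is not code from this patch. The Entry struct and findEntry() are hypothetical names, and the model stores an absolute function address where the real table stores a 31-bit place-relative offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// EXIDX_CANTUNWIND is the value 1 in the second word of an entry (ARM EHABI).
constexpr uint32_t kCantUnwind = 0x1;

// Hypothetical, simplified entry: a real entry holds a prel31 offset to the
// function rather than an absolute address.
struct Entry {
  uint32_t fnAddr; // start of the address range covered by this entry
  uint32_t unwind; // inline unwind data, .ARM.extab offset or kCantUnwind
};

// Return the entry whose range [fnAddr, next fnAddr) contains pc, or nullptr
// if pc is below the first entry. Requires the table to be sorted by address.
const Entry *findEntry(const std::vector<Entry> &table, uint32_t pc) {
  assert(std::is_sorted(table.begin(), table.end(),
                        [](const Entry &a, const Entry &b) {
                          return a.fnAddr < b.fnAddr;
                        }));
  // First entry that starts strictly after pc; the one before it covers pc.
  auto it = std::upper_bound(
      table.begin(), table.end(), pc,
      [](uint32_t addr, const Entry &e) { return addr < e.fnAddr; });
  if (it == table.begin())
    return nullptr;
  return &*(it - 1);
}
```

Without a synthesised EXIDX_CANTUNWIND entry, a PC inside a function that has no unwind information would resolve to the entry of the preceding function. Without the sentinel, the last entry would appear to cover every address above it.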
Design:
Instead of working backwards from the .ARM.exidx InputSections to the executable sections, work forwards from the executable sections in the output to find the .ARM.exidx sections in their dependentSections. This permits us to synthesize the EXIDX_CANTUNWIND range terminators and the sentinel in the table without having to create a new type of SyntheticSection to represent the linker created sections.
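The forward walk can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in the patch: Section is a hypothetical stand-in for InputSection, and Row, findExidx() and collectRows() are names invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an InputSection; only the fields the sketch needs.
struct Section {
  std::string name;
  uint32_t addr = 0;
  uint32_t size = 0;
  std::vector<const Section *> dependentSections;
};

// One row of the table to be written. A null exidx means the linker must
// synthesise an EXIDX_CANTUNWIND entry for this address.
struct Row {
  uint32_t fnAddr;
  const Section *exidx;
};

// Find the .ARM.exidx section that depends on an executable section, if any.
const Section *findExidx(const Section &exec) {
  for (const Section *dep : exec.dependentSections)
    if (dep->name.rfind(".ARM.exidx", 0) == 0) // name starts with .ARM.exidx
      return dep;
  return nullptr;
}

// Walk the executable sections in output order and build the rows, then
// append the terminating sentinel just past the last executable section.
std::vector<Row> collectRows(const std::vector<const Section *> &execSections) {
  std::vector<Row> rows;
  for (const Section *s : execSections) {
    // Output order is assumed to be ascending address order.
    assert(rows.empty() || rows.back().fnAddr <= s->addr);
    rows.push_back({s->addr, findExidx(*s)});
  }
  if (!execSections.empty()) {
    const Section *last = execSections.back();
    rows.push_back({last->addr + last->size, nullptr});
  }
  return rows;
}
```

Because the terminators and the sentinel are just rows with no backing section, nothing new has to be added to the input section list to represent them.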
Changes:
- Writer.cpp: Instead of creating an ARMExidxSentinelSection we create an ARMExidxSyntheticSection that marks all the .ARM.exidx InputSections as not live.
- Writer.cpp: At SHF_LINK_ORDER processing time we call the finalizeContent() member function of ARMExidxSyntheticSection to perform the ordering and table compression.
- SyntheticSections.cpp: The writeTo() member function of the ARMExidxSyntheticSection needs to derive the table contents from the Executable InputSections.
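As a rough illustration of the table compression step, the sketch below drops an entry when it would unwind identically to the entry before it. The merge rule shown is an assumption for the example (adjacent entries merge only when both are EXIDX_CANTUNWIND or both carry the same inline unwind data), and Entry, mergeable() and compress() are hypothetical names rather than functions in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// EXIDX_CANTUNWIND is the value 1 in the second word of an entry (ARM EHABI).
constexpr uint32_t kCantUnwind = 0x1;

// Hypothetical, simplified entry using an absolute function address.
struct Entry {
  uint32_t fnAddr;
  uint32_t unwind;
};

// Assumed rule: an entry can be folded into its predecessor only if its
// unwind word is EXIDX_CANTUNWIND or inline data (bit 31 set) and is equal to
// the predecessor's. Words with bit 31 clear are place-relative offsets into
// .ARM.extab, so equal values do not imply equal unwind data.
bool mergeable(const Entry &prev, const Entry &cur) {
  bool comparable =
      cur.unwind == kCantUnwind || (cur.unwind & 0x80000000u) != 0;
  return comparable && prev.unwind == cur.unwind;
}

// Remove entries that are redundant given the entry before them. The input
// must already be sorted by ascending function address.
std::vector<Entry> compress(const std::vector<Entry> &sorted) {
  std::vector<Entry> out;
  for (const Entry &e : sorted) {
    assert(out.empty() || out.back().fnAddr <= e.fnAddr);
    if (out.empty() || !mergeable(out.back(), e))
      out.push_back(e);
  }
  return out;
}
```

Dropping a duplicate is safe because the surviving predecessor's range simply extends to the start of the next remaining entry. In this sketch the terminating sentinel would be appended after compression so that it is never folded away.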
Advantages:
- The vast majority of the ARM specific code is in ARMExidxSyntheticSection.
- We don't need to create multiple range terminating sentinel sections.
Disadvantages:
- To avoid duplicating a lot of the relocation code, writeTo() takes advantage of the .ARM.exidx table structure. Modifying it may need more ARM-specific knowledge.
I've kept D58047 open as it may be preferable to keep the old design rather than go down this path.