This is an archive of the discontinued LLVM Phabricator instance.

[COFF] Keep temporary symbols in object files on ARM64, as IMAGE_SYM_CLASS_LABEL
Abandoned · Public

Authored by mstorsjo on Nov 14 2021, 2:41 PM.

Details

Summary

On ARM64, many symbol references use an adrp+add/ldr pair. When the
references are against a temporary symbol, the references are
normally rewritten as a reference against the start of the section,
plus an immediate offset.

In the relocation for an adrp instruction, the offset from the
referenced symbol is stored in the instruction's immediate field,
which is 21 bits wide. Thus, an adrp relocation against a temporary
symbol ends up out of range if the temporary symbol lies more than
1 MB into its section.
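
The range limit described above can be illustrated with a small sketch. It assumes the 21-bit immediate is interpreted as a signed byte offset, which is consistent with the 1 MB figure in the summary; the function and constant names are made up for illustration and are not LLVM APIs.

```python
# Illustrative sketch only: these names are hypothetical, not LLVM APIs.
ADRP_IMM_BITS = 21  # width of the adrp immediate field, per the summary


def fits_adrp_addend(offset: int, bits: int = ADRP_IMM_BITS) -> bool:
    """Return True if `offset` fits in a signed immediate of `bits` bits.

    Assumption: the immediate is signed, so the usable range is
    [-2**(bits-1), 2**(bits-1) - 1], i.e. just under 1 MB forwards.
    """
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest <= offset <= highest


ONE_MB = 1 << 20

# A temporary symbol just under 1 MB into the section is still encodable
# as "section start + offset"; one at exactly 1 MB is not.
print(fits_adrp_addend(ONE_MB - 1))
print(fits_adrp_addend(ONE_MB))
```

If the temporary symbol itself is kept in the symbol table, the stored offset is relative to that symbol and is typically zero or very small, so the check passes regardless of how large the section is.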

This addresses the issue reported at
https://bugs.llvm.org/show_bug.cgi?id=52378. It can also be worked
around by using the options -ffunction-sections and -fdata-sections.

By keeping the temporary symbols in the object files (e.g. as the symbol
type IMAGE_SYM_CLASS_LABEL), the relocations only need to store the offset
from this symbol instead of the start of the section.

This roughly matches what MSVC does, where the produced object files
contain lots of temporary label symbols named like $LN<n>.

This does inflate the size of the intermediate object files somewhat,
but makes these relocations much more robust.

Event Timeline

mstorsjo created this revision. Nov 14 2021, 2:41 PM
mstorsjo requested review of this revision. Nov 14 2021, 2:41 PM
Herald added a project: Restricted Project. Nov 14 2021, 2:41 PM
rnk added a comment. Nov 15 2021, 10:52 AM

This roughly matches what MSVC does, where the produced object files
contain lots of temporary label symbols named like $LN<n>.

I kind of always thought it was sloppy to emit all these extra compiler-generated symbols, FWIW.

This does inflate the size of the intermediate object files somewhat,
but makes these relocations much more robust.

That is somewhat concerning. LLVM uses a lot of labels internally; I think we should be careful about this change.

Is it possible to relativize these relocations against the most recent previous external symbol instead? Given that the code compiles with function/data sections, that seems like it would work.

The alternative here is that instead of relying on the symbols defined in the assembler source, we could automatically define a symbol every megabyte. If we expect the linker is throwing away the symbols anyway, though, the only effect on compiler-generated code is the number of symbols in the intermediate object, which isn't a big deal, I guess. We typically generate a lot of code labels, though. Do you have a number for how much we're inflating a typical object file?

Is it possible to relativize these relocations against the most recent previous external symbol instead? Given that the code compiles with function/data sections, that seems like it would work.

There isn't any guarantee there's an external symbol in range, in general.

The alternative here is that instead of relying on the symbols defined in the assembler source, we could automatically define a symbol every megabyte. If we expect the linker is throwing away the symbols anyway, though, the only effect on compiler-generated code is the number of symbols in the intermediate object, which isn't a big deal, I guess. We typically generate a lot of code labels, though. Do you have a number for how much we're inflating a typical object file?

I guess it's hard to define what a typical object file is, but a sample of 5 large-ish and normal-sized object files grew by 13%, 13%, 10%, 5% and 1%. So clearly notable and not ideal, but not totally out of the question either.

Is it possible to relativize these relocations against the most recent previous external symbol instead? Given that the code compiles with function/data sections, that seems like it would work.

There isn't any guarantee there's an external symbol in range, in general.

Yeah. I haven't studied the sequence ordering of what's fixed at what point in WinCOFFObjectWriter, but I'm wondering if the layout is known and fixed at the point when deciding whether to define a symbol or not - I'm afraid it isn't. If it was, we could choose to keep temporary symbols only when the previous external symbol is too far away.

The idea of producing extra symbols at 1 MB intervals sounds neat, but I'm wondering if it's possible to do that at the same time as choosing which symbols to define. Or maybe we can still add more symbols after fixing the layout? Then that would work.

But then, when doing the loop for fixing relocations, would that increase the complexity of that loop from O(n) to O(n^2), or at least O(n log n), since we would have to search for the best symbol to use as a reference point for each of them? Then again, if it only happens when the default choice is too far away, I guess it would be rare?

It shouldn't be a problem to define additional symbols after layout. "layout" only cares about the contents of the sections.

But then, when doing the loop for fixing relocations, would that increase the complexity of that loop from O(n) to O(n^2), or at least O(n log n), since we would have to search for the best symbol to use as a reference point for each of them?

You'd need to binary search or something to find the closest symbol.
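
As a rough illustration of the binary-search approach, the sketch below finds the nearest symbol at or before a target offset in O(log n) using Python's standard `bisect` module. The symbol offsets are made-up sample data, and the helper name is hypothetical; this is not how WinCOFFObjectWriter is structured.

```python
from bisect import bisect_right


def closest_preceding_symbol(symbol_offsets, target):
    """Return the index of the last symbol at or before `target`.

    `symbol_offsets` must be sorted in ascending order and hold the
    offsets of the symbols defined in one section. Returns None if
    every symbol lies after `target`.
    """
    insert_at = bisect_right(symbol_offsets, target)
    return insert_at - 1 if insert_at > 0 else None


# Hypothetical symbol offsets within one large section.
offsets = [0x0, 0x40, 0x180000, 0x2A0000]

target = 0x200000
index = closest_preceding_symbol(offsets, target)
addend = target - offsets[index]

# The relocation would reference symbol `index` with this smaller addend
# instead of using the section start with the full offset.
print(index, hex(addend))
```

Doing this lookup once per relocation gives O(n log n) overall for n relocations, which matches the estimate in the question being answered.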

If we have a symbol exactly every 1MB, and emit relocations relative to those symbols, we can do a constant-time mapping from offset to symbol. (This might not find the closest symbol if the section has other symbols.)
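
The constant-time mapping can be sketched as follows, assuming one helper symbol is defined at every 1 MB boundary of a section, starting at offset 0. The names are hypothetical and the sketch does not claim to mirror the eventual implementation.

```python
INTERVAL = 1 << 20  # assumed spacing: one helper symbol per 1 MB


def interval_symbol_for(offset: int):
    """Map a section offset to (helper symbol index, remaining addend).

    Assumption: helper symbol number k sits at offset k * INTERVAL, so
    the index is a shift and the addend is a mask. No search is needed,
    and the addend is always smaller than INTERVAL.
    """
    index = offset >> 20
    addend = offset & (INTERVAL - 1)
    return index, addend


# A target about 2.6 MB into the section maps to helper symbol 2
# with an addend that stays below 1 MB.
index, addend = interval_symbol_for(0x2A0010)
print(index, hex(addend))
```

As the comment notes, this picks the helper symbol for the enclosing 1 MB window rather than the nearest symbol overall, trading a possibly larger addend for an O(1) lookup.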

Ok, this sounds doable. I'll try to give it a shot.

This was pretty straightforward, see D114340.

mstorsjo abandoned this revision. Nov 26 2021, 3:47 AM

This one was superseded by D114340.