This is an archive of the discontinued LLVM Phabricator instance.

[MC][COFF][AArch64] Avoid incorrect IMAGE_REL_ARM64_BRANCH26 relocations.
ClosedPublic

Authored by hjyamauchi on Jul 19 2023, 10:59 AM.

Details

Summary

For a b/bl instruction that branches a temporary symbol (private assembly label), an IMAGE_REL_ARM64_BRANCH26 relocation to another symbol + a non-zero offset could be emitted but the linkers don't support this type of relocation and cause incorrect relocations and crashes. Avoid emitting this type of relocations.

Diff Detail

Event Timeline

hjyamauchi created this revision.Jul 19 2023, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2023, 10:59 AM
hjyamauchi requested review of this revision.Jul 19 2023, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2023, 10:59 AM

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format indicates that there does exist such a relocation:

IMAGE_REL_ARM64_BRANCH26 0x0003 The 26-bit relative displacement to the target, for B and BL instructions.

If this relocation is unsupported, we should just disable that relocation from being formed or we should figure out what the linker expectations are.

compnerd added a subscriber: tentzen.

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format indicates that there does exist such a relocation:

IMAGE_REL_ARM64_BRANCH26 0x0003 The 26-bit relative displacement to the target, for B and BL instructions.

If this relocation is unsupported, we should just disable that relocation from being formed or we should figure out what the linker expectations are.

Yeah. To be clearer, what I observed was that the relocation type works as long as the offset that's embedded in the instruction is zero (I suspect that the linker doesn't use the offset in relocation or assumes it's zero.) I also tried a small local patch in lld to take this non-zero offset into account in the relocation and it also avoided the crashes I'm seeing.

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format indicates that there does exist such a relocation:

IMAGE_REL_ARM64_BRANCH26 0x0003 The 26-bit relative displacement to the target, for B and BL instructions.

If this relocation is unsupported, we should just disable that relocation from being formed or we should figure out what the linker expectations are.

Yeah. To be clearer, what I observed was that the relocation type works as long as the offset that's embedded in the instruction is zero (I suspect that the linker doesn't use the offset in relocation or assumes it's zero.) I also tried a small local patch in lld to take this non-zero offset into account in the relocation and it also avoided the crashes I'm seeing.

That sounds like the correct thing to do; if a relocation with an offset isn’t supported by lld (but it is by link.exe), we should fix the linker instead of trying to avoid generating it.

I had expected those cases to be working correctly though, for many years now, but perhaps there is some case where the original offset indeed happens to get ignored.

That sounds like the correct thing to do; if a relocation with an offset isn’t supported by lld (but it is by link.exe), we should fix the linker instead of trying to avoid generating it.

The description sounds like it's also not supported by link.exe? Not sure.


We should teach adjustFixupValue() in AArch64AsmBackend.cpp to print an error if an unresolved branch with an offset gets there somehow.

There's a bunch of code for A.isTemporary() a few lines earlier; would it make sense to integrate with that? This is sort of similar to what we do with OffsetSymbols, except the maximum offset is zero.

You can't create symbols at arbitrary offsets from the start of a section; you need to check that the symbol actually points somewhere within the section.

llvm/lib/MC/WinCOFFObjectWriter.cpp
963

We shouldn't end up in recordRelocation for branches where the source and destination are in the same section, I think.

That sounds like the correct thing to do; if a relocation with an offset isn’t supported by lld (but it is by link.exe), we should fix the linker instead of trying to avoid generating it.

The description sounds like it's also not supported by link.exe? Not sure.

Yes. As far as I can tell, link.exe doesn't support this relocation type with an offset and that matches what lld does.

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format indicates that there does exist such a relocation:

IMAGE_REL_ARM64_BRANCH26 0x0003 The 26-bit relative displacement to the target, for B and BL instructions.

If this relocation is unsupported, we should just disable that relocation from being formed or we should figure out what the linker expectations are.

Yeah. To be clearer, what I observed was that the relocation type works as long as the offset that's embedded in the instruction is zero (I suspect that the linker doesn't use the offset in relocation or assumes it's zero.) I also tried a small local patch in lld to take this non-zero offset into account in the relocation and it also avoided the crashes I'm seeing.

That sounds like the correct thing to do; if a relocation with an offset isn’t supported by lld (but it is by link.exe), we should fix the linker instead of trying to avoid generating it.

I would agree, but it seems that link also does not support the relocation with addend, and I would rather that we retain compatibility with the reference implementation. It sounds like it is only the addend form that we would need to avoid on Windows, not the general relocation.

I had expected those cases to be working correctly though, for many years now, but perhaps there is some case where the original offset indeed happens to get ignored.

MaskRay added inline comments.
llvm/lib/MC/WinCOFFObjectWriter.cpp
963

Ideally we should reuse the temporary symbol, not creating a new one.

The ELF approach is to omit temporary symbols only when unused by relocations (see ELFWriter::isInSymtab).

COFF D14975 may give some insights as well.

https://learn.microsoft.com/en-us/windows/win32/debug/pe-format indicates that there does exist such a relocation:

IMAGE_REL_ARM64_BRANCH26 0x0003 The 26-bit relative displacement to the target, for B and BL instructions.

If this relocation is unsupported, we should just disable that relocation from being formed or we should figure out what the linker expectations are.

Yeah. To be clearer, what I observed was that the relocation type works as long as the offset that's embedded in the instruction is zero (I suspect that the linker doesn't use the offset in relocation or assumes it's zero.) I also tried a small local patch in lld to take this non-zero offset into account in the relocation and it also avoided the crashes I'm seeing.

That sounds like the correct thing to do; if a relocation with an offset isn’t supported by lld (but it is by link.exe), we should fix the linker instead of trying to avoid generating it.

I would agree, but it seems that link also does not support the relocation with addend, and I would rather that we retain compatibility with the reference implementation. It sounds like it is only the addend form that we would need to avoid on Windows, not the general relocation.

I had expected those cases to be working correctly though, for many years now, but perhaps there is some case where the original offset indeed happens to get ignored.


The attach text file contains a test (*) and the command line trace that shows the relocation behaviors (without this patch) in the first half, and the same sequence of commands with this patch in the second half.

The first part shows that both link.exe and lld relocate the branches to the beginning of the text and the other section, as opposed to the ret instructions, respectively, because the offsets embedded in the instruction are ignored. The second part shows that, with this patch, the linkers relocate them as expected.

(*) A tweaked version of the test in this patch (because some important details were unexpectedly lost along the test reduction and nops were added to avoid accidental offset correctness as small offsets from a symbol can happen to be correct).

We should teach adjustFixupValue() in AArch64AsmBackend.cpp to print an error if an unresolved branch with an offset gets there somehow.

If you mean like the following, that would unintentionally trigger when we drop the relocation in the newly-added code (in the if (MCSec == &A.getSection()) case) because IsResolved is false.

case AArch64::fixup_aarch64_pcrel_branch26:
case AArch64::fixup_aarch64_pcrel_call26:
  if (TheTriple.isOSBinFormatCOFF() && !IsResolved && SignedValue != 0) {
    // MSVC link.exe and lld do not support this relocation type with a non-zero offset
    Ctx.reportError(Fixup.getLoc(), "fixup value not supported by the linkers");
  }

There's a bunch of code for A.isTemporary() a few lines earlier; would it make sense to integrate with that? This is sort of similar to what we do with OffsetSymbols, except the maximum offset is zero.

That may make sense as long as it's okay to check if the relocation type is fixup_aarch64_pcrel_branch26 or fixup_aarch64_pcrel_call26 before the getRelocType() call.

You can't create symbols at arbitrary offsets from the start of a section; you need to check that the symbol actually points somewhere within the section.

Can you clarify? Does it mean that we need to check A.isInSection() && Layout.getSymbolOffset(A) < getSectionAddressSize(&A.getSection()) before we create a new symbol at where symbol A is? What should happen if that's not the case?

I had expected those cases to be working correctly though, for many years now, but perhaps there is some case where the original offset indeed happens to get ignored.


The attach text file contains a test (*) and the command line trace that shows the relocation behaviors (without this patch) in the first half, and the same sequence of commands with this patch in the second half.

The first part shows that both link.exe and lld relocate the branches to the beginning of the text and the other section, as opposed to the ret instructions, respectively, because the offsets embedded in the instruction are ignored. The second part shows that, with this patch, the linkers relocate them as expected.

(*) A tweaked version of the test in this patch (because some important details were unexpectedly lost along the test reduction and nops were added to avoid accidental offset correctness as small offsets from a symbol can happen to be correct).

Thanks for the test transcript! It does indeed look like link.exe ignores the existing offset. I also independently tested this, and it actually looks like link.exe blindly ors the target offset bits on top of the existing opcode, just like lld does: https://github.com/llvm/llvm-project/blob/llvmorg-16.0.6/lld/COFF/Chunks.cpp#L306-L322 Depending on the contents of the original offset and the intended symbol difference, the offset may end up being partially visible or not. In any case - the linker doesn't indeed seem to be able to handle immediate offsets here.

Strictly speaking I would consider that a bug in both linkers, but I totally agree we should avoid producing such relocations. I didn't have time to look at the implementation strategy itself yet (but trust others' judgement on it).

Thanks for the test transcript! It does indeed look like link.exe ignores the existing offset. I also independently tested this, and it actually looks like link.exe blindly ors the target offset bits on top of the existing opcode, just like lld does: https://github.com/llvm/llvm-project/blob/llvmorg-16.0.6/lld/COFF/Chunks.cpp#L306-L322 Depending on the contents of the original offset and the intended symbol difference, the offset may end up being partially visible or not. In any case - the linker doesn't indeed seem to be able to handle immediate offsets here.

My recollection is that for most relocations, link does not support (in GNU parlance) partial in-place relocations, that is, any addend is generally (though not always) ignored. It was a pretty big divergence between GNU and MSVC semantics.

Strictly speaking I would consider that a bug in both linkers, but I totally agree we should avoid producing such relocations. I didn't have time to look at the implementation strategy itself yet (but trust others' judgement on it).

I'm not sure I consider it a bug since that is the normal handling of relocations, but certainly a lacking useful feature. At least, it sounds like everyone is now in agreement on avoiding producing the relocation for Windows then.

Thanks for the test transcript! It does indeed look like link.exe ignores the existing offset. I also independently tested this, and it actually looks like link.exe blindly ors the target offset bits on top of the existing opcode, just like lld does: https://github.com/llvm/llvm-project/blob/llvmorg-16.0.6/lld/COFF/Chunks.cpp#L306-L322 Depending on the contents of the original offset and the intended symbol difference, the offset may end up being partially visible or not. In any case - the linker doesn't indeed seem to be able to handle immediate offsets here.

The lld code linked here is exactly what I locally modified to experiment and saw the crash go away with.

My recollection is that for most relocations, link does not support (in GNU parlance) partial in-place relocations, that is, any addend is generally (though not always) ignored. It was a pretty big divergence between GNU and MSVC semantics.

Strictly speaking I would consider that a bug in both linkers, but I totally agree we should avoid producing such relocations. I didn't have time to look at the implementation strategy itself yet (but trust others' judgement on it).

I'm not sure I consider it a bug since that is the normal handling of relocations, but certainly a lacking useful feature. At least, it sounds like everyone is now in agreement on avoiding producing the relocation for Windows then.

Yeah, as I implied by creating this revision, I also vote for avoiding this relocation type for compatibility.

We should teach adjustFixupValue() in AArch64AsmBackend.cpp to print an error if an unresolved branch with an offset gets there somehow.

If you mean like the following, that would unintentionally trigger when we drop the relocation in the newly-added code (in the if (MCSec == &A.getSection()) case) because IsResolved is false.

case AArch64::fixup_aarch64_pcrel_branch26:
case AArch64::fixup_aarch64_pcrel_call26:
  if (TheTriple.isOSBinFormatCOFF() && !IsResolved && SignedValue != 0) {
    // MSVC link.exe and lld do not support this relocation type with a non-zero offset
    Ctx.reportError(Fixup.getLoc(), "fixup value not supported by the linkers");
  }

I still don't understand why the MCSec == &A.getSection() case is relevant; if the offset is constant, it should be resolved, since "resolved" means "doesn't need a relocation".

You can't create symbols at arbitrary offsets from the start of a section; you need to check that the symbol actually points somewhere within the section.

Can you clarify? Does it mean that we need to check A.isInSection() && Layout.getSymbolOffset(A) < getSectionAddressSize(&A.getSection()) before we create a new symbol at where symbol A is? What should happen if that's not the case?

Something like that, yes. If it fails, we need to print an error, because we don't have any way to encode it. At least, I don't think we have a way to encode it? Not sure off the top of my head what offsets the linker would tolerate.

Updated the revision with the better test and changes per review comments.

There's a bunch of code for A.isTemporary() a few lines earlier; would it make sense to integrate with that? This is sort of similar to what we do with OffsetSymbols, except the maximum offset is zero.

That may make sense as long as it's okay to check if the relocation type is fixup_aarch64_pcrel_branch26 or fixup_aarch64_pcrel_call26 before the getRelocType() call.

Done.

hjyamauchi marked 2 inline comments as done.Jul 22 2023, 12:18 PM
hjyamauchi added inline comments.
llvm/lib/MC/WinCOFFObjectWriter.cpp
963

Done. Reused the temporary symbol.

963

(Please see the other comment discussion)

hjyamauchi marked an inline comment as done.Jul 22 2023, 12:20 PM

We should teach adjustFixupValue() in AArch64AsmBackend.cpp to print an error if an unresolved branch with an offset gets there somehow.

If you mean like the following, that would unintentionally trigger when we drop the relocation in the newly-added code (in the if (MCSec == &A.getSection()) case) because IsResolved is false.

case AArch64::fixup_aarch64_pcrel_branch26:
case AArch64::fixup_aarch64_pcrel_call26:
  if (TheTriple.isOSBinFormatCOFF() && !IsResolved && SignedValue != 0) {
    // MSVC link.exe and lld do not support this relocation type with a non-zero offset
    Ctx.reportError(Fixup.getLoc(), "fixup value not supported by the linkers");
  }

I still don't understand why the MCSec == &A.getSection() case is relevant; if the offset is constant, it should be resolved, since "resolved" means "doesn't need a relocation".

Good point. Removed the MCSec == &A.getSection() case. There seem to be edges cases where relocations are kept even if the offset is constant such as a) relocations between functions are kept for the MSVC linker features as in WinCOFFObjectWriter::isSymbolRefDifferenceFullyResolvedImpl and b) the ADRP instruction / fixup_aarch64_pcrel_adrp_imm21 type as in AArch64AsmBackend::shouldForceRelocation. The crash case and the test we have here is due to a). So we cannot drop relocations here. Added the suggested check in adjustFixupValue.

Can you clarify? Does it mean that we need to check A.isInSection() && Layout.getSymbolOffset(A) < getSectionAddressSize(&A.getSection()) before we create a new symbol at where symbol A is? What should happen if that's not the case?

Something like that, yes. If it fails, we need to print an error, because we don't have any way to encode it. At least, I don't think we have a way to encode it? Not sure off the top of my head what offsets the linker would tolerate.

Done.

Please take another look.

efriedma added inline comments.Jul 23 2023, 9:20 PM
llvm/lib/MC/WinCOFFObjectWriter.cpp
953

Asm.getContext().reportError(), like code earlier in this function.

962

I don't understand why it's safe to throw away the "FixedValue" here; there could be an explicit offset.

hjyamauchi marked 2 inline comments as done.

Update to address comments.

Please take another look. @efriedma

llvm/lib/MC/WinCOFFObjectWriter.cpp
953

Done.

962

Done.

Thinking about this a bit further, I think that it might be better to conform to the Microsoft behaviour here. If we look at what MASM does, there is no concept of an "assembler temporary" hidden symbols. We should instead always emit all symbols into the symbol table. The linker is responsible for filtering the symbols that are emitted into the symbol table (only symbols marked for DLL export should be emitted there). This is one of the fundamental differences between PE and COFF. COFF will record all symbols (e.g. consider what happens when you build with debug information, all the temporary symbols for lines, etc. are emitted into the object file), PE will emit only the public symbols (technically, it does not have a symbol table at all, only the content needed for the EAT). This would bring us more in line with the expectations of the platform and the object file format and also avoid the need for any special casing here - we would form the relocation as per usual, the linker will have the necessary context, and then we would simply drop the "temporary" symbol which is not marked for DLL Export.

compnerd added a subscriber: rnk.Aug 3 2023, 3:24 PM
compnerd added a subscriber: aganea.Aug 5 2023, 6:19 PM

Thinking about this a bit further, I think that it might be better to conform to the Microsoft behaviour here. If we look at what MASM does, there is no concept of an "assembler temporary" hidden symbols. We should instead always emit all symbols into the symbol table.

I don't think that we can directly compare what we do to masm; we support an assembler dialect which is not masm, and that includes branching where the destination is an offset from a symbol. Given that, I think we need this logic in WinCOFFObjectWriter, whether or not we emit all temporary symbols. (The change in AArch64AsmBackend.cpp is necessary either way.)

Also, emitting all temporary symbols significantly bloats object files. (I don't have numbers at hand, but I recall it being significant.)

llvm/lib/MC/WinCOFFObjectWriter.cpp
962

This still looks wrong in the case where SymbolMap[&A] is non-null; even if the initial symbol lookup succeeds, we still need to preserve the offset.

I don't think that we can directly compare what we do to masm; we support an assembler dialect which is not masm, and that includes branching where the destination is an offset from a symbol. Given that, I think we need this logic in WinCOFFObjectWriter, whether or not we emit all temporary symbols. (The change in AArch64AsmBackend.cpp is necessary either way.)

I guess the piece that I'm wondering is why we would need that if the local symbol is available because the label is preserved, the linker should be able to resolve the symbol relative offset and then discard the temporary symbol I thought. What condition am I overlooking?

Also, emitting all temporary symbols significantly bloats object files. (I don't have numbers at hand, but I recall it being significant.)

Oh, I completely agree with you on that point. I wouldn't be surprised at all at it being significant. I think that matching semantics is something we should do any way.

I don't think that we can directly compare what we do to masm; we support an assembler dialect which is not masm, and that includes branching where the destination is an offset from a symbol. Given that, I think we need this logic in WinCOFFObjectWriter, whether or not we emit all temporary symbols. (The change in AArch64AsmBackend.cpp is necessary either way.)

I guess the piece that I'm wondering is why we would need that if the local symbol is available because the label is preserved, the linker should be able to resolve the symbol relative offset and then discard the temporary symbol I thought. What condition am I overlooking?

I was thinking of the case of branching to "x+4", or something like that, where there isn't any existing symbol at the offset in question. We could forbid that, I guess, but it seems like that just shifts complexity elsewhere.

compnerd added a comment.EditedAug 8 2023, 1:57 PM

I don't think that we can directly compare what we do to masm; we support an assembler dialect which is not masm, and that includes branching where the destination is an offset from a symbol. Given that, I think we need this logic in WinCOFFObjectWriter, whether or not we emit all temporary symbols. (The change in AArch64AsmBackend.cpp is necessary either way.)

I guess the piece that I'm wondering is why we would need that if the local symbol is available because the label is preserved, the linker should be able to resolve the symbol relative offset and then discard the temporary symbol I thought. What condition am I overlooking?

I was thinking of the case of branching to "x+4", or something like that, where there isn't any existing symbol at the offset in question. We could forbid that, I guess, but it seems like that just shifts complexity elsewhere.

Right, I was thinking that we would have a(n extra) label at that point so the offset is always 0. We would generate a synthetic symbol for any case where the user is writing out an offset from a symbol.

e.g.

  .def sym
    .scl 2
    .type 32
  .endef
sym:
  nop
  nop
$LN$reloc:
  .long $LN$reloc
  ret

would be equivalent to:

  .def sym
    .scl 2
    .type 32
  .endef
sym:
  nop
  nop
  .long sym + 4
  ret

Edit: oh, I see what you mean - we do not know what the target was lowered as and we might be external to the symbol, which means that we cannot guarantee the label at that point ...

rnk added a comment.Aug 9 2023, 2:40 PM

I agree that we can't emit all labels with the .L prefix into the object file symbol table. That would be prohibitively expensive.

We can, however, probably treat all private linkage symbols as internal if we don't already, so we get symbols for them in the symbol table, if this issue is still reachable from the IR level. We can probably also "upgrade" private symbols that the assembler reads to add them to the symbol table if they are used with this kind of fixup. Either way, I think we need the error handling added in the existing patch.

llvm/lib/MC/WinCOFFObjectWriter.cpp
953

Please add a test exercising this error case as well.

llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
320

Please add a test exercising the error case, presumably using something like the b .Lsym + 4 pattern.

Also, I think you can make the diagnostic message clearer. Something like, "cannot perform a PC-relative fixup with a non-zero symbol offset" (emphasize that the offset from a symbol must be zero).

hjyamauchi updated this revision to Diff 550086.EditedAug 14 2023, 2:19 PM

Updated to treat all private linkage symbols as internal and add them to the symbol table, as Reid suggested. This fixes the issue with private linkage and the swift issue.

This also fixes the symbol class for private linkage for Windows/AArch64 from external to static to be in align with X86, and adds a test exercising the error case for the check in AArch64AsmBackend.

Please take another look.

I think that I generally prefer this approach. I would however like to see a test for the storage class changes (and ideally split from this change - it fixes a problem needed to fix here, but is a completely standalone improvement).

MaskRay added inline comments.Aug 15 2023, 7:24 PM
llvm/test/MC/AArch64/coff-relocations-branch26-bad.s
1 ↗(On Diff #550086)

In many occasions placing negative tests in the same file as the positive tests improves readability. You can use the --defsym ERR=1 pattern

18 ↗(On Diff #550086)

New tests are recommended to add line/column information to diagnostics, similar to test/MC/ELF/cfi-scope-errors.s

I think that I generally prefer this approach. I would however like to see a test for the storage class changes (and ideally split from this change - it fixes a problem needed to fix here, but is a completely standalone improvement).

Split the storage class change into https://reviews.llvm.org/D158122

rnk accepted this revision.Aug 16 2023, 4:23 PM

lgtm, thanks!

This revision is now accepted and ready to land.Aug 16 2023, 4:23 PM

Rebased over https://reviews.llvm.org/D158122. Addressed the comments on the test code.

llvm/test/MC/AArch64/coff-relocations-branch26-bad.s
1 ↗(On Diff #550086)

Done.

18 ↗(On Diff #550086)

Done.

This revision was landed with ongoing or failed builds.Aug 17 2023, 2:40 PM
This revision was automatically updated to reflect the committed changes.