This is an archive of the discontinued LLVM Phabricator instance.

[Win64EH] Write .pdata symbol relocations relative to the temporary begin symbol
ClosedPublic

Authored by mstorsjo on Sep 11 2021, 12:28 PM.

Details

Summary

Previously, the relocations pointed at the public, user-facing,
possibly external symbol.

When the function itself is weak, that symbol may be overridden at
link time, pointing at another strong implementation of the same
function instead. In that case, there are two conflicting pdata entries
pointing at the same address, and the wrong unwind info might end up
used.
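
The failure mode described above can be illustrated with a toy model of link-time symbol resolution. This is only a sketch under stated assumptions, not how any real linker or LLVM itself is implemented: the `Definition` record, the `resolve` and `pdata_begin` helpers, the object file names and the offsets are all hypothetical. It shows that a .pdata begin address expressed against the weak public symbol follows the strong override, while one expressed against the section plus an offset stays with the code it describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    obj: str       # hypothetical object file providing the definition
    section: str   # section that holds the function body
    offset: int    # offset of the function within that section
    weak: bool     # True if the definition may be overridden


def resolve(name, definitions):
    """Pick the definition a linker would bind `name` to: strong beats weak."""
    candidates = definitions[name]
    strong = [d for d in candidates if not d.weak]
    return strong[0] if strong else candidates[0]


def pdata_begin(obj, target, definitions):
    """Resolve the begin address of a .pdata entry emitted by `obj`.

    `target` is either ("symbol", name) or ("section", section, offset).
    """
    if target[0] == "symbol":
        d = resolve(target[1], definitions)
        return (d.obj, d.section, d.offset)
    _, section, offset = target
    return (obj, section, offset)


# Two objects define the same function; the library's copy is weak.
definitions = {
    "operator_new": [
        Definition("libcxx.o", ".text", 0x40, weak=True),
        Definition("user.o", ".text", 0x10, weak=False),
    ]
}

# Relocation against the public symbol: follows the strong override.
old = pdata_begin("libcxx.o", ("symbol", "operator_new"), definitions)
# Relocation against the section: stays with the library's own code.
new = pdata_begin("libcxx.o", ("section", ".text", 0x40), definitions)
# The overriding object's own .pdata entry.
user = pdata_begin("user.o", ("symbol", "operator_new"), definitions)
```

In this model, `old` and `user` end up describing the same address, which is the conflict, while `new` still refers to the weak copy's own code.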

Both GCC/binutils and MSVC produce pdata pointing at internal static
symbols. (GCC/binutils point at the .text section just as LLVM does
after this change, MSVC points at special label type symbols with the
type IMAGE_SYM_CLASS_LABEL and names like '$LN4'.)

This fixes unwinding through an overridden "operator new" with a
statically linked C++ library in MinGW mode. (Building libc++ with
-ffunction-sections and linking with --gc-sections might avoid the
issue too.)

This makes the produced object files a little less user-friendly
to debug, but with the other llvm-readobj patches, the unwind info
debugging experience should be pretty much the same.

Alternatively, we could choose to only do this if the function is
marked as weak - producing less consistent output but more
straightforward object files in most cases.

Event Timeline

mstorsjo created this revision. Sep 11 2021, 12:28 PM
mstorsjo requested review of this revision. Sep 11 2021, 12:28 PM
Herald added a project: Restricted Project. Sep 11 2021, 12:28 PM
efriedma accepted this revision. Sep 13 2021, 11:09 AM

Makes sense. LGTM

llvm/test/MC/COFF/seh.s
155

This is just to make the CHECK lines easier to read, I assume?

This revision is now accepted and ready to land. Sep 13 2021, 11:09 AM
mstorsjo added inline comments. Sep 13 2021, 11:39 AM
llvm/test/MC/COFF/seh.s
155

Yes. Currently the CHECK lines verify that the StartAddress and EndAddress point at the same symbol. Without the nop, llvm-readobj would resolve the EndAddress of func to smallFunc instead of func + offset, so I would otherwise need to change those CHECK lines.

I guess the alternative would be to add even more logic to the symbol resolving in llvm-readobj (see D109649), so that for the EndAddress case it does not resolve symbol + nonzero offset into a different symbol with a zero offset.
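
To make the alternative concrete, here is a rough sketch of that kind of resolving rule. It is not the code from D109649 or from llvm-readobj; the `resolve_address` helper, its `is_end` flag and the addresses are hypothetical. The idea is that an end address that lands exactly on the start of the next symbol is reported relative to the preceding symbol instead.

```python
import bisect


def resolve_address(symbols, address, is_end=False):
    """Map an address to (symbol name, offset from that symbol).

    `symbols` is a list of (address, name) pairs sorted by address.
    With is_end=True, an exact hit on a symbol start is treated as
    "one past the end of the previous function", so the previous
    symbol is reported with a nonzero offset.
    """
    addrs = [a for a, _ in symbols]
    # Index of the last symbol starting at or before `address`.
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        raise ValueError("address precedes all symbols")
    if is_end and i > 0 and addrs[i] == address:
        i -= 1
    base, name = symbols[i]
    return name, address - base


# Hypothetical layout: smallFunc starts right where func ends.
symbols = [(0x00, "func"), (0x20, "smallFunc")]

start = resolve_address(symbols, 0x00)
naive_end = resolve_address(symbols, 0x20)
end = resolve_address(symbols, 0x20, is_end=True)
```

With the plain lookup, the end of func is attributed to smallFunc at offset 0; with `is_end=True` it is reported as func plus 0x20, so the start and end of the entry name the same symbol.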

mstorsjo added inline comments. Sep 13 2021, 12:18 PM
llvm/test/MC/COFF/seh.s
155

I amended D109649; with that update, I can drop this extra nop.