This is an archive of the discontinued LLVM Phabricator instance.

Add support for SEH unwinding on Windows.
ClosedPublic

Authored by cdavis5x on Aug 10 2018, 9:04 AM.

Details

Summary

I've tested this implementation on x86-64 to ensure that it works. All
libc++abi tests pass, as do all libc++ exception-related tests. ARM
and ARM64 still remain to be implemented (@compnerd?).

Special thanks to KJK::Hyperion for his excellent series of articles on
how EH works on x86-64 Windows. You can find it here:
http://www.nynaeve.net/?p=99. (Seriously, check it out. It's awesome.)

I'm actually not sure if this should go in as is. I particularly don't
like that I duplicated the UnwindCursor class for this special case.

Event Timeline

cdavis5x created this revision.Aug 10 2018, 9:04 AM

I've tested this implementation on x86-64 to ensure that it works. All libc++abi tests pass, as do all libc++ exception-related tests.

I tested it now as well, and seems to work for my simple testcase.

ARM still remains to be implemented (@compnerd?).

LLVM doesn't generate SEH unwind data for ARM at all yet. ARM64 implementation is ongoing at the moment though, see D50166 and D50288.

Special thanks to KJK::Hyperion for his excellent series of articles on how EH works on x86-64 Windows. (Seriously, check it out. It's awesome.)

Can you give some links to it? A brief googling didn't turn up much else than the PSEH library.

I'm actually not sure if this should go in as is. I particularly don't like that I duplicated the UnwindCursor class for this special case.

As I'm not totally familiar with how it works, I can only give cursory comments and compare to the libgcc implementation when trying to wrap my head around it, but this does seem way, way more complex than the libgcc version of the same.

As for the duplicated UnwindCursor, I'm thinking if it'd become more straightforward by avoiding touching those parts altogether, and just reimplementing all the other public functions, _Unwind_G/SetGR/IP, _Unwind_Backtrace etc, directly in Unwind-seh.cpp, and not care about using the libunwind.h APIs (unw_*) inbetween altogether? Or perhaps my libgcc comparison is making me biased.

src/Unwind-seh.cpp
54

I don't see any __libunwind_frame_consolidate anywhere, is this comment outdated?

src/UnwindLevel1.c
35

This probably works, but isn't #if !defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) more of the common form?

src/libunwind_ext.h
44

What's this all about? winnt.h (from both MSVC and mingw-w64) should define this struct, no?

74

These are all both defined and called from Unwind-seh.cpp, so couldn't they just be static functions within there?

cdavis5x added inline comments.Aug 10 2018, 3:16 PM
src/libunwind_ext.h
44

Not my copy of <winnt.h> from the Win7 SDK. Dunno about WIn8 and WIn10.

cdavis5x added inline comments.Aug 10 2018, 3:25 PM
src/Unwind-seh.cpp
54

Why yes it is.

src/UnwindLevel1.c
35

I think I wrote this before the change to make that so took place, and I forgot to update this. Good catch.

src/libunwind_ext.h
74

Probably. Done.

cdavis5x updated this revision to Diff 160195.Aug 10 2018, 3:25 PM
  • Fix outdated comment.
  • Make preprocessor conditional more consistent.
  • Make some private functions used only in a single file static.
cdavis5x marked 6 inline comments as done.Aug 10 2018, 3:26 PM

Special thanks to KJK::Hyperion for his excellent series of articles on how EH works on x86-64 Windows. (Seriously, check it out. It's awesome.)

Can you give some links to it? A brief googling didn't turn up much else than the PSEH library.

I can't seem to find it either. I wonder if it's vanished from the Internet. Maybe the IA will have something.

rnk added inline comments.Aug 10 2018, 3:29 PM
src/libunwind_ext.h
44

If we have to pick between SDK versions to support, I'd much prefer to support 8+.

Could somebody verify that the DISPATCHER_CONTEXT struct is defined in <winnt.h> for the Win8 and Win10 SDKs? I can't install them right now.

rnk added a comment.Aug 14 2018, 10:00 AM

Could somebody verify that the DISPATCHER_CONTEXT struct is defined in <winnt.h> for the Win8 and Win10 SDKs? I can't install them right now.

I checked both, and they are available, so I think we should guard the definition like this:

// Provide a definition for _DISPATCHER_CONTEXT for Win7 and earlier SDK versions.
// Mingw64 has always provided this struct.
#if defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && !defined(__MINGW64__) && defined(WINVER) && WINVER < _WIN32_WINNT_WIN8
# if defined(__x86_64__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# elif defined(__arm__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# endif
#endif

Does that seem right? I'm not super familiar with SDK version detection, so I might have the wrong macros.

In D50564#1199285, @rnk wrote:

Could somebody verify that the DISPATCHER_CONTEXT struct is defined in <winnt.h> for the Win8 and Win10 SDKs? I can't install them right now.

I checked both, and they are available, so I think we should guard the definition like this:

// Provide a definition for _DISPATCHER_CONTEXT for Win7 and earlier SDK versions.
// Mingw64 has always provided this struct.
#if defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && !defined(__MINGW64__) && defined(WINVER) && WINVER < _WIN32_WINNT_WIN8
# if defined(__x86_64__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# elif defined(__arm__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# endif
#endif

Does that seem right? I'm not super familiar with SDK version detection, so I might have the wrong macros.

I believe you need to check VER_PRODUCTBUILD from <ntverp.h> if you are looking for the version of the Windows SDK that is being used to build the current application.

In D50564#1199285, @rnk wrote:

Could somebody verify that the DISPATCHER_CONTEXT struct is defined in <winnt.h> for the Win8 and Win10 SDKs? I can't install them right now.

I checked both, and they are available, so I think we should guard the definition like this:

// Provide a definition for _DISPATCHER_CONTEXT for Win7 and earlier SDK versions.
// Mingw64 has always provided this struct.
#if defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && !defined(__MINGW64__) && defined(WINVER) && WINVER < _WIN32_WINNT_WIN8

For the mingw check, I'd prefer checking for __MINGW32__, as the -64 version isn't defined when targeting 32 bit ARM (which uses SEH, even though everything isn't in place for it in LLVM).

And mingw-w64 might lack this struct for arm/arm64, but I can fix that and wouldn't bother with older versions of that.

And if these structs only are added for compat with the win7 sdk, maybe omit the arm one altogether, as all modern non-windows ce versions on arm is >= win8.

Special thanks to KJK::Hyperion for his excellent series of articles on how EH works on x86-64 Windows. (Seriously, check it out. It's awesome.)

Can you give some links to it? A brief googling didn't turn up much else than the PSEH library.

I can't seem to find it either. I wonder if it's vanished from the Internet. Maybe the IA will have something.

Found it: http://www.nynaeve.net/?p=99. Interestingly, a reference to that was in the MinGW headers--I think that's how I found it in the first place. I'll update the description.

In D50564#1199285, @rnk wrote:

Could somebody verify that the DISPATCHER_CONTEXT struct is defined in <winnt.h> for the Win8 and Win10 SDKs? I can't install them right now.

I checked both, and they are available, so I think we should guard the definition like this:

// Provide a definition for _DISPATCHER_CONTEXT for Win7 and earlier SDK versions.
// Mingw64 has always provided this struct.
#if defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && !defined(__MINGW64__) && defined(WINVER) && WINVER < _WIN32_WINNT_WIN8
# if defined(__x86_64__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# elif defined(__arm__)
typedef struct _DISPATCHER_CONTEXT { ... } DISPATCHER_CONTEXT;
# endif
#endif

Does that seem right? I'm not super familiar with SDK version detection, so I might have the wrong macros.

I believe you need to check VER_PRODUCTBUILD from <ntverp.h> if you are looking for the version of the Windows SDK that is being used to build the current application.

Thanks, @rnk and @zturner.

cdavis5x updated this revision to Diff 160638.Aug 14 2018, 10:46 AM
cdavis5x marked 3 inline comments as done.
cdavis5x edited the summary of this revision. (Show Details)
  • Update checks for DISPATCHER_CONTEXT definition.
  • Add link to KJK::Hyperion's articles on SEH.

Marked some comments as done. (Phab won't let me post an empty comment, so...)

...And it turns out Phab marked the comments as done when I uploaded the new change. I didn't know it would do that. That's useful to know.

cdavis5x updated this revision to Diff 160660.Aug 14 2018, 12:01 PM
  • Get rid of DISPATCHER_CONTEXT def for ARM. We only support ARM on Win8+ anyway.

I'm all for this change except the core issue is that you're using libunwind as a shim around the actual unwinding API provided by Windows. It would be nice to have something that did not have to do that and was capable of performing unwinding of SEH-style exceptions without needing additional runtime support.

I'm all for this change except the core issue is that you're using libunwind as a shim around the actual unwinding API provided by Windows. It would be nice to have something that did not have to do that and was capable of performing unwinding of SEH-style exceptions without needing additional runtime support.

It would be nice, but that would require extra work. We'd have to implement reading and interpreting unwind codes, and calling any handlers present at each frame (which all have a different calling convention from Itanium handlers), and handling chained unwind info... Or we could use the implementation that MS provided to us for free--and which gets loaded into every process anyway by virtue of being in NTDLL, and which is extremely well tested. Given all that, I'm wondering what implementing all that ourselves would gain us. I suppose we could eventually do all that, but for now, I think this is outside the scope of my change.

I'm all for this change except the core issue is that you're using libunwind as a shim around the actual unwinding API provided by Windows. It would be nice to have something that did not have to do that and was capable of performing unwinding of SEH-style exceptions without needing additional runtime support.

It would be nice, but that would require extra work. We'd have to implement reading and interpreting unwind codes, and calling any handlers present at each frame (which all have a different calling convention from Itanium handlers), and handling chained unwind info... Or we could use the implementation that MS provided to us for free--and which gets loaded into every process anyway by virtue of being in NTDLL, and which is extremely well tested. Given all that, I'm wondering what implementing all that ourselves would gain us. I suppose we could eventually do all that, but for now, I think this is outside the scope of my change.

+1. I guess such a library would be very nice to have, but from the point of view of implementing exception handling, using the underlying APIs probably is the way to go. The other question, as posted before, is whether we want to wrap the whole CONTEXT structs in the UnwindCursor class and expose it via the unw_* set of APIs. It gives quite a significant amount of extra code here compared to libgcc's unwind-seh.c which is <500 loc altogether, providing only the _Unwind_* API, implementing it directly with the Windows Rtl* APIs from ntdll. That would give only a partial libunwind, with only the higher level API available, but that's the only part used for exception handling at least.

kristina added a comment.EditedAug 21 2018, 6:52 AM

I'm all for this change except the core issue is that you're using libunwind as a shim around the actual unwinding API provided by Windows. It would be nice to have something that did not have to do that and was capable of performing unwinding of SEH-style exceptions without needing additional runtime support.

It would be nice, but that would require extra work. We'd have to implement reading and interpreting unwind codes, and calling any handlers present at each frame (which all have a different calling convention from Itanium handlers), and handling chained unwind info... Or we could use the implementation that MS provided to us for free--and which gets loaded into every process anyway by virtue of being in NTDLL, and which is extremely well tested. Given all that, I'm wondering what implementing all that ourselves would gain us. I suppose we could eventually do all that, but for now, I think this is outside the scope of my change.

+1. I guess such a library would be very nice to have, but from the point of view of implementing exception handling, using the underlying APIs probably is the way to go. The other question, as posted before, is whether we want to wrap the whole CONTEXT structs in the UnwindCursor class and expose it via the unw_* set of APIs. It gives quite a significant amount of extra code here compared to libgcc's unwind-seh.c which is <500 loc altogether, providing only the _Unwind_* API, implementing it directly with the Windows Rtl* APIs from ntdll. That would give only a partial libunwind, with only the higher level API available, but that's the only part used for exception handling at least.

That still makes the underlying assumption that SEH is exclusive to Windows (and by that virtue PE/COFF) which doesn't necessarily hold true. There is also the issue of generic names like _LIBUNWIND_SUPPORT_SEH_UNWIND to guard includes of Windows specific headers which ties into my previous point. Would it be possible to somehow abstract away the Windows runtime function call and Windows specific header includes, even if it's via CMake options.

I'm raising this issue as there are other implementations of SEH that share the same (or extremely similar, depending on SDK versions) structures, but would not have the Windows headers available when compiling libunwind, and while the runtime function call to the Rtl* API is easy to change later on, separating rest of it from Windows headers is slightly messier.

Would it, at very least, be possible to decouple most of it from a dependency on Windows headers and if possible use things like void* in place of PVOID or anything that's stdint.h/inttypes.h friendly? It's an excellent patch regardless but loosening the Windows SDK dependencies a bit would make it a lot better.

Would be much appreciated, as I'm going through a backlog of patches I'd like to put up for review one of which includes SEH for x86_64 Linux (ELF) implemented using RT signals and additional compiler runtime glue code, currently residing within compiler-rt builtins (one of the issues I want to address before). It's quite a wonky ABI since I tried to keep as much compatible with 64-bit Windows SEH (.xdata variation) in terms of frame lowering etc, but run-time mechanisms differ vastly for obvious reasons. Having it interop with libunwind smoothly without rewriting much of existing code in this patch would be nice, or even worse resorting to a million preprocessor guards to cover all the cases.

I'm all for this change except the core issue is that you're using libunwind as a shim around the actual unwinding API provided by Windows. It would be nice to have something that did not have to do that and was capable of performing unwinding of SEH-style exceptions without needing additional runtime support.

It would be nice, but that would require extra work. We'd have to implement reading and interpreting unwind codes, and calling any handlers present at each frame (which all have a different calling convention from Itanium handlers), and handling chained unwind info... Or we could use the implementation that MS provided to us for free--and which gets loaded into every process anyway by virtue of being in NTDLL, and which is extremely well tested. Given all that, I'm wondering what implementing all that ourselves would gain us. I suppose we could eventually do all that, but for now, I think this is outside the scope of my change.

+1. I guess such a library would be very nice to have, but from the point of view of implementing exception handling, using the underlying APIs probably is the way to go. The other question, as posted before, is whether we want to wrap the whole CONTEXT structs in the UnwindCursor class and expose it via the unw_* set of APIs. It gives quite a significant amount of extra code here compared to libgcc's unwind-seh.c which is <500 loc altogether, providing only the _Unwind_* API, implementing it directly with the Windows Rtl* APIs from ntdll. That would give only a partial libunwind, with only the higher level API available, but that's the only part used for exception handling at least.

That still makes the underlying assumption that SEH is exclusive to Windows (and by that virtue PE/COFF) which doesn't necessarily hold true. There is also the issue of generic names like _LIBUNWIND_SUPPORT_SEH_UNWIND to guard includes of Windows specific headers which ties into my previous point. Would it be possible to somehow abstract away the Windows runtime function call and Windows specific header includes, even if it's via CMake options.

I'm raising this issue as there are other implementations of SEH that share the same (or extremely similar, depending on SDK versions) structures, but would not have the Windows headers available when compiling libunwind, and while the runtime function call to the Rtl* API is easy to change later on, separating rest of it from Windows headers is slightly messier.

Would it, at very least, be possible to decouple most of it from a dependency on Windows headers and if possible use things like void* in place of PVOID or anything that's stdint.h/inttypes.h friendly? It's an excellent patch regardless but loosening the Windows SDK dependencies a bit would make it a lot better.

Would be much appreciated, as I'm going through a backlog of patches I'd like to put up for review one of which includes SEH for x86_64 Linux (ELF) implemented using RT signals and additional compiler runtime glue code, currently residing within compiler-rt builtins (one of the issues I want to address before). It's quite a wonky ABI since I tried to keep as much compatible with 64-bit Windows SEH (.xdata variation) in terms of frame lowering etc, but run-time mechanisms differ vastly for obvious reasons. Having it interop with libunwind smoothly without rewriting much of existing code in this patch would be nice, or even worse resorting to a million preprocessor guards to cover all the cases.

Very well, then. I'll see what I can do.

cdavis5x updated this revision to Diff 161807.Aug 21 2018, 1:23 PM
  • Make a bit more friendly to non-Windows.
  • Fix bits that depend on D50413.

OK, I've removed some of the dependencies on Windows-specific types. The code in Unwind-seh.cpp is very Windows-specific, though, so I left it alone.

OK, I've removed some of the dependencies on Windows-specific types. The code in Unwind-seh.cpp is very Windows-specific, though, so I left it alone.

Much appreciated, thank you for your work. LGTM in terms of semantics.

mstorsjo added inline comments.Aug 23 2018, 4:51 AM
src/Unwind-seh.cpp
164

Without understanding the code flow completely - is there a risk that this tries to use an uninitialized cursor, in case we hit ctx = (struct _Unwind_Context *)ms_exc->ExceptionInformation[1]; above and never initialized the local cursor?

175

Who will get this new uninitialized CONTEXT here, and will they try to use it for something?

src/UnwindCursor.hpp
31

Hmm, I think this one is supposed to be arch independent. Although it's easy to remove the ifdef once we want to use it for anything else than x86_64...

54

This looks like another leftover; there's no libunwind_ext.h any longer here.

482

Maybe skip the i386 ifdefs here, to keep it concise, as we don't expect to build this with __SEH__ defined for i386.

625

If this constructor is unused and unimplemented, I'd omit it altogether.

1195

What does this whole block do here? Isn't this within an !SEH ifdef block? The same methods are declared for the SEH version of this struct further above.

1827

I can't say I understand all of this yet, but I'm slowly getting there, and I'm trying to compare this to what libgcc does.

libgcc doesn't use any definition of UNWIND_INFO and doesn't need to do the equivalent of getInfoFromSEH, used by step(), anywhere. unw_step() is used in _Unwind_ForcedUnwind, which in libgcc is implemented using RaiseException (STATUS_GCC_FORCED, ....

I guess if you happen to have all of the unw_step API available, it's easier to just do it like this, in custom code without relying on the NTDLL functions for it, while the libgcc version relies more on the NTDLL API.

kristina added inline comments.Aug 23 2018, 7:42 AM
src/UnwindCursor.hpp
1827

This primarily deals with the SEH exceptions re-purposed as a C++ exception mechanism on x86_64 (if I understood this right), it's possible to set a custom filter using a runtime call so I suspect GCC does that or defines a translation function (also via a runtime call) which acts as a filter for "true" SEH exceptions behind the scenes deep within the runtime. Typially "true" SEH exceptions don't, outside of runtime magic, play nicely with C++ exceptions, with the __C_specific_handler ones being a completely different paradigm that falls far outside the scope of libunwind (those ones being the "true"/explicit SEH exceptions).

(Don't take my word for it, it's been a while and I only implemented the "true" variation for 64-bit Linux by reserving some RT signals and using that to invoke additional runtime glue that would then do the unwinding, completely ignoring DWARF since CFI exceptions and SEH exceptions really don't mix especially on platforms that are not Windows-like)

cdavis5x updated this revision to Diff 162250.Aug 23 2018, 12:44 PM
cdavis5x marked 7 inline comments as done.
  • Remove unnecessary code.
  • Guard against rogue personality functions returning wrong values.
  • Add a comment clarifying the purpose of the new_ctx variable.
cdavis5x marked 2 inline comments as done.Aug 23 2018, 12:45 PM
cdavis5x added inline comments.
src/Unwind-seh.cpp
164

We should only hit this point if we're in phase 2 (i.e. IS_UNWINDING(exc->ExceptionFlags) is true). But you've now got me worried about the potential for a rogue personality function surreptitiously returning this to exploit libunwind. I've added a check here.

175

It won't be uninitialized. RtlUnwindEx() always starts unwinding from the current frame; it fills in the passed in CONTEXT and passes its address to any handlers it encounters along the way. In other words, it's there so RtlUnwindEx() has a place to keep track of the unwind context.

src/UnwindCursor.hpp
1195

Right now, it doesn't do anything. I added it so @kristina has someplace to put the code for unwinding with SEH data when we're not on Windows and we don't have the RtlUnwindEx() function available. You'll note that the '!SEH' ifdef now says:

#if defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && defined(_WIN32)
1827

Actually, it's just to implement the lower-level libunwind APIs--specifically, unw_get_info(). The code @kristina is talking about is all in Unwind-seh.cpp.

Not much more comments from me. The implementation seems reasonable, and works for one simple test I did (with an earlier revision of the patch at least), and further refinement can happen in-tree I guess.

I'd like to have someone else (@rnk @compnerd?) give it a more proper approval though, at least a general ack for the style/structure.

include/__libunwind_config.h
66

I don't think the __ARM_WMMX case here is relevant; there are no ARM CPUs with WMMX running modern windows on arm, afaik (and the size number here I presume only is a copy from the one below); sorry for not pointing it out earlier.

src/UnwindCursor.hpp
54

Sorry I think I misread and looked for libunwind_ext.h in the include dir only. But in case it isn't really needed here, keep the old libunwind.h include at least, in order not to break other potential build configurations that might need it.

1195

Ah, I see. Maybe a comment clarifying that here as well?

I'd say LGTM since it's an introduction of any sort of runtime within the LLVM project scope that deals with SEH specifically. So far all the published code is pretty much exclusively related to Clang/LLVM IR and MC support for codegen of SEH related code, but with an implicit reliance on core Windows runtime and other goodies being around. Even if not super polished, as long as it does not cause regressions in other parts of libunwind, it's a start nontheless.

cdavis5x updated this revision to Diff 162441.Aug 24 2018, 12:25 PM
cdavis5x marked 8 inline comments as done.
  • Remove case we'll never realistically hit.
  • Add back include I removed earlier.
  • Add a comment clarifying that some environments use SEH without Windows runtime support.
cdavis5x updated this revision to Diff 162509.Aug 24 2018, 4:49 PM
  • Move DISPATCHER_CONTEXT definition into UnwindCursor.hpp.
rnk accepted this revision.Aug 30 2018, 2:26 PM

Not much more comments from me. The implementation seems reasonable, and works for one simple test I did (with an earlier revision of the patch at least), and further refinement can happen in-tree I guess.

I'd like to have someone else (@rnk @compnerd?) give it a more proper approval though, at least a general ack for the style/structure.

Thanks, I found some time and looked through it. EH is super complicated, and I find that I'm not able to give meaningful feedback about how to simplify this code. In any case, I think we should go ahead and commit this and get some feedback from users.

Thanks for putting it together!

This revision is now accepted and ready to land.Aug 30 2018, 2:26 PM
This revision was automatically updated to reflect the committed changes.