This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Emit CFI instruction for updating x18 when using ShadowCallStack with exception unwinding
ClosedPublic

Authored by leonardchan on May 12 2020, 4:11 PM.

Details

Summary

PR45875 notes an instance where exception handling crashes on aarch64-fuchsia where SCS is enabled by default. The underlying issue seems to be that within libunwind, various _Unwind_* functions, the x18 register is not updated if a function is marked with nounwind. This removes the check for nounwind and emits the CFI instruction that updates x18.

Diff Detail

Event Timeline

leonardchan created this revision.May 12 2020, 4:11 PM
pcc added a comment.May 12 2020, 4:51 PM

It is unclear to me whether removing the nounwind condition is necessary. The wording in the langref talks about "asynchronous exceptions" but only provides semantics for "exception handling schemes that are recognized by LLVM to handle asynchronous exceptions, such as SEH". Since we don't support anything like that on non-Windows. I don't believe that we need to support throwing exceptions past such functions on DWARF platforms, and the unwind info can only be used for the purpose of creating a stack trace.

Furthermore, I believe that certain versions of libgcc will crash if they see this unwind info, so we can't emit it unconditionally.

When I wrote the commit message, the "currently required" referred to a hypothetical implementation of SCS that does not store return addresses on the regular stack but rather just the shadow stack. In that case, unwinding for stack traces would need to see this unwind info.

_Unwind_RaiseException should not have the nounwind attribute, since it can clearly be unwound past using synchronous exceptions. So maybe that should be fixed instead?

This is a direct revert of https://reviews.llvm.org/D54988. The rationale stated in that change was incorrect IMHO. When CFI is enabled, correct CFI is required, period. Making assumptions about how CFI will be used and thus when incomplete or incorrect CFI is adequate is a bad idea. The only proper distinctions are between no CFI at all, correct CFI at call return sites (-funwind-tables), and correct CFI at every instruction (-fasynchronous-unwind-tables). If generating CFI at all, then all CFI required to recover all caller registers not explicitly call-clobbered is required from at least every call return site in the function. As SCS CFI is necessary after the first or second prologue instruction, there is effectively no situation in which CFI omitting the x18 register is valid.

glibc on Linux and perhaps other platforms has long implemented pthread_cancel via unwinding, and long supported asynchronous unwinding generally for this. If LLVM intends to support the Linux glibc ABI, then it must support -fasynchronous-unwind-tables generally.

@pcc Do you happen to have a list of platforms that you suspect use libgcc and will break with this CFI? If this is a bug compatibility issue, we're thinking that instead of just disabling this for everyone, we should instead disable it for a blacklisted set of targets. So all targets by default would go through this branch except for those that actively crash from this.

@pcc Do you happen to have a list of platforms that you suspect use libgcc and will break with this CFI? If this is a bug compatibility issue, we're thinking that instead of just disabling this for everyone, we should instead disable it for a blacklisted set of targets. So all targets by default would go through this branch except for those that actively crash from this.

*ping* Do you have any more thoughts on this?

pcc accepted this revision.Jun 17 2021, 2:42 PM

I believe that Android used to use these older versions of libgcc, however Android has now moved to libunwind on aarch64 and although old libgcc versions may still be linked into applications, I do not believe that Android platform code compiled with SCS will run on the same thread as application code, so the old libgcc versions should not be exposed to this unwind info.

So the semantics of thread cancellation on glibc are that C++ destructors are called during cancellation? That's not required as far as I can tell, but I suppose that if someone adds SCS support to glibc and builds it with -ffixed-x18 then it should be possible for the destructors to run during cancellation.

So let's go with this I suppose and if it turns out that someone is using an incompatible unwinder then they can add an opt out for their platform.

This revision is now accepted and ready to land.Jun 17 2021, 2:42 PM