This is an archive of the discontinued LLVM Phabricator instance.

[libc] Unblock SIGABRT before raising in abort
AbandonedPublic

Authored by abrachet on Mar 24 2020, 1:38 AM.

Details

Summary

Now that signal support is more flushed out it is possible to unblock SIGABRT before raising it and setting its action to default before raising a second time.

Diff Detail

Event Timeline

abrachet created this revision.Mar 24 2020, 1:38 AM
abrachet marked an inline comment as done.Mar 24 2020, 1:47 AM
abrachet added inline comments.
libc/src/stdlib/abort.cpp
23–24

The previous comment mentioned locks, it's a little more complicated than I originally imagined. We would also need locks in sigaction to ensure if one thread is aborting no others can change SIGABRT's action. Even with the recent mutex patch they still cannot be used because as previously explained in the comment we need recursive mutexes. I could add a comment as a TODO here (and in sigaction) but I wonder how much we care about this behavior, so I have waited before adding such a comment.

sivachandra marked an inline comment as done.Mar 26 2020, 5:05 PM
sivachandra added inline comments.
libc/src/stdlib/abort.cpp
23–24

I agree. May be you can add a comment saying we will not worry about MT-safe behavior for now? And also explain how it should all work when we have the recursive mutex available?

27

Is an unblock required before the first raise? If the application blocked it, the first raise would just return, no? May be this is related to my other comment below.

35

Can this unblock be moved out of this if statement, and the above unblock be removed?

38

Why not use sigaction?

libc/test/src/stdlib/CMakeLists.txt
26

I think I have an idea on how to do this. But, what you have is also OK in the meanwhile.

PaulkaToast added inline comments.Mar 26 2020, 5:49 PM
libc/src/stdlib/abort.cpp
23–24

I agree that the recursive case is much more likely to occur. I don't feel there is a need to complicate with the multi-thread cases. The only difference I see it would exit with a SIGKILL instead of a SIGABRT, right?

27

The standard doesn't specify if we should unblock, so it might make our code simpler to remove this line. Then maybe the recursive situation is handled without the need for the if (!called) because the raise function blocks all incoming signals?

38

I think it would be nice to have a comment here explaining the reason why we install the default sigabrt handler after raising sigabrt above.

Maybe even quote the relevant part of the standard?

The abort function causes abnormal program termination to occur, unless the signal SIGABRT is being caught and the signal handler does not return

ie this is handling the case where the SIGABRT is caught but the handler does return, so we must terminate the program.

44–47

Aside from the multi-thread case, I think there is a single thread re-entry issue here. A signal can be raised in between the execution of this line and the previous line (e.g. from an external program or alarm) that installs the default handler. A possible solution might be to block all signals for this segment?

48

I think add a comment to explain this is our last ditch effort to kill the program if all other attempts have failed.

libc/test/src/stdlib/abort_test.cpp
54

I'd like to see a test-case for the single thread re-entry issue, however I'm not entirely sure if that can be reliably tested. Mind looking into it? If its not possible that's fine. (:

sivachandra added inline comments.Mar 26 2020, 9:39 PM
libc/src/stdlib/abort.cpp
44–47

That's a good point. We can get creative with some locking may be to test this?

abrachet updated this revision to Diff 253167.Mar 27 2020, 10:37 AM
  • Block all signals except for SIGABRT
  • Add more comments
abrachet marked 7 inline comments as done.Mar 27 2020, 11:15 AM
abrachet added inline comments.
libc/src/stdlib/abort.cpp
23–24

I've added a comment. Not sure if its sufficient.

27

I think that the first unblock is useful in case it is currently blocked and there is a user handler installed. Otherwise we would make the first raise, it does nothing, we unblock then set its handler to the default and we die to that and skip over the user handler.

I've taken @PaulkaToast's suggestion about async signals into account and I just block all except SIGABRT here so this block now both blocks all and unblocks SIGABRT.

35

I think this is useful here in the case that the user handler changes the signal mask and also returns. I could also move it out if it seems more simple, but I think this is only meaningful when we go into this if statement because thats the only time the signal disposition could change (except for other threads changing it).

38

Why not use sigaction?

Our internal use of sigaction is pretty complicated we have the internal sigaction macro which needs to be defined and I think its less complicated to just use signal here. I could use sigaction here if you think it is preferable though.

44–47

Aside from the multi-thread case, I think there is a single thread re-entry issue here. A signal can be raised in between the execution of this line and the previous line (e.g. from an external program or alarm) that installs the default handler. A possible solution might be to block all signals for this segment?

Interesting, I hadn't thought of that :).

Signals are very poorly designed in my opinion (Fuchsia even omits them). We just cant guard against this easily because they are process wide. I've changed from unblocking SIGABRT at the beginning to blocking all except SIGABRT to try and not get interrupted by async signals. But another thread could change this unfortunately. We could (but I think maybe we should not) use locks here which sigaction must acquire before changing the signal mask to make sure once we enter abort no other thread can touch signals. At this point though, so many things must come together for this to happen that someone would have to be trying, and they could just make the sigaction syscall without the libc wrapper and go past our locks :P.

Like you mentioned above, worst case here is that the program dies to SIGKILL not SIGABRT. Of course I'll implement whatever we collectively think is necessary, though.

That's a good point. We can get creative with some locking may be to test this?

I might not be creative enough but the only way I could think to deterministically test this is with lldb in a lit style test :)

abrachet updated this revision to Diff 253717.Mar 30 2020, 3:19 PM

Don't block before first raise call, only once.

Looking at this now.

sivachandra accepted this revision.Apr 2 2020, 10:16 PM
sivachandra marked an inline comment as done.
sivachandra added inline comments.
libc/src/stdlib/abort.cpp
44–47

I think the truly correct way would be to use a recursive lock. Until then, this seems fine.

libc/test/src/stdlib/CMakeLists.txt
26

Can you make this TODO (sivachandra): ... so that I it would be easy for me to look it up?

I started working on fixing this, but then I found one bug (https://reviews.llvm.org/D77256), and then also realized I should check if we want to do this: https://reviews.llvm.org/D77340. If we decide to keep the latter, I prefer to put in first before I make any new changes to tracking deps.

This revision is now accepted and ready to land.Apr 2 2020, 10:16 PM
MaskRay requested changes to this revision.Apr 3 2020, 9:09 AM
MaskRay added inline comments.
libc/src/stdlib/abort.cpp
26

The function-local variable is not needed. If the signal handler of SIGABRT is set to abort, the following blocking logic should work.

This revision now requires changes to proceed.Apr 3 2020, 9:09 AM
MaskRay added inline comments.Apr 3 2020, 9:15 AM
libc/src/stdlib/abort.cpp
34

Another thread may concurrently install a signal handler for SIGABRT or unblock SIGABRT. There needs to be a lock.

abrachet marked 2 inline comments as done.Apr 3 2020, 9:39 AM
abrachet added inline comments.
libc/src/stdlib/abort.cpp
26

On MacOS

::signal(SIGABRT, +[] (int) { ::abort(); });
::abort();

This dies to SIGSEGV, presumably from a stack overflow. This is not ideal.

34

Yes this was discussed, we need a recursive lock here but also in sigaction, we don't have those yet. I don't personally think the benefits are worth that complexity anyway though.

MaskRay added inline comments.Apr 3 2020, 10:02 AM
libc/src/stdlib/abort.cpp
34

This suggests that the patches should be applied in a better way. If a mutex is implemented before improving abort, then we don't need to care about this.

I think a non-recursive mutex suffices here.

sivachandra added inline comments.Apr 3 2020, 10:27 AM
libc/src/stdlib/abort.cpp
34

Perfect is the enemy of the good: https://en.wikipedia.org/wiki/Perfect_is_the_enemy_of_good
FWIW, this albeit incomplete implementation allowed us to make progress with other pieces. Not to claim other pieces are complete or perfect. As I said in another post, the quest for perfection hinders progress, and more importantly, hinders our own learning.

MaskRay added inline comments.Apr 3 2020, 1:27 PM
libc/src/stdlib/abort.cpp
34

I am not sure this is the excuse to check in the imperfect code at this point. Actually the simply abort() is sufficient for 99.9% use cases. Nobody will sigaction SIGABRT to abort itself. When the other infrastructure is not ready, it is premature to over-optimize abort. You can disregard it but that will eventually clutter up the history of llvm-libc.

A messy history will just hinder learning for posterity, which may be a lot of people.

sivachandra added inline comments.Apr 3 2020, 2:47 PM
libc/src/stdlib/abort.cpp
34

You seem to argue for perfection and imperfection at the same time (because this patch is trying to reduce the existing imperfection). It is probably getting into subjective opinions at this point so I am going to let @alexbrachet decide if they want to submit it as is, or if they want to wait for a recursive mutex implementation. I am OK with either approach as I don't think "clutter up history" is an issue.

abrachet marked an inline comment as done.Apr 3 2020, 3:24 PM
abrachet added inline comments.
libc/src/stdlib/abort.cpp
34

I could add a lock here and make sigaction and sigprocmask acquire it. But I really wonder if it is worth the added complexity to handle this case. Also, what if the user handler longjmps? Then the next call to abort will be a deadlock too. I don't think there is 0 tradeoff here, it isn't obvious as you make it seem that we must handle this case. The standard says very little about what is required here. @MaskRay if you feel strongly then I will add it, its a trivial change on my end.

abrachet abandoned this revision.Apr 18 2022, 8:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2022, 8:49 AM