This is an archive of the discontinued LLVM Phabricator instance.

SafeStack: Fix flaky test (PR39001)
ClosedPublic

Authored by vlad.tsyrklevich on Sep 20 2018, 4:12 PM.

Details

Summary

pthread_join() can return before a thread finishes exit()ing in the
kernel and a subsequent tgkill() can report the thread still alive.
Update the pthread-cleanup.c test to sleep and retry if it hits this
possible flake.
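
The sleep-and-retry idea described above can be sketched roughly as follows. This is a minimal, Linux-only illustration and not the code from the committed diff: the helper names (worker, thread_alive, wait_for_thread_exit, join_and_confirm_exit) are hypothetical, and the one-second sleep interval is an assumption chosen to match the roughly 3 seconds of wall time discussed in the review.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread id of the worker, recorded by the worker itself. */
static pid_t worker_tid;

/* Hypothetical worker: records its kernel tid and returns at once. */
static void *worker(void *arg) {
  (void)arg;
  worker_tid = (pid_t)syscall(SYS_gettid);
  return NULL;
}

/* Returns 0 only when the kernel reports ESRCH for the thread, and 1
 * otherwise. Signal 0 delivers nothing; it only runs the existence and
 * permission checks. Any error other than ESRCH is conservatively
 * treated as "still alive". */
static int thread_alive(pid_t pid, pid_t tid) {
  if (syscall(SYS_tgkill, pid, tid, 0) == 0)
    return 1;
  return errno != ESRCH;
}

/* Checks up to max_tries times, sleeping one second after each check
 * that still sees the thread. Returns 0 once the thread is gone and -1
 * if it is still visible after the final sleep. */
static int wait_for_thread_exit(pid_t pid, pid_t tid, int max_tries) {
  for (int i = 0; i < max_tries; i++) {
    if (!thread_alive(pid, tid))
      return 0;
    sleep(1);
  }
  return thread_alive(pid, tid) ? -1 : 0;
}

/* Creates and joins one thread, then confirms with bounded retries
 * that the kernel no longer knows it. Returns 0 on success. */
static int join_and_confirm_exit(void) {
  pthread_t t;
  if (pthread_create(&t, NULL, worker, NULL) != 0)
    return -1;
  if (pthread_join(t, NULL) != 0)
    return -1;
  /* pthread_join() has returned, so worker_tid is safe to read, but
   * the kernel may not have finished tearing the thread down yet. */
  return wait_for_thread_exit(getpid(), worker_tid, 3);
}
```

A caller would treat a return value of 0 from join_and_confirm_exit() as "the thread is really gone" and anything else as a failure. The sketch needs to be built with -pthread. Bounding the retries keeps a genuine failure from hanging the test indefinitely.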

Thanks to Jeremy Morse for reporting.

Event Timeline

Herald added subscribers: Restricted Project, llvm-commits, jfb, delcypher. · Sep 20 2018, 4:12 PM
vitalybuka accepted this revision. · Sep 20 2018, 5:20 PM
vitalybuka added inline comments.
test/safestack/pthread-cleanup.c
38–52

Why just 3?

This revision is now accepted and ready to land. · Sep 20 2018, 5:20 PM
jmorse accepted this revision. · Sep 21 2018, 3:15 AM

LGTM -- obviously this isn't an ideal situation, but the pthreads/OS-threads abstraction isn't supposed to be un-peeled anyway.

This doesn't trigger the fault in my test setup after ~30 mins; alas, I don't have time for a longer soak test.

test/safestack/pthread-cleanup.c
38–52

This test seems to be flaky only under high load, when one thread runs immediately after the other. 3 seconds of wall time seemed long enough for the first thread to finish exiting in the kernel, without making this test very laggy in circumstances where it would actually fail for some reason.

This revision was automatically updated to reflect the committed changes.