This is an archive of the discontinued LLVM Phabricator instance.

Increase stack size for stack-use-after-return test
ClosedPublic

Authored by fjricci on Feb 22 2017, 12:34 PM.

Event Timeline

fjricci created this revision.Feb 22 2017, 12:34 PM

Friendly ping

rnk edited edge metadata.Mar 1 2017, 8:12 AM

I'm a little concerned about changing this, I have a feeling 64K was taken from a real user of ASan that runs with 64K stacks. Can you elaborate on the issue you're seeing? Is it a stack overflow or something else?

There's a more detailed explanation in D29994, but essentially, some debugging found that 64k isn't actually being used for the stack size consistently, and the test never passes when 64k is actually used. (actual stack size used determined with pthread_attr_getstacksize)

rnk accepted this revision.Mar 1 2017, 2:12 PM

There's a more detailed explanation in D29994, but essentially, some debugging found that 64k isn't actually being used for the stack size consistently, and the test never passes when 64k is actually used. (actual stack size used determined with pthread_attr_getstacksize)

I noticed that all sanitizers other than ASan intercept pthread_create and call AdjustStackSize, but this test doesn't seem to relate to that. The stack limit was added in rL189457, which is from 2013. The comments don't suggest that there was any great reason to choose 64K. Go ahead and raise it, then.

This revision is now accepted and ready to land.Mar 1 2017, 2:12 PM
This revision was automatically updated to reflect the committed changes.
fjricci reopened this revision.Mar 1 2017, 4:26 PM

Reverted because this fails on aarch64. I don't have access to an aarch64 linux device, so I'm not sure the best way to debug.

Command Output (stderr):

1: 0x3ffec223620
2: 0x3ffec223620
/home/buildslave/buildslave/clang-cmake-aarch64-42vma/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc:52:13: error: expected string not found in input
 // THREAD: WRITE of size 1 {{.*}} thread T{{[1-9]}}
            ^
<stdin>:1:1: note: scanning from here
ASAN:DEADLYSIGNAL
^
<stdin>:11:5: note: possible intended match here
 #7 0x3ffb4f77083 in start_thread (/lib64/libpthread.so.0+0x7083)
    ^
This revision is now accepted and ready to land.Mar 1 2017, 4:26 PM
kubamracek edited edge metadata.Mar 1 2017, 4:37 PM

Link to the failure?

My theory is that the 64k size is so low that some platforms ignore the pthread_attr_setstacksize call. Now that we changed it to 128k, it actually triggers a problem where the ASan reporting facilities overflow the stack. Although it seems weird that ASan's error reporting would require 64k of stack...

Sorry for the delay, I've been going through a job change. Looks like the link is now dead: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/5218

The only error messages in that link were the ones I pasted though, plus an ASAN:DEADLYSIGNAL.

Just to test, I could make the stack even bigger, but it's tough to play around with this without having access to an aarch64 machine (I have an android one, but not a linux one).

fjricci added a comment.EditedMar 13 2017, 1:42 PM

I was able to run this on a linux aarch64 emulator, but I'm not able to reproduce the failure. I do still see the stack size flakiness (with observed stack size larger than it's set to be when the test passes) popping up. Wondering if it's worth disabling the particular test case that uses the small stack size, since it seems to be passing primarily due to this undefined behavior anyway.

fjricci requested review of this revision.Mar 13 2017, 1:43 PM
fjricci edited edge metadata.

The failure is weird. It's hard to believe that a 100kB stack is too small for ASan reporting machinery. Let's loop in more people: @kcc, @dvyukov, @eugenis, @filcab, @rSerge, @dberris, do you think you could help us here? Could you try reproducing the problem on a AArch64 device?

My assumption is that a repro of the failure will also require D29994, as that's the patch that seems to affect the flakiness.

I committed r298193, which checks the return value of pthread_attr_setstacksize and verifies that pthread_attr_getstacksize actually returns the correct size. Let's see if that triggers any failures.

r298195 changes the test to respect PTHREAD_STACK_MIN. On AArch64, this value is much larger than elsewhere, probably due to 64k page sizes (and PTHREAD_STACK_MIN needs to be at least 2 pages).

This revision was automatically updated to reflect the committed changes.

Looks like there are no failures this time. Can you try landing https://reviews.llvm.org/D29994 now?

Working on running the tests locally, I'll commit tomorrow if things pass.

Thanks for looking into it!