This is an archive of the discontinued LLVM Phabricator instance.

tsan: automatically deflake flaky tests
ClosedPublic

Authored by dvyukov on May 26 2014, 7:33 AM.

Details

Reviewers
kcc
samsonov
Summary

Add a script that is used to deflake inherently flaky tsan tests.
It is invoked from lit tests as:
$(dirname %s)/deflake.bash %run %t %s
The script runs the target program up to 10 times,
until it produces the necessary output.
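
A minimal sketch of the retry loop described above, written in Python rather than bash for illustration. The function name `deflake`, its parameters, and the idea of passing the checker as a separate command are assumptions; the deflake.bash in the diff may be structured differently.

```python
import subprocess


def deflake(run_cmd, check_cmd, attempts=10):
    """Run run_cmd up to `attempts` times until check_cmd accepts its output.

    check_cmd stands in for a FileCheck invocation: it receives the captured
    output on stdin and signals acceptance by exiting with status 0.
    """
    for _ in range(attempts):
        # Capture stdout and stderr together so the checker sees everything.
        run = subprocess.run(run_cmd, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT, text=True)
        check = subprocess.run(check_cmd, input=run.stdout, text=True,
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
        if check.returncode == 0:
            return True   # the expected output was produced on this attempt
    return False          # no attempt produced the expected output
```

The test fails only if none of the attempts produces the expected output, which is what makes an occasionally missed race harmless.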

If/when it LGTM, I will add it to other tests as well.

Diff Detail

Event Timeline

dvyukov updated this revision to Diff 9809. May 26 2014, 7:33 AM
dvyukov retitled this revision to tsan: automatically deflake flaky tests.
dvyukov updated this object.
dvyukov edited the test plan for this revision.
dvyukov added reviewers: kcc, samsonov.
dvyukov added a subscriber: Unknown Object (MLST).
kcc added inline comments. May 26 2014, 7:40 AM
test/tsan/deflake.bash
11

Some tests run FileCheck with more than one parameter.

dvyukov updated this revision to Diff 9840. May 27 2014, 8:30 AM

allow custom FileCheck args

samsonov added inline comments. May 27 2014, 12:28 PM
test/tsan/deflake.bash
9

What about the %run argument? If it is expanded to anything non-empty, this script won't work.

test/tsan/free_race.c
2

You can add a lit substitution to test/tsan/lit.cfg and instead invoke this "tool" as something like %tsan_deflake.

dvyukov added inline comments. May 27 2014, 1:53 PM
test/tsan/deflake.bash
9

Can it be expanded to anything non-empty?
I guess it's not possible to pass some abstract way of running a program into a shell script.

test/tsan/free_race.c
2

will try to figure out how to do it

samsonov added inline comments.
test/tsan/deflake.bash
9

Well, it can, otherwise %run wouldn't be there. Although, I never tried it (by setting config.emulator). Adding Greg for comments on this.

Why is the test flaky?

The tests check that ThreadSanitizer finds the data race under particular conditions. However, the core ThreadSanitizer algorithm can miss a data race when the racy memory accesses happen very close to each other (literally simultaneously). This is intentional: fixing it would impose a significant slowdown, and it is not a problem for programs other than unit tests.
So the unit tests suffer from this aspect.

I see, thanks. How about something like:

%try 10 not %run %t | FileCheck %s

Meaning, try $* up to 10 times, capturing stdout/stderr each time. When $* returns zero, write its output to stdout/stderr.

Looks good to me.

It does not handle the case when a single test expects several reports. I've found 2 such existing tests:
test/tsan/global_race.cc
test/tsan/inlined_memcpy_race.cc
But both of them are just checking several independent things. So I guess if it becomes a problem, we can split them into several tests.

Do you have any idea how to implement %try? Because I don't.

For portability, I'd write it in Python using the subprocess module, ensuring shell=False.
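
A rough sketch of what such a Python helper could look like, following the proposed %try semantics (retry until the command exits with zero, capturing stdout/stderr each time). The name `try_command` is hypothetical, and returning the last failing result when every attempt fails is an assumption, since the proposal does not say what should happen in that case.

```python
import subprocess


def try_command(argv, attempts=10):
    """Run argv up to `attempts` times (attempts >= 1).

    Returns the CompletedProcess of the first run that exits with 0,
    or of the last run if none succeeds (an assumption, see above).
    """
    result = None
    for _ in range(attempts):
        # shell=False: argv is executed directly, with no shell quoting issues.
        result = subprocess.run(argv, shell=False, capture_output=True)
        if result.returncode == 0:
            break
    return result
```

A wrapper script built on this would write `result.stdout` and `result.stderr` to its own streams and exit with `result.returncode`, so that the output can still be piped into FileCheck.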

What? What does it mean to "write %try"? How do I do it and where?

You replace "%try" with a call to your script using config.substitutions in "<compiler-rt>/test/tsan/lit.cfg".
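
As an illustration, registering such a substitution might look roughly like the sketch below. `config.substitutions` is the list of (pattern, replacement) pairs that lit exposes to lit.cfg; the helper name `add_deflake_substitution`, the `SimpleNamespace` stand-in for lit's `config` object, and the example path are assumptions made so that the snippet runs outside lit.

```python
import os
from types import SimpleNamespace


def add_deflake_substitution(config, cfg_path):
    """Make %deflake expand to the deflake.bash that sits next to cfg_path."""
    script = os.path.join(os.path.dirname(cfg_path), "deflake.bash")
    config.substitutions.append(("%deflake", script))
    return script


# Stand-in for the `config` object that lit injects when it loads lit.cfg.
config = SimpleNamespace(substitutions=[])
# Hypothetical location; inside a real lit.cfg one would pass __file__ instead.
script = add_deflake_substitution(config, os.path.join("test", "tsan", "lit.cfg"))
```

With this in place, a RUN line can refer to `%deflake` and lit replaces it with the script path before executing the line.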

dvyukov updated this revision to Diff 9910. May 29 2014, 2:13 AM
dvyukov edited edge metadata.

introduce %deflake macro

I've created a %deflake macro.
I've named it so that its purpose is clear, while for %try it's not obvious.
It's still bash. I don't know Python. Can write in Go or C, if you wish :)

samsonov accepted this revision. May 29 2014, 10:35 AM
samsonov edited edge metadata.

LGTM with a nit

test/tsan/lit.cfg
60
os.path.join(os.path.dirname(__file__), "deflake.bash")
This revision is now accepted and ready to land. May 29 2014, 10:35 AM

How about: %S/deflake.bash %run %t | FileCheck %s

All else being equal, I would prefer to stick with the already-implemented suggestion about the lit.cfg macro.

os.path.join(os.path.dirname(__file__), "deflake.bash")

Done

Committed in rev 209898.