This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] [Tests] [NFC] Change seed for reduce_inputs.test
ClosedPublic

Authored by george.karpenkov on Jun 27 2018, 5:07 PM.

Details

Summary

On one of our bot machines, -seed=1 gets unlucky and does not finish in time.
Why this happens is a good question (from what I gather, std::mt19937 should behave identically on both platforms), but for now I would just like to keep the bots green.

Diff Detail

Repository
rL LLVM

Event Timeline

morehouse accepted this revision. Jun 27 2018, 5:20 PM
This revision is now accepted and ready to land. Jun 27 2018, 5:20 PM

@kcc @morehouse Any ideas on what could cause discrepancies between platforms? I've manually checked the output of the random generator with the same seed on all machines in question, and it's the same. libFuzzer performs its own hash computation, so that should be stable as well. Yet we have a machine which behaves differently with this seed.
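
For reference, std::mt19937's output stream is fully specified by the C++ standard, so identical seeds must produce identical sequences on any conforming implementation. A minimal standalone check (my own sketch, not part of this patch; the expected constant is the standard's own requirement on a default-seeded engine):

#include <cassert>
#include <random>

int main() {
  std::mt19937 rng; // default-constructed, i.e. seeded with 5489u
  for (int i = 0; i < 9999; ++i)
    rng();
  // [rand.predef]: the 10000th output of a default-constructed
  // mt19937 must be 4123659995 on every conforming implementation.
  assert(rng() == 4123659995u);
}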

I'm also surprised that the first line passes at all, since it's not seeded -- shouldn't it occasionally fail to find the input?
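
On the unseeded case: when -seed is 0 or absent, libFuzzer derives a seed from the clock and the pid, so each unseeded run explores differently. A hedged paraphrase of that fallback (the exact expression in FuzzerDriver.cpp may differ):

#include <chrono>
#include <cstdio>
#include <unistd.h>

// Hedged paraphrase: with -seed=0 (the default), the seed is derived
// from the current time and the pid rather than being fixed.
unsigned EffectiveSeed(unsigned FlagSeed) {
  if (FlagSeed != 0)
    return FlagSeed;
  return static_cast<unsigned>(
             std::chrono::system_clock::now().time_since_epoch().count()) +
         static_cast<unsigned>(getpid());
}

int main() { std::printf("seed: %u\n", EffectiveSeed(0)); }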

Does that machine build libFuzzer differently? Maybe the version of libc++ it uses is different.

Maybe the version of libc++ it uses is different

Yeah, that's what I thought, but I tried using the random number generator directly with a freshly built clang on that machine, and got the same sequence as elsewhere.
They all use a fresh libc++.

kcc added a comment. Jun 27 2018, 5:36 PM

Oh, anything could be different. I wouldn't expect seed=1 to behave the same on different platforms.
We need to check why this test is flaky (too few iterations?).

@kcc Any sources of non-determinism you suspect? 10^6 iterations already take ~20 seconds, so I'd be hesitant to bump that further.

Also, it consistently finds the issue if a corpus (even an empty one) is supplied. Is that expected, due to the limit on how much corpus data is stored in RAM?

This revision was automatically updated to reflect the committed changes.
kcc added a comment. Jun 27 2018, 5:52 PM

Any sources of non-determinism you suspect?

The RNG is expected to produce the same values, but if e.g. the code is compiled slightly differently because
the system headers differ, then the coverage feedback will be different and hence the RNG will be called in a different order.
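
A toy illustration of that mechanism (not libFuzzer code): two engines with the same seed diverge permanently as soon as one run makes a single extra draw, e.g. because different coverage feedback triggered one extra mutation:

#include <iostream>
#include <random>

int main() {
  std::mt19937 a(1), b(1); // identical seeds -> identical streams...
  (void)b();               // ...until one run makes one extra draw
  // From this point on the two runs make entirely different choices.
  std::cout << a() << " vs " << b() << "\n";
}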

This is very strange.
I've just run the test manually a few times with different seeds, and it always finds the bug quickly.

# Build the reproducer with ASan + libFuzzer instrumentation, then run 100
# parallel jobs until the corpus item with the given SHA1 is found
# (-exit_on_item exits once an item with that sha1 sum is added to the corpus).
clang -g -std=c++11 -fsanitize=address,fuzzer ~/llvm/projects/compiler-rt/test/fuzzer/ShrinkControlFlowSimpleTest.cpp
./a.out -exit_on_item=0eb8e4ed029b774d80f2b66408203801cb982a60 -runs=1000000 -jobs=100
kcc added a comment. Jun 27 2018, 5:54 PM

also:

>> 10^6 iterations already take ~20 seconds, so I'd be hesitant to bump that further.

What?

For me this test completes in under 1 second, even if I let it execute 10^6 iterations.

@kcc

I've just run the test manually a few times with different seeds, and it always finds the bug quickly.

That's the simple one; the problematic one for me is ShrinkControlFlowTest.

@kcc Are you saying the behavior should be the same whether or not the (empty) coverage dir is used?

@kcc

For me this test completes in under 1 second, even if I let it execute 10^6 iterations.

It runs in 1 second on my machine but takes 12 seconds on the bot in question. Weird.

kcc added a comment. Jun 27 2018, 6:03 PM

Same with ShrinkControlFlowTest. I've run it 10000 times.

Are you saying the behavior should be the same whether or not the (empty) coverage dir is used?

Hm... I would expect so...
But I don't think we provide any kind of hard guarantee for that.

It runs in 1 second on my machine but takes 12 seconds on the bot in question. Weird.

The bot might be hugely overloaded, which in turn could be causing the trouble with the SIGUSR tests.

@kcc @morehouse
OK, it seems I've figured out the issue: the bot was running an older branch of LLVM that did not yet have the OPT_FOR_FUZZING LLVM attribute, which enables efficient fuzzing under -O2. Most of the problems seem to have stemmed from there.
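
For context, in-tree LLVM exposes this as a function attribute; a minimal sketch of how a pass can query it (assuming the enum spelling Attribute::OptForFuzzing; the bot's older branch predated the attribute entirely):

#include "llvm/IR/Function.h"

// Sketch: passes whose transformations hurt fuzzing (e.g. folding away
// comparisons that coverage instrumentation wants to observe) can skip
// functions that clang tagged when building with -fsanitize=fuzzer.
static bool builtForFuzzing(const llvm::Function &F) {
  return F.hasFnAttribute(llvm::Attribute::OptForFuzzing);
}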

Does the bot config also explain the SIGUSR test flakiness?

@morehouse Probably, though the test is still inherently flaky.