This is an archive of the discontinued LLVM Phabricator instance.

libFuzzer: prevent irrelevant strings from leaking into auto-dictionary
ClosedPublic

Authored by pdknsk on Jun 30 2018, 2:29 AM.

Details

Summary

This is a fix for bug 37047.

https://bugs.llvm.org/show_bug.cgi?id=37047

Implemented by basically reversing the logic. Previously all strings were considered, with some operations excluded. Now strings are excluded by default, and only strings during the CB considered.

Diff Detail

Repository
rL LLVM

Event Timeline

pdknsk created this revision.Jun 30 2018, 2:29 AM
Herald added subscribers: Restricted Project, llvm-commits. · View Herald TranscriptJun 30 2018, 2:29 AM
kcc added a comment.Jul 2 2018, 4:56 PM

Why is it safe to remove ScopedDoingMyOwnMemOrStr from the places you've removed it from?

pdknsk added a comment.Jul 3 2018, 9:10 AM
In D48800#1150324, @kcc wrote:

Why is it safe to remove ScopedDoingMyOwnMemOrStr from the places you've removed it from?

Note that this removes ScopedDoingMyOwnMemOrStr completely. It's safe because the functions using it run outside the callback. MakeDictionaryEntryFromCMP before, operator== (used in ContainsWord) after, and operator< appears unused.

It'd be different if the functions (TPC.AddValueForMemcmp, TPC.MMT.Add) used inside the hooks (recursively) triggered the hooks, but that's not the case.

kcc added a comment.Jul 3 2018, 2:29 PM
In D48800#1150324, @kcc wrote:

Why is it safe to remove ScopedDoingMyOwnMemOrStr from the places you've removed it from?

Note that this removes ScopedDoingMyOwnMemOrStr completely. It's safe because the functions using it run outside the callback. MakeDictionaryEntryFromCMP before, operator== (used in ContainsWord) after, and operator< appears unused.

It'd be different if the functions (TPC.AddValueForMemcmp, TPC.MMT.Add) used inside the hooks (recursively) triggered the hooks, but that's not the case.

Thanks for checking. Indeed, looks like ScopedDoingMyOwnMemOrStr is now redundant.
I've just removed the stale operator<

Why do you need the new variable InCB?
Will the existing RunningCB not work?

pdknsk added a comment.Jul 5 2018, 3:58 AM
In D48800#1151316, @kcc wrote:

Why do you need the new variable InCB?
Will the existing RunningCB not work?

To be honest, my C++ knowledge becomes wobbly here. I don't think the hooks can access Fuzzer (and its RunningCB).

kcc added a comment.Jul 6 2018, 1:24 PM
In D48800#1151316, @kcc wrote:

Why do you need the new variable InCB?
Will the existing RunningCB not work?

To be honest, my C++ knowledge becomes wobbly here. I don't think the hooks can access Fuzzer (and its RunningCB).

you will need to add one accessor method, just like in your current variant:

bool isRunningCB() const { return RunningCB; }
pdknsk added a comment.Jul 8 2018, 2:25 PM
In D48800#1154720, @kcc wrote:

you will need to add one accessor method, just like in your current variant:

bool isRunningCB() const { return RunningCB; }

That I know. I meant accessing Fuzzer (F). I have a construct that stores *F on TracePC (TPC), but I'm not sure that's good.

yea, you are right, you can't directly use F->RunningCB if you don't have F visible.
But I don't want to have two copies of this flag, one in Fuzzer and one in TracePC.

I would rather just have this flag as a global -- just rename the already existing ScopedDoingMyOwnMemOrStr
into something more suitable (ScopedRunningUserCallback) and get rid of RunningCB.

pdknsk edited the summary of this revision. (Show Details)Jul 14 2018, 8:13 PM

I've removed the performance claim from the summary, because it's not so easy to prove. The problem is that runs even with the same seed are not comparable before and after. I've run the same target using the same 5 seeds for 60 seconds, with the following results.

(before)

1: #455495 DONE cov: 514 ft: 1838 corp: 408/7693b lim: 53 exec/s: 7467 rss: 297Mb
2: #600688 DONE cov: 559 ft: 2052 corp: 479/9439b lim: 48 exec/s: 9847 rss: 197Mb
3: #695888 DONE cov: 630 ft: 2913 corp: 793/11841b lim: 38 exec/s: 11408 rss: 213Mb
4: #273007 DONE cov: 455 ft: 1530 corp: 349/5243b lim: 25 exec/s: 4475 rss: 193Mb
5: #729988 DONE cov: 1198 ft: 4092 corp: 897/16Kb lim: 29 exec/s: 11967 rss: 278Mb

(after)

1: #236342 DONE cov: 432 ft: 1451 corp: 320/4767b lim: 29 exec/s: 3874 rss: 246Mb
2: #695694 DONE cov: 636 ft: 2616 corp: 724/12693b lim: 48 exec/s: 11404 rss: 248Mb
3: #933497 DONE cov: 629 ft: 2719 corp: 744/14398b lim: 53 exec/s: 15303 rss: 269Mb
4: #865391 DONE cov: 1211 ft: 4827 corp: 1102/26Kb lim: 48 exec/s: 14186 rss: 242Mb
5: #694553 DONE cov: 579 ft: 2181 corp: 492/11882b lim: 68 exec/s: 11386 rss: 205Mb

pdknsk updated this revision to Diff 155579.Jul 14 2018, 8:35 PM

Even simpler now than it was.

This revision is now accepted and ready to land.Jul 16 2018, 9:27 AM
kcc accepted this revision.Jul 16 2018, 12:27 PM

LGTM

Matt, please test locally and land.

This revision was automatically updated to reflect the committed changes.

@morehouse @vitalybuka @kcc this breaks shrink.test, namely, the run line:

/Volumes/Transcend/code/llvm/relbuild-with-asserts/projects/compiler-rt/test/fuzzer/Output/shrink.test.tmp-ShrinkControlFlowTest -seed=1 -exit_on_item=0eb8e4ed029b774d80f2b66408203801cb982a60 -runs=1000000 -shrink=1 -reduce_inputs=0

Any ideas on what went wrong here?

@kcc I can fix the test by doubling the limit given to -runs, not sure whether it's a correct course of actions.

Moreover, the test passes on x86_64, but fails on x86_64h.

We should probably just increase the runs for shrink.test. This patch changes how items are added to libFuzzer's dictionary, which will change the exact sequence of inputs libFuzzer tries for a given seed.

When I landed this I had to modify three-bytes.test as well because this patch made libFuzzer hit a line sooner than we expected in that test.