This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Experimental data flow tracer for fuzz targets.
ClosedPublic

Authored by kcc on May 9 2018, 4:32 PM.

Details

Summary

Experimental data flow tracer for fuzz targets.
Allows to tell which bytes of the input affect which functions of the fuzz target.

We previously attempted to use DFSan directly in the libFuzzer process,
and that didn't work nicely.
Now we will try to collect the data flow information for the seed corpus
in a separate process (using this tracer), and then use it in the regular libFuzzer runs.

Diff Detail

Event Timeline

kcc created this revision.May 9 2018, 4:32 PM
Herald added subscribers: Restricted Project, delcypher. · View Herald TranscriptMay 9 2018, 4:32 PM
Dor1s accepted this revision.May 9 2018, 7:34 PM

LGTM! Left some questions though, mostly for my own education, I guess 😛

lib/fuzzer/dataflow/DataFlow.cpp
74

nit: in other places these use CamelCase: Data, Size

160

nit: do we need trailing \n as on line 158?

161

I don't understand *start condition here. As per lines 161-162, *start would be 0, i.e. false when already initialized? Do we need !*start here?

180

probably a stupid question, but are we sure than __sanitizer_cov_trace_pc_guard gets called before any other hook? Otherwise, how can we be sure that CurrentFunc has a correct value when e.g. __dfsw___sanitizer_cov_trace_switch or __dfsw___sanitizer_cov_trace_* is executed?

190

just to confirm: the hooks defined below will get called by the runtime, when a particular comparison type is executed?

test/fuzzer/dataflow.test
28

I guess order of functions can differ, this is why we use a regexp rather than a particular function number?

This revision is now accepted and ready to land.May 9 2018, 7:34 PM
kcc marked an inline comment as done.May 10 2018, 10:53 AM
kcc added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
160

doesn't matter, but I unified these.

161

Good point. This is a copy-pasto from the docs where the indexes start from 1, not from 0.
And I think They actually should start from 1 to avoid double-initializtion.
Good catch! Fixed.

180

Yes. __sanitizer_cov_trace_pc_guard is inserted into the beginning of a function entry block.

190

The hooks are inserted by the sanitizer coverage (and then modified by dfsan) -- they will be called before every instrumented comparison insn.

test/fuzzer/dataflow.test
28

Yes, the order of functions is undefined.
Note that the output doesn't contain the function names (it would be too expensive) -- it's just 'F' followed by a number.
See the comment at the top of DataFlow.cpp

kcc updated this revision to Diff 146160.May 10 2018, 10:58 AM

address comments

morehouse accepted this revision.May 10 2018, 11:17 AM
morehouse added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
60

Nit: Why mix // comment style with /* ... */?

163

I think a simple counter variable would be clearer here.

184

Does DFSan require a second label here since there's 2 parameters in the original signature?

test/fuzzer/dataflow.test
29

IN_ABC-NOT?

kcc marked 2 inline comments as done.May 10 2018, 11:24 AM
kcc added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
60

This is actually a /* comment, see the first line.
Agree, with // it looks nicer, fixed.

184

The second label will be inserted, but I don't need it, since Cases is a constant.
Added a parameter called UnusedL just for clarity.

kcc updated this revision to Diff 146168.May 10 2018, 11:25 AM

address more comments.

This revision was automatically updated to reflect the committed changes.