This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Experimental data flow tracer for fuzz targets.
ClosedPublic

Authored by kcc on May 9 2018, 4:32 PM.

Details

Summary

Experimental data flow tracer for fuzz targets.
Allows to tell which bytes of the input affect which functions of the fuzz target.

We previously attempted to use DFSan directly in the libFuzzer process,
and that didn't work nicely.
Now we will try to collect the data flow information for the seed corpus
in a separate process (using this tracer), and then use it in the regular libFuzzer runs.

Event Timeline

kcc created this revision.May 9 2018, 4:32 PM
Herald added subscribers: Restricted Project, delcypher. · View Herald TranscriptMay 9 2018, 4:32 PM
Dor1s accepted this revision.May 9 2018, 7:34 PM

LGTM! Left some questions though, mostly for my own education, I guess 😛

lib/fuzzer/dataflow/DataFlow.cpp
73

nit: in other places these use CamelCase: Data, Size

159

nit: do we need trailing \n as on line 158?

160

I don't understand *start condition here. As per lines 161-162, *start would be 0, i.e. false when already initialized? Do we need !*start here?

179

probably a stupid question, but are we sure than __sanitizer_cov_trace_pc_guard gets called before any other hook? Otherwise, how can we be sure that CurrentFunc has a correct value when e.g. __dfsw___sanitizer_cov_trace_switch or __dfsw___sanitizer_cov_trace_* is executed?

189

just to confirm: the hooks defined below will get called by the runtime, when a particular comparison type is executed?

test/fuzzer/dataflow.test
27

I guess order of functions can differ, this is why we use a regexp rather than a particular function number?

This revision is now accepted and ready to land.May 9 2018, 7:34 PM
kcc marked an inline comment as done.May 10 2018, 10:53 AM
kcc added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
159

doesn't matter, but I unified these.

160

Good point. This is a copy-pasto from the docs where the indexes start from 1, not from 0.
And I think They actually should start from 1 to avoid double-initializtion.
Good catch! Fixed.

179

Yes. __sanitizer_cov_trace_pc_guard is inserted into the beginning of a function entry block.

189

The hooks are inserted by the sanitizer coverage (and then modified by dfsan) -- they will be called before every instrumented comparison insn.

test/fuzzer/dataflow.test
27

Yes, the order of functions is undefined.
Note that the output doesn't contain the function names (it would be too expensive) -- it's just 'F' followed by a number.
See the comment at the top of DataFlow.cpp

kcc updated this revision to Diff 146160.May 10 2018, 10:58 AM

address comments

morehouse accepted this revision.May 10 2018, 11:17 AM
morehouse added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
59

Nit: Why mix // comment style with /* ... */?

162

I think a simple counter variable would be clearer here.

183

Does DFSan require a second label here since there's 2 parameters in the original signature?

test/fuzzer/dataflow.test
28

IN_ABC-NOT?

kcc marked 2 inline comments as done.May 10 2018, 11:24 AM
kcc added inline comments.
lib/fuzzer/dataflow/DataFlow.cpp
59

This is actually a /* comment, see the first line.
Agree, with // it looks nicer, fixed.

183

The second label will be inserted, but I don't need it, since Cases is a constant.
Added a parameter called UnusedL just for clarity.

kcc updated this revision to Diff 146168.May 10 2018, 11:25 AM

address more comments.

This revision was automatically updated to reflect the committed changes.