This is an archive of the discontinued LLVM Phabricator instance.

Introduce FuzzMutate library
ClosedPublic

Authored by bogner on Aug 3 2017, 10:59 AM.

Details

Summary

This introduces the FuzzMutate library, which provides structured fuzzing for LLVM IR, as described in my EuroLLVM 2017 talk. Most of the basic mutators to inject and delete IR are provided, with support for most basic operations.

I will follow up with the instruction selection fuzzer, which is implemented in terms of this library.

Diff Detail

Repository
rL LLVM

Event Timeline

bogner created this revision.Aug 3 2017, 10:59 AM
bogner added a reviewer: vsk.Aug 11 2017, 1:23 PM
vsk edited edge metadata.Aug 11 2017, 5:07 PM

Very cool. I've just done an initial pass and will need to take a closer look. The main piece of feedback I've got is that adding more docs would really clarify the code. I've pointed out some specific areas which could use doxygen comments. A small writeup in docs/Fuzzing would also be great -- maybe that makes more sense as a separate patch?

include/llvm/FuzzMutate/IRMutator.h
36

This method could use a doxygen comment. To a casual reader, it may not be evident why a mutation strategy has a weight. This would help clarify other mutate* methods below.

78

Out of curiosity, why is this using decl needed? Is 'mutate' somehow ambiguous here?

include/llvm/FuzzMutate/OpDescriptor.h
31

This group could use a doxygen comment. It would help to clarify whether these are pseudo-random constants.

33

IIUC, this could be more clear as: "to answer whether some value generator is suitable as a source for a particular operation".

37

Please add a comment explaining the intended usage of the predicate (e.g you could clarify why the current value is list-like, while the new value isn't).

39

Please add a comment explaining what the inputs to this value generator are supposed to be (e.g you could clarify the constraints on the returned values -- need they be random?).

48

Why have an unused NoneType parameter?

54

Nit: It may enhance readability to capture this and use matches() directly. Feel free to ignore this if you don't feel that way.

124

A for loop would be clearer here. You can also use PointerType::getUnqual, which implies addrspace(0).

include/llvm/FuzzMutate/RandomIRBuilder.h
36

Please add a comment explaining what a source is and where this searches for sources. At first glance, it's not clear if this searches the BB's instructions, the Insts array, or the Srcs array. In general here it would help to clarify how the Insts and Srcs arrays are used.

lib/FuzzMutate/Operations.cpp
43

Have you considered using the FIRST_ICMP_PREDICATE and LAST_ICMP_PREDICATE macros from InstrTypes.h to condense this?

68

Ditto for the FIRST_FCMP_PREDICATE and LAST_FCMP_PREDICATE macros?

110

It looks like this list is duplicated in describeFuzzerIntOpts. Would it make sense to share it with the HANDLE_X pattern?

116

Ditto.

bogner updated this revision to Diff 110828.Aug 11 2017, 6:15 PM
bogner marked 11 inline comments as done.

Added docs and made minor updates based on vsk's comments.

Adding some docs in docs/Fuzzer sounds like a good idea. I'll see about doing that soon but I think it makes sense for that to be a separate patch.

include/llvm/FuzzMutate/IRMutator.h
78

This pulls in the full overload set, so that the base class's mutate(Module) will choose a function and call mutate(Function) here.

include/llvm/FuzzMutate/OpDescriptor.h
48

I found it clearer to force the user to explicitly opt in to using the default generator - SourcePred(P, None) rather than SourcePred(P). If you think that's overkill I could drop this.

54

It's kind of sketchy to use member functions in a constructor. I'd rather not.

lib/FuzzMutate/Operations.cpp
110

These aren't in their own list anywhere today (they're mixed with the float ops in Instructions.def) so I think that would entail creating a new .def file and including that in a couple of places here. I'm not convinced the decreased readability of having to go look up what's in that list is worth the couple of lines of savings here.

vsk added inline comments.Aug 15 2017, 11:27 AM
include/llvm/FuzzMutate/RandomIRBuilder.h
54

Since connectToSink() can create a store, its comment should say it does more than just finding a sink. There may also be an opportunity here to fold newSink() into connectToSink().

lib/FuzzMutate/IRMutator.cpp
57

Use const auto & here? Copying a std::function may not be cheap.

80

Wdyt of looping here until we're sure that the injected IR isn't DCE'd? If it's expensive to evaluate a mutation, we might be able to run faster if we're sure each mutation has done work.

90

Is it worth adding test coverage for this?

120

nit, 'to be valid'

123

Use const auto & here?

181

Wdyt of always considering adding a new source? It'd be sampled along with the other options, with the same weight. I expect that this would increase coverage without causing chaos.

lib/FuzzMutate/RandomIRBuilder.cpp
64

Out of curiosity, are all sources with pointer type LoadInsts?

bogner updated this revision to Diff 111382.Aug 16 2017, 10:37 AM
bogner marked 14 inline comments as done.
bogner added inline comments.
include/llvm/FuzzMutate/RandomIRBuilder.h
54

Its comment already says it may create a new instruction and use that... Maybe I'm misunderstanding you here?

lib/FuzzMutate/IRMutator.cpp
80

The injected IR is never DCE'd - the point of the "sink" concept is that its something that isn't dead.

90

I don't think there's much point - the check would just be essentially the same list of ops.

181

Since the point of the InstDeleter is to decrease size I think it's better as is. When we have to add a new source the size of the IR stays about the same (remove one instruction, add another).

lib/FuzzMutate/Operations.cpp
43

In my previous update I tried this, but I've ended up undoing it now - casting back and forth between the enum values and int to do a loop ends up being uglier than just listing the things.

lib/FuzzMutate/RandomIRBuilder.cpp
64

Not quite, LoadInsts don't have pointer type (they load from a pointer, so you get the pointee type). Currently we only create sources that are some constant or a load from some pointer though, which is what I think you're getting at.

vsk added a comment.Aug 17 2017, 1:13 PM

This looks good to me!

include/llvm/FuzzMutate/RandomIRBuilder.h
54

Sorry, I hadn't fully parsed your second sentence there.

lib/FuzzMutate/IRMutator.cpp
80

Got it.

181

I see, it makes sense to bias towards not adding a source, then.

lib/FuzzMutate/RandomIRBuilder.cpp
64

Got it.

bogner accepted this revision.Aug 22 2017, 10:55 AM
This revision is now accepted and ready to land.Aug 22 2017, 10:55 AM
bogner closed this revision.Aug 22 2017, 10:56 AM

r311402