This is an archive of the discontinued LLVM Phabricator instance.

llvm-mc-fuzzer: A fuzzing tool for the MC layer.
ClosedPublic

Authored by dsanders on Sep 9 2015, 3:59 AM.

Details

Summary

Only the disassembler is supported in this patch but it has already found a few
issues in the Mips disassembler (mostly invalid instructions being successfully
disassembled).

Diff Detail

Event Timeline

dsanders updated this revision to Diff 34315.Sep 9 2015, 3:59 AM
dsanders retitled this revision from to llvm-mc-fuzzer: A fuzzing tool for the MC layer..
dsanders updated this object.
dsanders added a subscriber: llvm-commits.

The recent threads on libFuzzer inspired me to try it out and llvm-mc-fuzzer is
the result. It seems to be useful and has already found a few issues with the
Mips disassembler, so I thought I'd post the patch to see if there's interest
in bringing this upstream.

I forgot to mention: I also have some python scripts to convert our disassembler test inputs to the raw binary format used by libFuzzer and back. I haven't included those in this patch since they are just quick hacks. They'll need implementing properly to be suitable for upstreaming.

kcc added a subscriber: kcc.Sep 9 2015, 9:52 AM

Nice!

Could you please also add a line about your findings
to http://llvm.org/docs/LibFuzzer.html#trophies
(or tell me what to add there) ?

Once committed, I'll add it to the fuzzer bot:
lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer

tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
23

please use a C++ constant

63

why not vector?

82

Since the command line is unusual (compared to other uses of libFuzzer)
please add a comment with usage example(s)

104

Do you really need to copy these here?
Why not just pass I.c_str()?

110

Do you need it to be this complex?
For assembling you can write a separate target binary

dsanders marked 3 inline comments as done.Sep 10 2015, 2:30 AM
dsanders added inline comments.
tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
63

I didn't give it much thought. I'll switch it to a vector.

104

Unfortunately, yes but there is an alternate solution. The problem is that c_str() returns a const char * but fuzzer::FuzzerDriver() expects an array of char *. It's unsafe to just drop the const-ness so I make a non-const copy.

If fuzzer::FuzzerDriver's second was const char ** then I could avoid the copy.

110

My thinking was that it would be nice if llvm-mc-fuzzer had a similar command line to llvm-mc. If it's preferred to use a separate binary then I don't mind doing that.

dsanders updated this revision to Diff 34421.Sep 10 2015, 2:32 AM
dsanders marked an inline comment as done.

Replaced macro with C++ constant.
Added trophy to libFuzzer documentation.
Changed DataCopy to a std::vector.
Added usage examples.

kcc added inline comments.Sep 10 2015, 10:01 AM
docs/LibFuzzer.rst
452–459

Links maybe?

tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
99

typeo: contrain

105

I've just added more interface variants:
Can you try this one?

int FuzzerDriver(const std::vector<std::string> &Args, UserCallback Callback);

111

Ok, makes sense

silvas added a subscriber: silvas.Sep 10 2015, 11:34 AM

Nice! I know Russell had been looking at using fuzz-testing to test
round-tripping through assembly, which seems like a perfect fit for a
libFuzzer-based tool. Russell, is this something that you are still working
on? Maybe llvm-mc-fuzzer will grow that functionality some day.

  • Sean Silva

I found a number of bugs comparing direct object emission and via assembly with the check_cfc tool. Sean I and spoke about using fuzzing in that area. I expect it would find issues but I haven't done any work on it so no objections if someone wanted to add this to llvm-mc-fuzzer.

dsanders updated this revision to Diff 34662.Sep 14 2015, 2:43 AM

Added links to libFuzzer.rst and better explained it.
Fixed contrain -> constrain.
Switched to the new FuzzerDriver() interface.

dsanders marked 9 inline comments as done.Sep 14 2015, 2:56 AM

Nice! I know Russell had been looking at using fuzz-testing to test
round-tripping through assembly, which seems like a perfect fit for a
libFuzzer-based tool. Russell, is this something that you are still working
on? Maybe llvm-mc-fuzzer will grow that functionality some day.

  • Sean Silva

I'm keen to get that functionality too. Mips's move instructions will be a bit troublesome here since many distinct opcodes disassemble to 'move $1, $2' but that string only assembles to a single opcode.

One feature that would be helpful from the Fuzzer is the ability for the callback to be able to classify inputs into various bins. For example, "this input is invalid", "this input disassembled but failed to complete the round trip", "this input completed a round trip but the encodings don't match", etc. At the moment, we need to determine this when converting inputs into test cases which seems redundant when the callback already knew what happened.

tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
105

That worked nicely. Thanks

kcc added a comment.Sep 14 2015, 9:20 AM

One feature that would be helpful from the Fuzzer is the ability for the callback to be able to classify inputs into various bins. For example, "this input is invalid", "this input disassembled but failed to complete the round trip", "this input completed a round trip but the encodings don't match", etc. At the moment, we need to determine this when converting inputs into test cases which seems redundant when the callback already knew what happened.

Yes, I've seen similar requests already.
goFuzz does it this way:

The function must return 1 if the input is interesting in some way (for example, it was parsed successfully, that is, it is lexically correct, go-fuzz will give more priority to such inputs); -1 if the input must not be added to corpus even if gives new coverage; and 0 otherwise; other values are reserved for future use.

So, I'll probably add some similar functionality (probably not in the nearest two weeks though).

kcc accepted this revision.Sep 14 2015, 9:23 AM
kcc added a reviewer: kcc.

LGTM, thanks for doing this!

tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
127

I would just call fuzzer::FuzzerDriver and not create the "TestOneInput" temporary.
But feel free to ignore this comment.

This revision is now accepted and ready to land.Sep 14 2015, 9:23 AM
dsanders closed this revision.Sep 16 2015, 4:51 AM
dsanders marked an inline comment as done.