This is an archive of the discontinued LLVM Phabricator instance.

Adding proto-fuzzer support for more extensions with new driver flow
AbandonedPublic

Authored by jocewei on Aug 1 2018, 3:35 PM.

Details

Reviewers
apazos
mgrang
asb
Summary
  • Uses Protobuf to represent RV32 General (G) Instruction Set with extensions I,M,A,F,D,C, with no constraints on either register or immediate operand values (that is, with no constraints on register class or immediate range). Also includes pseudoinstructions.
  • Updated fuzzer driver flow:

(1) to invoke fuzzed llvm-mc assembler, GNU assembler, and GNU disassembler.
(2) To use the status of calling each tool to categorize test failures.

  • Created single fuzzer executable that supports rv32imafdc.
  • Kept the example fuzzer with sample grammar and corresponding Python script.

Related: https://reviews.llvm.org/D49521

Diff Detail

Event Timeline

jocewei created this revision.Aug 1 2018, 3:35 PM
asb added a comment.Aug 2 2018, 8:31 AM

Hi Jocelyn. What should I be referring to as the base of this patch? e.g. D49521 doesn't contain proto_to_asm_rv32i.cpp?

I might be looking at the wrong version of the protobuf API or overlooking other issues here, but might it be possible to use FindValueByNumber to get hold of an EnumDescriptor and hence a string name for protobuf enums, getting rid of some of the boilerplate in proto_to_asm? https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor#EnumDescriptor.FindValueByNumber.details

In D50164#1185810, @asb wrote:

Hi Jocelyn. What should I be referring to as the base of this patch? e.g. D49521 doesn't contain proto_to_asm_rv32i.cpp?

This patch adds two new files: proto_to_asm_rv32i.cpp and rv32i_asm.proto. These files add to proto_to_asm.cpp and asm_proto.proto from the previous patch (https://reviews.llvm.org/D49521), which we chose to leave untouched so as to preserve the simpler example grammar that only contains the add and sub instructions.

I might be looking at the wrong version of the protobuf API or overlooking other issues here, but might it be possible to use FindValueByNumber to get hold of an EnumDescriptor and hence a string name for protobuf enums, getting rid of some of the boilerplate in proto_to_asm? https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor#EnumDescriptor.FindValueByNumber.details

In proto_to_asm, to handle the protobuf enums, we can use int values 0-31 instead of Register::X0 through Register::X31, or int values 0-9 instead of RTypeOpcode_Op_ADD, etc. But this means that if the order of items in the protobuf enum cannot change without us also modifying the corresponding proto_to_asm code. Based on what I read, I'm not sure if using EnumDescriptor / EnumValueDescriptor would be advantageous.

apazos added inline comments.Aug 2 2018, 5:12 PM
tools/llvm-mc-assemble-proto-fuzzer/proto-files/rv32i_asm.proto
113 ↗(On Diff #158653)

When you have the Load instructions ready, upload to this same patch so it has RV32I complete in this patch, except for the CSR instrs.

asb added a comment.Aug 3 2018, 4:14 AM
In D50164#1185810, @asb wrote:

Hi Jocelyn. What should I be referring to as the base of this patch? e.g. D49521 doesn't contain proto_to_asm_rv32i.cpp?

This patch adds two new files: proto_to_asm_rv32i.cpp and rv32i_asm.proto. These files add to proto_to_asm.cpp and asm_proto.proto from the previous patch (https://reviews.llvm.org/D49521), which we chose to leave untouched so as to preserve the simpler example grammar that only contains the add and sub instructions.

But the patch as upload modifies a few lines of proto_to_asm_rv32i.cpp rather than adding it from scratch, which is why I wondered if there's an intermediate patch or a problem with the diff generation.

I might be looking at the wrong version of the protobuf API or overlooking other issues here, but might it be possible to use FindValueByNumber to get hold of an EnumDescriptor and hence a string name for protobuf enums, getting rid of some of the boilerplate in proto_to_asm? https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor#EnumDescriptor.FindValueByNumber.details

In proto_to_asm, to handle the protobuf enums, we can use int values 0-31 instead of Register::X0 through Register::X31, or int values 0-9 instead of RTypeOpcode_Op_ADD, etc. But this means that if the order of items in the protobuf enum cannot change without us also modifying the corresponding proto_to_asm code. Based on what I read, I'm not sure if using EnumDescriptor / EnumValueDescriptor would be advantageous.

I was just thinking of getting rid of the manual enum -> string conversion. Looks like you'd need to convert the protobuf-generated string to lower-case though.

In D50164#1187184, @asb wrote:
In D50164#1185810, @asb wrote:

Hi Jocelyn. What should I be referring to as the base of this patch? e.g. D49521 doesn't contain proto_to_asm_rv32i.cpp?

This patch adds two new files: proto_to_asm_rv32i.cpp and rv32i_asm.proto. These files add to proto_to_asm.cpp and asm_proto.proto from the previous patch (https://reviews.llvm.org/D49521), which we chose to leave untouched so as to preserve the simpler example grammar that only contains the add and sub instructions.

But the patch as upload modifies a few lines of proto_to_asm_rv32i.cpp rather than adding it from scratch, which is why I wondered if there's an intermediate patch or a problem with the diff generation.

You're right, there was a missing patch, now added: https://reviews.llvm.org/D50262

I might be looking at the wrong version of the protobuf API or overlooking other issues here, but might it be possible to use FindValueByNumber to get hold of an EnumDescriptor and hence a string name for protobuf enums, getting rid of some of the boilerplate in proto_to_asm? https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor#EnumDescriptor.FindValueByNumber.details

In proto_to_asm, to handle the protobuf enums, we can use int values 0-31 instead of Register::X0 through Register::X31, or int values 0-9 instead of RTypeOpcode_Op_ADD, etc. But this means that if the order of items in the protobuf enum cannot change without us also modifying the corresponding proto_to_asm code. Based on what I read, I'm not sure if using EnumDescriptor / EnumValueDescriptor would be advantageous.

I was just thinking of getting rid of the manual enum -> string conversion. Looks like you'd need to convert the protobuf-generated string to lower-case though.

Okay, that makes sense. I'll try that, thanks!

jocewei updated this revision to Diff 159041.Aug 3 2018, 10:38 AM

Added Load instructions to fuzzer's Protobuf representation of the RV32I ISA (rv32i_asm.proto and proto_to_asm_rv32i.cpp)

apazos added a comment.Aug 4 2018, 8:15 PM

Jocelyn https://reviews.llvm.org/D50262 is unfinished work and should not be pushed, it is confusing the review.
You should update this patch with any missing files and with the latest version you have. Abandon https://reviews.llvm.org/D50262.

apazos added a comment.Aug 4 2018, 9:04 PM

I think Alex's suggestion is doable. We should try to simplify the printing of the instructions.

const EnumValueDescriptor * D = FindValueByNumber(X.name());
if (D) {

OS << D.name(); 
return OS;

}
assert(..);

jocewei retitled this revision from Added new driver flow for unconstrained fuzzer to Adding proto-fuzzer support for more extensions with new driver flow.Aug 6 2018, 2:52 PM
jocewei edited the summary of this revision. (Show Details)
jocewei updated this revision to Diff 159402.Aug 6 2018, 3:09 PM

Rebased patch and added fuzzer support for more extensions (m,c)

jocewei updated this revision to Diff 159833.Aug 8 2018, 5:30 PM
jocewei edited the summary of this revision. (Show Details)

Cleaned up proto_to_asm, simplified grammar to have fewer messages and be more easily extensible

jocewei updated this revision to Diff 160944.Aug 15 2018, 4:43 PM
jocewei edited the summary of this revision. (Show Details)

Added missing instructions to rv32i.proto and rv32c.proto

jocewei updated this revision to Diff 161546.Aug 20 2018, 12:54 PM
jocewei edited the summary of this revision. (Show Details)

Updated with pseudoinstructions added to the Protobuf

jocewei abandoned this revision.Aug 24 2018, 2:50 PM

Abandoning revision because tool has been moved to llvm/tools from clang/tools, as of this patch: https://reviews.llvm.org/D51144

tools/llvm-mc-assemble-proto-fuzzer/proto-to-asm/example_proto_to_asm.cpp