This is an archive of the discontinued LLVM Phabricator instance.

[LLDB][RISCV][NFC] Rewrite instruction in algebraic datatype
ClosedPublic

Authored by Emmmer on Oct 1 2022, 7:59 AM.

Details

Summary

The old approach (dedicated ExecXXX for each instruction) is not flexible and results in duplicated code when RVC kicks in.

According to the spec, every compressed instruction can be decoded to a non-compressed one. So we can lower compressed instructions to instructions we already had, which requires a decoupling between the decoder and executor.

This patch:

  • use llvm::Optional and its combinators AMAP.
  • use template constraints on common instruction.
  • make instructions strongly-typed (no uint32_t everywhere bc it is error-prone and burdens the developer when lowering the RVC) with the help of algebraic datatype (std::variant).

Note:
(NFC) because this is more of a refactoring in preparation for RVC.

Diff Detail

Event Timeline

Emmmer created this revision.Oct 1 2022, 7:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2022, 7:59 AM
Emmmer requested review of this revision.Oct 1 2022, 7:59 AM

I'm not sure that we can't use variant just yet. Quoting from the last time I checked:

Variant is supported from:
libstdc++ 7.1
VS 2017 15.0
libcxx 4.0

So far so good relative to https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library.

https://en.cppreference.com/w/cpp/compiler_support/17#C.2B.2B17_library_features claims it is in apple clang 10.0.0 so unfortunately we're not quite there as the minimum is 9.3.

That said I do see variant in dotted around in mlir and one ADT header, so perhaps it's fine to use now?

So we can lower compressed instructions to instructions we already had...

We a decompiler now :). This is a very cool way to go.

(but I get that this patch doesn't add that just yet)

Overall this looks good, bit of a gnarly diff but just reading the new code its fairly clear. Please get some confirmation that std::variant is actually allowed (asking on discourse is a good idea).

Yes we switched to c++17 for the language mode but library features are still as provided by the minimum toolchain versions.

I hope it is allowed though I think this is a good use of it (and transform I knew it existed but first time seeing code use it).

lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
36

Mark this static.

Also please document the behaviour in comments. For instance, does this stop at the first optional that is empty or does the whole thing return empty optional if one of them is empty?

106

Do you need this-> here? Write is a method of Rd.

111

You can construct directly with explicit RegisterValue(uint64_t inst) if you prefer. But there is some merit in using SetUInt64 to be clear what you intend so up to you.

114

newline after this closing }

118

I think I asked this before but can this just do return int32_t(value) ?

And something tells me the answer was yes you could , but it's undefined behaviour.

171–172

What is amo?

307

You could const this just to prevent future mishaps using the wrong pc variable.

488

m_ prefix for class members

1062–1067

Small thing, you could do m_addr = addr.value_or(LLDB_INVALID_ADDRESS); then do the if (!addr).

lldb/source/Plugins/Instruction/RISCV/RISCVInstructions.h
25

Could all the emulator params for this and Rs be const &?

37

You could save some lines by making a macro per layout, if that makes sense.

#define INST_RD_IMM(name) ...
<...>
#undef INST_RD_IMM

Assuming you only have 7/8 different formats to handle. Can always do the rare ones without.

Emmmer updated this revision to Diff 465017.Oct 4 2022, 8:04 AM
Emmmer marked 9 inline comments as done.

address review comments.

Emmmer added a comment.Oct 4 2022, 8:09 AM

Overall this looks good, bit of a gnarly diff but just reading the new code its fairly clear. Please get some confirmation that std::variant is actually allowed (asking on discourse is a good idea).

Yes we switched to c++17 for the language mode but library features are still as provided by the minimum toolchain versions.

I hope it is allowed though I think this is a good use of it (and transform I knew it existed but first time seeing code use it).

I see std::variant used in flang, TableGen and some CodeGen passes. So we can safely choose std::variant in the new code.

lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
171–172

The spec uses amo as a short for atomic.

lldb/source/Plugins/Instruction/RISCV/RISCVInstructions.h
25

Sadly no, the EmulateInstruction::ReadRegister does not have a const qualifier on this

DavidSpickett accepted this revision.Oct 4 2022, 8:17 AM

LGTM

lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
171–172

That is...an interesting choice. :)

But ok this is riscv code let's stick with their names.

This revision is now accepted and ready to land.Oct 4 2022, 8:17 AM
jrtc27 added inline comments.Oct 4 2022, 8:18 AM
lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
171–172

"atomic memory operation"

DavidSpickett added inline comments.Oct 4 2022, 8:22 AM
lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
171–172

Ok now I get it, that makes much more sense. I was thinking it was "A-to-M-ic-Operation" which would be quite the stretch.