[RISCV] MC layer support for the load/store instructions of standard compress instruction set
ClosedPublic

Authored by shiva0217 on Mon, Nov 13, 11:22 PM.

Details

Summary
  1. Define SubtargetFeature and Predicate for C extension
  1. Add IsRV64 Predicate for RV64 only C extension instructions
  1. Support decode 16-bit instructions
  1. Support encode 16-bit instructions
  1. Add GPRC, SP,GPRNonX0 register class for C extension load/store instructions
  1. Add compress InstFormat enum format
  1. TSFlags in RISCVInstrFormats.td/RISCVInstrFormatsC.td have increase 1 bit
  1. Add CI CSS CL CS instruction format for C extension load/store instructions
  1. Add C extension load/store instructions
  1. Support decode GPRC,GPRNonX0 register class
  1. Identify/check UImm8Lsb00/UImm7Lsb00/UImm8Lsb000/UImm9Lsb000 operands by RISCVAsmParser
  1. Add imply SP operand while decoding CLWSP/CLWLP/CLDSP/CSDSP
  1. Add rv32c-valid.s, rv32c-invalid.s, rv64c-valid.s, rv64c-invalid.s MC testcases

Diff Detail

Repository
rL LLVM
There are a very large number of changes, so older changes are hidden. Show Older Changes
simoncook added inline comments.Tue, Nov 14, 1:49 AM
lib/Target/RISCV/RISCV.td
29 ↗(On Diff #122779)

Should this be "'C' (Compressed Instructions)", and presumably aligned similarly with FeatureStdExtA?

test/MC/RISCV/compressed-invalid.s
1 ↗(On Diff #122779)

This test is missing the triple to run the assembler for. I would expect a "-triple=riscv32" or "-triple=riscv64" here.

test/MC/RISCV/compressed-valid.s
1 ↗(On Diff #122779)

This test is missing the triple to run the assembler for. I would expect a "-triple=riscv32" or "-triple=riscv64" here.

This revision now requires changes to proceed.Tue, Nov 14, 1:49 AM
asb added a comment.Tue, Nov 14, 12:03 PM

Thanks Shiva. A whole bunch of inline comments are below. Thanks to Simon for spotting the missing -triple in the tests, I was struggling to see why I couldn't get the tests to pass.

compressed-valid.s and compressed-invalid.s should be named rv32c-valid.s and rv32c-invalid.s to match the naming conventions of the existing tests. You can also add check lines for a -riscv64 triple. RV64C specific instructions can go in rv64c-valid.s

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
212 ↗(On Diff #122779)

These methods are sorted by bit-width of the operand, so should be in the order isUImm5, isUImm6Lsb00, isUImm8Lsb00.

424 ↗(On Diff #122779)

Should ideally be ordered like the methods.

[InvalidFenceArg doesn't match the method order here, I should fix that next time that piece of code gets touched].

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
87 ↗(On Diff #122779)

Shouldn't you be checking RegNo > 8?

172 ↗(On Diff #122779)

The previous code was using support::endian::read32le(Bytes.data()). Can't you use support::endian::read16le(Bytes.data())) for C instructions and avoid adding the extra helper functions?

lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
32 ↗(On Diff #122779)

I've tried to keep the instruction formats matching those in the ISA manual. I'd be inclined to use more bits of TSFlags and have the instruction format names match those in Table 12.1 in the ISA manual.

lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
86 ↗(On Diff #122779)

TmpInst seems unnecessary.

94 ↗(On Diff #122779)

Replace the assert a couple of lines above with an llvm_unreachable here?

lib/Target/RISCV/RISCVInstrFormats.td
41 ↗(On Diff #122779)

Same comment as before regarding InstFormats

lib/Target/RISCV/RISCVInstrFormatsC.td
15 ↗(On Diff #122779)

Indentation

31 ↗(On Diff #122779)

I'd prefer using opcodestr+argstr like the formats in RISCVInstrFormats.td. I found this reduced the amount of string manipulation needed overall.

lib/Target/RISCV/RISCVInstrInfo.cpp
55 ↗(On Diff #122779)

These RISCVInstrInfo.cpp changes should go in a later codegen patch, as they're not necessary for MC layer support

lib/Target/RISCV/RISCVInstrInfoC.td
60 ↗(On Diff #122779)

I'd prefer to use headers similarly to RISCVInstrInfo.td. i.e. for "Instruction class templates" and "Instructions" - we can always change later if we want, keeping things consistent across all the .td files.

67 ↗(On Diff #122779)

Maybe I'm missing something, but can't the CL class take care of placing these immediate bits where appropriate (same for the instructions below)

shiva0217 edited the summary of this revision. (Show Details)

Thanks, Simon, I adding the missing -triple argument in testcases and fix the typo.
Thanks, Alex, I rename the tesecase to rv32/64c-valid/invalid.s and reorganize the code as you suggest.

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
212 ↗(On Diff #122779)

Sorting the method as @asb suggest.

424 ↗(On Diff #122779)

Sorting the method as @asb suggest.
If the order still not as you expect, please feel free to reorder the method in next patch.

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
87 ↗(On Diff #122779)

Yes, should check >8.
Thanks.

172 ↗(On Diff #122779)

Using support::endian::read32le(Bytes.data()) and support::endian::read16le(Bytes.data())) to avoid adding extra helper functions.

lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
32 ↗(On Diff #122779)

Adding all compress instruction InstFormat and increase TSFlags 1 bit.

lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
86 ↗(On Diff #122779)

remove redundant TmpInst.

94 ↗(On Diff #122779)

Adding llvm_unreachable in default which is the case that instruction length, not 2 or 4.

lib/Target/RISCV/RISCV.td
29 ↗(On Diff #122779)

Yes, Thanks, fix the typo here.

lib/Target/RISCV/RISCVInstrFormats.td
41 ↗(On Diff #122779)

Add all compress instruction InstFormats

lib/Target/RISCV/RISCVInstrFormatsC.td
15 ↗(On Diff #122779)

Fix the Indentation

31 ↗(On Diff #122779)

Using opcodestr+argstr like the formats in RISCVInstrFormats.td.

lib/Target/RISCV/RISCVInstrInfo.cpp
55 ↗(On Diff #122779)

Remove the code which will add later in C codegen support patch.

lib/Target/RISCV/RISCVInstrInfoC.td
60 ↗(On Diff #122779)

Remove other header names.
Using the same naming "Instructions" as in RISCVInstrInfo.td

67 ↗(On Diff #122779)

The reason we put immediate encode in CLW is that CLD use the same encoding format(CL) except the immediate encoding is different.
To reuse the format class, we extract the immediate encoding in each instruction.

test/MC/RISCV/compressed-invalid.s
1 ↗(On Diff #122779)

Adding missing -triple in testcases.

test/MC/RISCV/compressed-valid.s
1 ↗(On Diff #122779)

Adding missing -triple in testcases.

Thanks, I found one nitpick with a couple of the tests, but the rest looks good to me.

test/MC/RISCV/rv32c-valid.s
1 ↗(On Diff #122980)

Should this be < %s instead to be consistent with the other calls to llvm-mc below?

test/MC/RISCV/rv64c-valid.s
1 ↗(On Diff #122980)

Likewise with rv32c-valid.s

shiva0217 updated this revision to Diff 123112.Wed, Nov 15, 5:32 PM
shiva0217 added inline comments.Wed, Nov 15, 5:35 PM
test/MC/RISCV/rv32c-valid.s
1 ↗(On Diff #122980)

Make the call llvm-mc command in test case consistent as Simon suggest.

test/MC/RISCV/rv64c-valid.s
1 ↗(On Diff #122980)

Make the call llvm-mc command in test case consistent as Simon suggest.

mgrang added a subscriber: mgrang.Fri, Nov 17, 10:55 AM
mgrang added inline comments.
lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
96 ↗(On Diff #123112)

Period after comment.

lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
87 ↗(On Diff #123112)

Period after comment.

98 ↗(On Diff #123112)

Can you order the cases in the ascending order of Size?

Shiva, next time you update this patch (and the other patches), can you please generate the patch with more context?
git diff HEAD~1 HEAD -U9999999999
See https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
87 ↗(On Diff #123112)

no need to add { }

149 ↗(On Diff #123112)

if bit 0 and 1 are 1.

shiva0217 updated this revision to Diff 123533.Sun, Nov 19, 9:53 PM
shiva0217 added a subscriber: llvm-commits.

Hi @mgrang, @apazos

Thanks for the comments.
I update the patch using "-U 9999999999" and also using clang-format to check indentation.

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
87 ↗(On Diff #123112)

remove unnecessary {}.

96 ↗(On Diff #123112)

Add period after comment.

149 ↗(On Diff #123112)

Fixup comments.

lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
87 ↗(On Diff #123112)

Add period after comment.

98 ↗(On Diff #123112)

Order the case by ascending order as @mgrang suggest.

asb added a comment.Tue, Nov 21, 5:37 AM

Hi Shiva, now I've got this building on top of head I see I was incorrect to suggest removing the RISCVInstrInfo.cpp changes, as you get assertions in some of the tests if you don't. I haven't done much work with cases like this where you have multiple register classes with overlapping contents but I _think_ the most correct fix to storeRegToStackSlot and loadRegtoStackSlot is to use if (RISCV::GPRRegClass.hasSubClassEq(RC)) rather than if (RC == &RISCV::GPRRegClasss). The original code probably should have been written using this in the first place.

asb added a comment.Tue, Nov 21, 6:36 AM

Hi Shiva, I think we're almost there. Primarily naming and style nits below. The most important issue to resolve is the implied SP operand I think.

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
97 ↗(On Diff #123533)

Is there a reason we can't use let Uses = [X2] and/or let Defs = [X2] for instructions that implicitly read/write the stack pointer register?

lib/Target/RISCV/RISCVInstrFormatsC.td
37 ↗(On Diff #123533)

I started off by using very short class names in RISCVInforFormats.td, but found it's ultimately easy to confuse a class name with an instruction. RVInstCI or RVInst16CI and so on would probably be preferable.

lib/Target/RISCV/RISCVInstrInfoC.td
67 ↗(On Diff #122779)

I see. In that case, it would make sense to add a comment to the CL and CS classes that explains subclasses must set the immediate bits. An alternative could be to define different classes for these variants as is done with RVInstIShift and RVInstIShiftW.

16 ↗(On Diff #123533)

"where the least significant two bits are zero." (same fix for other operand types defined below).

53 ↗(On Diff #123533)

StackLoad_ri or `CStackLoad would match the naming convention used in RISCVInstrInfo.td and others. Same for the other classes. See also my query regarding the sp operand and whether we can use Defs and Uses instead.

80 ↗(On Diff #123533)

It would match the other RISCVInstrInfo*.td files use let Predicates = [HasStdExtC] in a way similar to a C++ namespace.

80 ↗(On Diff #123533)

Just one space is needed after the instruction name and before :

86 ↗(On Diff #123533)

I try to keep the MC layer instruction definitions as close to the ISA manual as possible, ordering based on the the quickref table on page 104 of the spec. It might be nice to order based on the tables in section 12.7 "RVC Instruction Set Listings" (page 82 and 83). There's probably less benefit in doing so than for the RV32IMAFD extensions though, so it's up to you. Generally, it's nice to be able to see at a glance that every instruction in that listing is represented.

test/MC/RISCV/rv32c-invalid.s
10 ↗(On Diff #123533)

It would be good to check values that are just outside the valid range.

test/MC/RISCV/rv32c-valid.s
3 ↗(On Diff #123533)

There's no need to explicitly state +64bit in this or the other tests.

13 ↗(On Diff #123533)

It would be good to check the maximum and minimum immediate values (same with other *-valid.s tests).

shiva0217 set the repository for this revision to rL LLVM.

Hi Alex, if (RISCV::GPRRegClass.hasSubClassEq(RC)) in storeRegToStackSlot and loadRegtoStackSlot looks like a good a solution to me. About imply SP I have explained in the inline comments, let me know if you have other thought about it. Thanks for your comments, I really appreciate that.

lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
97 ↗(On Diff #123533)

The reason is that InstPrinter has to print SP operand "c.lwsp ra, 12(sp)".
We could print sp as string and remove it in input operand list rather than add imply SP,
but the parser will parse sp as input operand and we'll need to add other logic to deal with it.
If we add imply SP operand while disassembling, then printer and parser could work fine without extra code.

lib/Target/RISCV/RISCVInstrFormatsC.td
37 ↗(On Diff #123533)

Rename the instruction formats for compress instructions with "RVInst16" prefix.

lib/Target/RISCV/RISCVInstrInfoC.td
67 ↗(On Diff #122779)

Adding comments to CL and CS classes that subclasses must set the immediate bits.

16 ↗(On Diff #123533)

Fixup comments.

53 ↗(On Diff #123533)

Rename the class to CStackLoad to match the naming rule in RISCVInstrInfo.td.

80 ↗(On Diff #123533)

Using let Predicates = [HasStdExtC] instead of using require.

86 ↗(On Diff #123533)

OK, I'll list the instructions base on the table in section 12.7 "RVC Instruction Set Listings" (page 82 and 83). Let me know if I still misorder something.

test/MC/RISCV/rv32c-invalid.s
10 ↗(On Diff #123533)

Testing the invalid value just outside the valid range.

test/MC/RISCV/rv32c-valid.s
3 ↗(On Diff #123533)

Remove redundant +64bit argument in the testcase.

13 ↗(On Diff #123533)

check the maximum and minimum immediate values in -valids tests.

asb added a comment.Thu, Nov 23, 1:55 AM

Many many thanks for the patch update Shiva, this is looking pretty good. I have a concern in the comment about the immediate encoding for C.LWSP.

Shouldn't rv32c-invalid.s also check for cases like c.lwsp with rd as x0?

Thanks for clarifying the reasoning behind implySP - I can see now why you implemented things that way. Let me think a little more about it.

lib/Target/RISCV/RISCVInstrFormatsC.td
52 ↗(On Diff #124038)

Perhaps phrase as something like "The immediate value encoding differs for each instruction, so each subclass is responsible for setting the appropriate bits in the Inst field." In fact, it would be even better to document the bits that must be set, e.g. Inst{12-10}.

lib/Target/RISCV/RISCVInstrInfoC.td
113 ↗(On Diff #124038)

What about Inst{12}? I think we need more tests in rv32c-valid.s that verify the immediate encoding is correct.

shiva0217 edited the summary of this revision. (Show Details)

Hi Alex, yes, the test case for operand should not x0 is missing. New revision updates the test case and adds GPRNonX0 register class for it.

lib/Target/RISCV/RISCVInstrFormatsC.td
52 ↗(On Diff #124038)

Update the comment and specify the bits need to be set for each instruction.

lib/Target/RISCV/RISCVInstrInfoC.td
113 ↗(On Diff #124038)

The Inst{12} is set by RVInst16CI due to the bit encoding is the same for each instruction in CI format.

asb accepted this revision.Thu, Dec 7, 4:50 AM

Thanks Shiva. I think we may revisit the "implied" SP operands later, but I'll go ahead and commit as-is.

This revision was automatically updated to reflect the committed changes.