This patch fixes a crash when doing "llvm-objdump -D --mattr=+experimental-v"
against an object file which happens to keep a word that can be decoded to
VSETVLI & VSETIVLI with reserved vlmul[2:0]=4. All vtype values with
reserved fields (vlmul[2:0]=4, vsew[2:0]=0b1xx, non-zero bits 8/9/10) are
printed to raw immediate.
Details
Diff Detail
Event Timeline
llvm/test/tools/llvm-objdump/RISCV/vsetvli.test | ||
---|---|---|
1 ↗ | (On Diff #389730) | I'm not at all familiar with what's going on in the disassembly, so can't review the rest of this patch, but could you use llvm-mc to create the test input file at runtime, rather than checking in an opaque binary? Also, I recommend: |
llvm/test/tools/llvm-objdump/RISCV/vsetvli.test | ||
---|---|---|
1 ↗ | (On Diff #389730) |
|
llvm/test/tools/llvm-objdump/RISCV/vsetvli.s | ||
---|---|---|
5–7 ↗ | (On Diff #389748) | Do you need to test the addresses/raw instruction bytes for this test to be valid? Also, can you use CHECK-NEXT for the second and third lines here? |
llvm/test/tools/llvm-objdump/RISCV/vsetvli.s | ||
---|---|---|
4–6 ↗ | (On Diff #389767) | Unless you specify --match-full-lines to FileCheck, there's an implicit {{.*}} at the start and end of the line, so you just omit that regex pattern. |
Thanks, test looks fine. I'll leave it to someone who understands the rest of the code to review that though.
llvm/test/tools/llvm-objdump/RISCV/vsetvli.s | ||
---|---|---|
1 ↗ | (On Diff #389795) | You don't need a temporary file |
5 ↗ | (On Diff #389795) | Is this really the right way to disassemble this? I'd expect it to just be disassembled like any other illegal instruction, not as a real instruction with some arbitrary hard-coded human-readable string injected in the middle of it. For all we know the encoding could be used for something completely different in future. |
8 ↗ | (On Diff #389795) | This section directive is not necessary |
llvm/test/tools/llvm-objdump/RISCV/vsetvli.s | ||
---|---|---|
5 ↗ | (On Diff #389795) | So I will decode it to <unknown>, which I think it is better than crash. |
llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp | ||
---|---|---|
427 | I don't think this is the right way to do this. I think the right way is to change the DecoderMethod for VTypeIOp to a new function instead of decodeUImmOperand<11>. The new decoder function can return MCDisassembler::Fail; |
Looking at the vector spec, what about SEW? All of 100-111 are currently reserved. And why's vtypei 11 bits; isn't it 9 in the spec (12 I could understand if zero-extending to fill the I-type value, but 11 is odd)? Maybe I'm missing something, just glancing through.
llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp | ||
---|---|---|
280–281 | I'd just stop at "is reserved", the latter part of the sentence applies to any reserved encoding so doesn't need special mention, just that it's LMUL=4 (technically vlmul[2:0]; LMUL is the name for the decoded value, vlmul is the actual encoded thing) we're checking for here. |
vsetvli uses 11 bits zimm[10:0] (bits 30:20 of the instruction), vsetivli uses 10 bits zimm[9:0] (bits 29:20 of the instruction). vsetvli is the older instruction and uses more bits. I assume that's where the 11 came from.
llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp | ||
---|---|---|
280–281 | Agree this should mention vlmul not LMUL. I should have caught that. |
Ok. What do the other bit(s) mean? Are they ignored or treated as invalid (until such time as they are given meaning)? I can't obviously see a specification for how the bits are actually encoded in zimm[10:0], just "The new vtype setting is encoded in the immediate fields of vsetvli and vsetivli", which I guess just means it's the low bits of vtype, including bits 8 through 9/10 which are defined to be 0 and so should be checked?
The Spike implementation seems to be that all immediates are legal instructions, but vill gets set if vlmul, vsew or bits 8 and up are invalid. So god knows if we should in fact be disassembling it as vsetvli somehow after all, but if so then the output it gives needs to be valid input, and there is no assembly defined in the vector spec that would give rise to such encodings. This is what you get for not having a complete spec, and not having any Sail to act as a stand in for complete prose...
One option would be to define an alternative non-canonical (except for illegal values) vsetvli rd, rs1, 0xabc syntax, much like we do for CSRs (but they don't have the weirdness of lmul/sew/ta/ma syntax).
In fact, that seems to be exactly what binutils does, from a quick grep of vsetvli in the codebase (see gas/testsuite/gas/riscv/vector-insns.[sd] for their test that includes testing precisely this issue).
I think there's something funny about the objdump behavior. The disassembly only prints the immediate for 0x400-0x7ff. Not for 0x100-0x3ff which also have extra bits.
I've notified some of SiFive's binutils developers about the issues. Including that vlmul=0b100 causes objdump to print a regular vsetvli with (null) where the lmul string should go. They will be printing the immediate form.
So I think the LLVM disassembler should decode any immediate value. The RISCVInstPrinter should print the immediate form if any extra bits or reserved encodings are used. The assembly parser should be updated to accept the immediate form if it doesn't already.
So how about I do "RISCVInstPrinter should print the immediate form" in current patch? And I will make another patch for "assembly parser should be updated to accept the immediate form" ?
Done. I have uploaded the patch about RISCVInstPrinter should print the immediate form, you are appreciated to have a check.
Do you plan to address all of the other reserved encodings in the printer too? Bits 8, 9, or 10 being non-zero. vsew=0b1xx. They can be other patches I would just like to know if I need to find someone to do it.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp | ||
---|---|---|
174 ↗ | (On Diff #390559) | Use RISCVVType::getVLMUL and check for VLMUL::LMUL_RESERVED |
I have handled all reserved cases in current patch, along with corresponding tests for each. Thanks for your suggestion.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp | ||
---|---|---|
176 ↗ | (On Diff #390599) | Bit 10 doesn't exist in the immediate for vsetivli, though presumably it's always zero extended due to being a uimm so it doesn't matter. I'd have done (Imm >> 8) != 0 instead though as it avoids needing to care about exactly how many bits to check, and matches the assert in printFenceArg. This does also make the use of decodeUImmOperand<11> for vsetivli's operand a bit odd, but it does work given the N is just for an assert. When adding the immediate integer parsing support though you will discover that VTypeIAsmOperand, and thus VTypeIOp, need splitting so that you can ban 11-bit immediate arguments to vsetivli. |
And not sure why you committed this with "Reviewed By: jhenderson, jrtc27, craig.topper", Craig was the only one to approve it, and James somewhat by saying an older version of the test looked fine
I am not clear about the rule of the Reviewed By: list is only for who approve patches. Since you and Craig and James made comments, I thought you all have read (==reviewed) the code. I will pay attention to that next time.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp | ||
---|---|---|
176 ↗ | (On Diff #390599) |
Changed to Imm >> 8 in 622d6894801b19e795737e133f5963d248de45af |
I'd just stop at "is reserved", the latter part of the sentence applies to any reserved encoding so doesn't need special mention, just that it's LMUL=4 (technically vlmul[2:0]; LMUL is the name for the decoded value, vlmul is the actual encoded thing) we're checking for here.