Implement c.lui immediate constraint to [1, 31] and [0xfffe0, 0xfffff].
[0xfffe0, 0xfffff] eventually will encode as [32, 63] as ISA describe.
The transformation from lui to c.lui could check the immediate range of c.lui directly.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Thanks Shiva. I've left comments inline regarding the naming convention for this operand type, and the printing policy. As I say, I think we should let c.lui's operand print in the same way as lui for now. Consistently printing hex for lui and c.lui (as I think gnu objdump does) probably makes sense, but lets do that in a separate patch and do it for both instructions at the same time.
lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | ||
---|---|---|
276 | Given that 'UImm6' is misleading I'd be tempted to just call this and the other related methods`isCLUIImm`. If preferring a isUImm naming scheme, isUImm20CLUI would actually be most consistent with the naming convention used for other asm operands, but that is also confusing. c.lui is a weird special case, so lets give it a name that reflects that. | |
lib/Target/RISCV/InstPrinter/RISCVInstPrinter.cpp | ||
92 ↗ | (On Diff #132541) | I don't think that the printing policy for c.lui should differ from lui. i.e. if we print it as '1048544' for lui then we should do the same for c.lui. We might later want to change the cases where hex is printed rather than decimal, but that would be better in a separate patch and applied consistently. |
lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp | ||
135 | Drop unnecessary parens: ((Res >= 0xfffe0 && Res <= 0xfffff)) -> (Res >= 0xfffe0 && Res <= 0xfffff) |
lib/Target/RISCV/RISCVInstrInfoC.td | ||
---|---|---|
52 | Thanks for the patch Shiva, I will launch a run to confirm the issue is resolved. I am confused by the RISCV GCC comment, we are not doing this for compatibility with GCC, it is following the ISA right, though the ISA description for c.lui is not very clear. For example, c.srli better describes the accepted range. Shouldn't the ISA document be updated? |
Ana, can you share the encoded and decoded instruction which causes the qemu crash? With this patch it should be possible to generate, the same set of possible c.lui encodings as before (all legal), it's just the mapping from logical immediate operand -> encoded form that changes. i.e. before and after this patch, my expectation is it is _impossible_ to generate a c.lui that cases an illegal instruction assertion. Could well be a qemu bug of course.
I don't have time to check the git version of the spec, but it seems the error in 2.1 is listing c.lui immediate as nzuimm in the long-form definition rather than nzimm.
Hi Ana. The ISA probably not describe very straightforward. I update the comments try to illustrate more detail. Could you help me to check is the updated comments OK?
Hi Alex, I think you are right, the crashes were due to c.addi and c.addi16sp, which Shiva fixed in separate patch. I tested both patches together tough.
I think what broke was the assembler step when we were hand-modifying the test case to run on qemu to find out the offending instructions.
GCC assembler rejected:
echo 'c.lui x8, 32' |riscv32-unknown-linux-gnu-as -o sai -c -mabi=ilp32 -march=rv32gc
{standard input}: Assembler messages:
{standard input}:1: Error: illegal operands `c.lui x8,32'
while LLVM is accepting it:
echo 'c.lui x8, 32' |llvm-mc -triple=riscv32 -mattr=+c --show-encoding
.text c.lui s0, 32 # encoding: [0x01,0x74]
which is fixed with this patch:
echo 'c.lui x8, 32' |llvm-mc -triple=riscv32 -mattr=+c --show-encoding
.text
<stdin>:1:11: error: immediate must be in [0xfffe0, 0xfffff] or [1, 31]
c.lui x8, 32
LGTM.
lib/Target/RISCV/RISCVInstrInfoC.td | ||
---|---|---|
52 | just some typo (checks, describes) and missing end point '.'. |
Only remaining concern:
RISCVMCCodeEmitter::getCLUIImmOpValue doesn't seem to be necessary - or at least, modifying c_lui_imm so it doesn't use a custom EncoderMethod still results in all tests passing. From my understanding of the code, I think the custom getCLUImmOpValue isn't necessary and can be removed. If I'm missing something and it is necessary, then the tests should be updated so they fail without it.
Hi Alex. You're right. RISCVGenMCCodeEmitter.inc generated by tablegen will create mask according to the bit-field width defined in *.td. Upper bit's will filter out by the mask, so we don't have to define custom encoding method for c.lui. Thanks!
Looks good to me now. I had a go at rewording the c_lui_imm comment - feel free to take from that suggestion if you think it's any clearer.
lib/Target/RISCV/RISCVInstrInfoC.td | ||
---|---|---|
51–54 | I had a go at rewording - does this seem any clearer to you? It's hard to explain concisely. c_lui_imm checks the immediate range is in [1, 31] or [0xfffe0, 0xfffff]. The RISC-V ISA describes the constraint as [1, 63], with that value being loaded in to bits 17-12 of the destination register and sign extended from bit 17. Therefore, this 6-bit immediate can represent values in the ranges [1, 31] and [0xfffe0, 0xfffff]. |
Given that 'UImm6' is misleading I'd be tempted to just call this and the other related methods`isCLUIImm`. If preferring a isUImm naming scheme, isUImm20CLUI would actually be most consistent with the naming convention used for other asm operands, but that is also confusing. c.lui is a weird special case, so lets give it a name that reflects that.