Decoding to MSRpstate should occur only if <op1><op2> represents a valid <pstatefield>, else the instruction should be decoded to a generic MSR.
Details
- Reviewers
t.p.northover jmolloy - Commits
- rG097adfb98cb6: [AArch64] Fix problems in decoding generic MSR instructions
rG182b05784a72: [TableGen] Improve decoding options for non-orthogonal instructions
rL242276: [AArch64] Fix problems in decoding generic MSR instructions
rL242274: [TableGen] Improve decoding options for non-orthogonal instructions
Diff Detail
- Repository
- rL LLVM
Event Timeline
Any feedback on this patch? I realize it is an ugly one but I would appreciate any hints how it could be done in a cleaner way.
Thanks,
Petr Pavlu
Adding Tim and James to review the patch.
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp | ||
---|---|---|
1506 ↗ | (On Diff #18747) | This seems like a gross exaggeration. Is it possible that the name would come invalid for any other reason? |
Thank you Renato for having a look at this patch.
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp | ||
---|---|---|
1506 ↗ | (On Diff #18747) | I am not sure if I am really answering your question but DecodeSystemPStateInstruction() is called only for MSRpstate and this instruction aliases with an extended MSR format (introduced in r218753). All cases not allowed by MSRpstate are valid MSR cases. Some more background on this code: MC instruction MSRpstate is defined to have: Inst{31-19} = 0b1101010100000; Inst{18-16} = pstatefield{5-3}; Inst{15-12} = 0b0100; Inst{11-8} = imm; Inst{7-5} = pstatefield{2-0}; Inst{4-0} = 0b11111 A problem here is that there does not seem to be a good way how to specify that only a subset of pstatefield is allowed. Disassembling a bitpattern that should be handled as an extended MSR (but which aliases MSRpstate): $ echo "0x1f 0x40 0x00 0xd5" | ./llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug Args: ./llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug .text 0: OPC_ExtractField(26, 3): 5 3: OPC_FilterValue(2, 1110): FAIL: continuing at 1117 1117: OPC_FilterValue(3, 28396): FAIL: continuing at 29517 29517: OPC_FilterValue(4, 447): FAIL: continuing at 29968 29968: OPC_FilterValue(5, 474): PASS: continuing at 29972 29972: OPC_ExtractField(29, 3): 6 29975: OPC_FilterValue(0, 4): FAIL: continuing at 29983 29983: OPC_FilterValue(1, 39): FAIL: continuing at 30026 30026: OPC_FilterValue(2, 17): FAIL: continuing at 30047 30047: OPC_FilterValue(4, 4): FAIL: continuing at 30055 30055: OPC_FilterValue(5, 39): FAIL: continuing at 30098 30098: OPC_FilterValue(6, 10539): PASS: continuing at 30102 30102: OPC_ExtractField(21, 5): 8 30105: OPC_FilterValue(0, 30): FAIL: continuing at 30139 30139: OPC_FilterValue(1, 10): FAIL: continuing at 30153 30153: OPC_FilterValue(2, 11): FAIL: continuing at 30168 30168: OPC_FilterValue(5, 30): FAIL: continuing at 30202 30202: OPC_FilterValue(8, 122): PASS: continuing at 30206 30206: OPC_ExtractField(0, 8): 31 30209: OPC_FilterValue(95, 11): FAIL: continuing at 30224 30224: OPC_FilterValue(159, 11): FAIL: continuing at 30240 30240: OPC_FilterValue(191, 11): FAIL: continuing at 30256 30256: OPC_FilterValue(223, 11): FAIL: continuing at 30272 30272: OPC_ExtractField(0, 5): 31 30275: OPC_FilterValue(31, 33): PASS: continuing at 30279 30279: OPC_ExtractField(12, 4): 4 30282: OPC_FilterValue(2, 11): FAIL: continuing at 30297 30297: OPC_FilterValue(4, 11): PASS: continuing at 30301 30301: OPC_CheckField(19, 1, 0, 5): FieldValue = 0, ExpectedValue = 0: PASS 30307: OPC_Decode: opcode 1124, using decoder 216 ----- DECODE SUCCESSFUL ----- <stdin>:1:1: warning: invalid instruction encoding 0x1f 0x40 0x00 0xd5 ^ My understanding of the generated disassembler is that it works in two phases. In the first phase (before "DECODE SUCCESSFUL") the disassembler matches all constant bits and determines an MC opcode. In the second phase it calls a decode method specific to the determined opcode. In this case it calls explicit decode method DecodeSystemPStateInstruction(), it sees that pstatefield is not one of SPSel, DAIFSet, DAIFClr and rejects the instruction. No programmatic limitation on field values (without changing the resulting opcode) in the first phase, or a jump from the second phase back to the first one to handle the instruction as an extended MSR (after it is rejected as MSRpstate) seems possible. I also considered an option to instantiate only the allowed MSR (immed) instructions (e.g. to have MSRpstateSPSel, MSRpstateDAIFSet, MSRpstateDAIFClr) but that does not look right because it creates a new MC opcode for each allowed pstatefield value. A similar problem exists in the ARM backend where the preload hints fit into holes in the encodings of the LDRH/STRH instructions. It seems to be handled similarly in the instruction decode method. |
@t.p.northover, @jmolloy, please have a look. I'm probably being overzealous, but I'd rather be safe than sorry.
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp | ||
---|---|---|
1506 ↗ | (On Diff #18747) | I understand why it's safe to do that in your case, but I worry that there would be other cases where an invalid name won't match in your pattern. What I'm saying is that I'd like to see a way of that function return Invalid for all the cases where it IS invalid, even with your extension. I don't know all the variations that function can cope with, but I'd be surprised if all validation on all fields was made before entering this function. You may get it right on your examples, but that doesn't mean there are other examples where that would get it wrong. Basically, you need an active selection to make a conscious choice of acceptance and reject the unknown, instead of blindly convert whatever comes into an MSR, even if it was an MSR to begin with, it could have wrong operand values, etc. Tim and James may have a better idea of how to do that. |
Updated patch adds a check that the instruction coming in DecodeSystemPStateInstruction() is really MSR (immediate) (MSRpstate in LLVM), which means it can be interpreted as the extended MSR if pstatefield is not allocated. I think it would be a logical error if the method was called for non-MSR (immediate) instructions so the code asserts this condition (instead of returning Fail).
The patch also simplifies adding of the general register operand, it is always XZR.
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp | ||
---|---|---|
1506 ↗ | (On Diff #18747) | Hi Petr, Renato, thanks for pointing me at this. Sorry it took so long. I think this approach is fundamentally flawed, and isn't good enough as a workaround for tablegen. It assumes that this is the last operand to be decoded (implicit decoding order which the FixedLenDecoderEmitter doesn't provide), and it mutates already-decoded parts under the disassembler's feet. It's a hack, pure and simple. I think the only reasonable way to do this is to enhance TableGen with your feature. I don't think it'll be horrendously hard, but it's something that needs to be done to avoid hacks in a nice clean backend. Sorry. James |
Hi James,
Thank you for the review. The rewrite happens in a DecoderMethod of the MSRpstate instrution, not in a DecoderMethod of an operand, so there should be no assumption it is the last operand to be decoded.
However, thanks for confirming there is currently not a good way how to implement this cleanly without modifying TableGen parts. I will have a look what could be done to support this case but it will probably take me some time to get to it.
Petr
Improve decoding options for non-orthogonal instructions
When FixedLenDecoder matches an input bitpattern of form [01]+ with an
instruction bitpattern of form [01?]+ (where 0/1 are static bits and ? are
mixed/variable bits) it passes the input bitpattern to a specific instruction
decoder method which then makes a final decision whether the bitpattern is a
valid instruction or not. This means the decoder must handle all possible
values of the variable bits which sometimes leads to opcode rewrites in the
decoder method when the instructions are not fully orthogonal.
The patch provides a way for the decoder method to say that when it returns
Fail it does not necessarily mean the bitpattern is invalid, but rather that
the bitpattern is definitely not an instruction that is recognized by the
decoder method. The decoder can then try to match the input bitpattern with
other possible instruction bitpatterns.
This allows to solve a situation on AArch64 where the MSR (immediate)
instruction has form:
1101 0101 0000 0??? 0100 ???? ???1 1111
but not all values of the ? bits are allowed. The rejected values should be
handled by the extended MSR (register) instruction:
1101 0101 000? ???? ???? ???? ???? ????
The decoder will first try to decode an input bitpattern that matches both
bitpatterns as MSR (immediate) but currently this puts the decoder method of
MSR (immediate) into a situation when it must be able to decode all possible
values of the ? bits, i.e. it would need to rewrite the instruction to `MSR
(register) when it is not MSR (immediate)`.
The patch allows to specify that the decoder method cannot determine if the
instruction is valid for all variable values. The decoder method can simply
return Fail when it knows it is definitely not MSR (immediate). The decoder
will then backtrack the decoding and find that it can match the input
bitpattern with the more generic MSR (register) bitpattern too.
This behaviour could be enabled for all instruction decoders but new TryDecode
opcode is larger by 2B than the Decode opcode. Using TryDecode for all
instruction on AArch64 increased size of the decoder table from 44kB to
49kB. The expectation is that most encodings are orthogonal so this should be
enabled only in exceptional cases to prevent any slowdown with additional
matching+decoding and size bloat.
Hi Petr,
This generally looks good to me. Apart from the nits in the comments below, generally:
- The tablegen+tests changes need to be split from the ARM changes and applied in two separate patches.
- The spelling correction needs to be split out and committed as its own (trivial, needs no code review) patch.
Cheers,
James
utils/TableGen/FixedLenDecoderEmitter.cpp | ||
---|---|---|
1084 ↗ | (On Diff #29065) | Merge lines 1084 and 1085 together. |
1899 ↗ | (On Diff #29065) | You can use dyn_cast_or_null here: BitInit *BI = dyn_cast_or_null<BitInit>(TypeRecord->getValue("hasCompleteDecoder")); bool HasCompleteDecoder = BI ? BI->getValue() : true; |
1976 ↗ | (On Diff #29065) | Same comment as in line 1899. |
2179 ↗ | (On Diff #29065) | Why change SUCCESSFUL to COMPLETE here? |
Hi James,
Thank you for the review. I will split the changes into three separate patches as you suggested before committing (typo fix, TableGen, AArch64 part).
Other comments are inlined.
Thanks,
Petr
utils/TableGen/FixedLenDecoderEmitter.cpp | ||
---|---|---|
1899 ↗ | (On Diff #29065) | This would need BitInit *BI = dyn_cast_or_null<BitInit>(TypeRecord->getValue("hasCompleteDecoder")->getValue()); but TypeRecord->getValue("hasCompleteDecoder") can be nullptr so that cannot be used. |
2179 ↗ | (On Diff #29065) | I believe the original intention of this message was to give information that the "filter decoding" part in FixedLenDecoder was successful. This is ok but the bitpattern then gets passed by FixedLenDecoder to a specific decoder method, it can reject the bitpattern and so the decoding as a whole fails. This created a bit confusing output from llvm-mc -disassemble -debug when the input bitpattern was invalid. It would contain 'DECODE SUCCESSFUL' followed by 'invalid instruction encoding': $ echo "0x1f 0x40 0x00 0xd5" | llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug [...] 31211: OPC_CheckField(19, 1, 0, 5): FieldValue = 0, ExpectedValue = 0: PASS 31217: OPC_Decode: opcode 1283, using decoder 220 ----- DECODE SUCCESSFUL ----- <stdin>:1:1: warning: invalid instruction encoding 0x1f 0x40 0x00 0xd5 ^ The patch changed this to printing 'DECODE COMPLETE' when FixedLenDecoder fully finished the decoding (including running any decoding methods). However, it might be better to simply remove this message because it does not provide much information (OPC_Decode implies end of decoding similarly to OPC_Fail) and additionally extend the OPC_Decode debug message with a result returned from the decoding method. I will update the patch. |