This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Fix problems in handling generic MSR/MRS instructions
ClosedPublic

Authored by petpav01 on Jan 26 2015, 3:29 AM.

Diff Detail

Event Timeline

petpav01 updated this revision to Diff 18747.Jan 26 2015, 3:29 AM
petpav01 retitled this revision from to [AArch64] Fix problems in handling generic MSR/MRS instructions.
petpav01 updated this object.
petpav01 edited the test plan for this revision. (Show Details)
petpav01 added a subscriber: Unknown Object (MLST).

Any feedback on this patch? I realize it is an ugly one but I would appreciate any hints how it could be done in a cleaner way.

Thanks,
Petr Pavlu

rengolin set the repository for this revision to rL LLVM.
rengolin added a subscriber: rengolin.

Adding Tim and James to review the patch.

lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
1506

This seems like a gross exaggeration. Is it possible that the name would come invalid for any other reason?

Thank you Renato for having a look at this patch.

lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
1506

I am not sure if I am really answering your question but DecodeSystemPStateInstruction() is called only for MSRpstate and this instruction aliases with an extended MSR format (introduced in r218753). All cases not allowed by MSRpstate are valid MSR cases.

Some more background on this code:

MC instruction MSRpstate is defined to have:

Inst{31-19} = 0b1101010100000;
Inst{18-16} = pstatefield{5-3};
Inst{15-12} = 0b0100;
Inst{11-8} = imm;
Inst{7-5} = pstatefield{2-0};
Inst{4-0} = 0b11111

A problem here is that there does not seem to be a good way how to specify that only a subset of pstatefield is allowed.

Disassembling a bitpattern that should be handled as an extended MSR (but which aliases MSRpstate):

$ echo "0x1f 0x40 0x00 0xd5" | ./llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug
Args: ./llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug 
        .text
0: OPC_ExtractField(26, 3): 5
3: OPC_FilterValue(2, 1110): FAIL: continuing at 1117
1117: OPC_FilterValue(3, 28396): FAIL: continuing at 29517
29517: OPC_FilterValue(4, 447): FAIL: continuing at 29968
29968: OPC_FilterValue(5, 474): PASS: continuing at 29972
29972: OPC_ExtractField(29, 3): 6
29975: OPC_FilterValue(0, 4): FAIL: continuing at 29983
29983: OPC_FilterValue(1, 39): FAIL: continuing at 30026
30026: OPC_FilterValue(2, 17): FAIL: continuing at 30047
30047: OPC_FilterValue(4, 4): FAIL: continuing at 30055
30055: OPC_FilterValue(5, 39): FAIL: continuing at 30098
30098: OPC_FilterValue(6, 10539): PASS: continuing at 30102
30102: OPC_ExtractField(21, 5): 8
30105: OPC_FilterValue(0, 30): FAIL: continuing at 30139
30139: OPC_FilterValue(1, 10): FAIL: continuing at 30153
30153: OPC_FilterValue(2, 11): FAIL: continuing at 30168
30168: OPC_FilterValue(5, 30): FAIL: continuing at 30202
30202: OPC_FilterValue(8, 122): PASS: continuing at 30206
30206: OPC_ExtractField(0, 8): 31
30209: OPC_FilterValue(95, 11): FAIL: continuing at 30224
30224: OPC_FilterValue(159, 11): FAIL: continuing at 30240
30240: OPC_FilterValue(191, 11): FAIL: continuing at 30256
30256: OPC_FilterValue(223, 11): FAIL: continuing at 30272
30272: OPC_ExtractField(0, 5): 31
30275: OPC_FilterValue(31, 33): PASS: continuing at 30279
30279: OPC_ExtractField(12, 4): 4
30282: OPC_FilterValue(2, 11): FAIL: continuing at 30297
30297: OPC_FilterValue(4, 11): PASS: continuing at 30301
30301: OPC_CheckField(19, 1, 0, 5): FieldValue = 0, ExpectedValue = 0: PASS
30307: OPC_Decode: opcode 1124, using decoder 216
----- DECODE SUCCESSFUL -----
<stdin>:1:1: warning: invalid instruction encoding
0x1f 0x40 0x00 0xd5
^

My understanding of the generated disassembler is that it works in two phases. In the first phase (before "DECODE SUCCESSFUL") the disassembler matches all constant bits and determines an MC opcode. In the second phase it calls a decode method specific to the determined opcode. In this case it calls explicit decode method DecodeSystemPStateInstruction(), it sees that pstatefield is not one of SPSel, DAIFSet, DAIFClr and rejects the instruction. No programmatic limitation on field values (without changing the resulting opcode) in the first phase, or a jump from the second phase back to the first one to handle the instruction as an extended MSR (after it is rejected as MSRpstate) seems possible.

I also considered an option to instantiate only the allowed MSR (immed) instructions (e.g. to have MSRpstateSPSel, MSRpstateDAIFSet, MSRpstateDAIFClr) but that does not look right because it creates a new MC opcode for each allowed pstatefield value.

A similar problem exists in the ARM backend where the preload hints fit into holes in the encodings of the LDRH/STRH instructions. It seems to be handled similarly in the instruction decode method.

@t.p.northover, @jmolloy, please have a look. I'm probably being overzealous, but I'd rather be safe than sorry.

lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
1506

I understand why it's safe to do that in your case, but I worry that there would be other cases where an invalid name won't match in your pattern. What I'm saying is that I'd like to see a way of that function return Invalid for all the cases where it IS invalid, even with your extension.

I don't know all the variations that function can cope with, but I'd be surprised if all validation on all fields was made before entering this function. You may get it right on your examples, but that doesn't mean there are other examples where that would get it wrong.

Basically, you need an active selection to make a conscious choice of acceptance and reject the unknown, instead of blindly convert whatever comes into an MSR, even if it was an MSR to begin with, it could have wrong operand values, etc.

Tim and James may have a better idea of how to do that.

petpav01 updated this revision to Diff 20995.Mar 2 2015, 3:26 AM
petpav01 removed rL LLVM as the repository for this revision.

Updated patch adds a check that the instruction coming in DecodeSystemPStateInstruction() is really MSR (immediate) (MSRpstate in LLVM), which means it can be interpreted as the extended MSR if pstatefield is not allocated. I think it would be a logical error if the method was called for non-MSR (immediate) instructions so the code asserts this condition (instead of returning Fail).

The patch also simplifies adding of the general register operand, it is always XZR.

petpav01 set the repository for this revision to rL LLVM.Mar 2 2015, 3:27 AM
jmolloy requested changes to this revision.Mar 16 2015, 1:10 AM
jmolloy edited edge metadata.
jmolloy added inline comments.
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
1506

Hi Petr,

Renato, thanks for pointing me at this. Sorry it took so long.

I think this approach is fundamentally flawed, and isn't good enough as a workaround for tablegen. It assumes that this is the last operand to be decoded (implicit decoding order which the FixedLenDecoderEmitter doesn't provide), and it mutates already-decoded parts under the disassembler's feet. It's a hack, pure and simple.

I think the only reasonable way to do this is to enhance TableGen with your feature. I don't think it'll be horrendously hard, but it's something that needs to be done to avoid hacks in a nice clean backend.

Sorry.

James

This revision now requires changes to proceed.Mar 16 2015, 1:10 AM

Hi James,

Thank you for the review. The rewrite happens in a DecoderMethod of the MSRpstate instrution, not in a DecoderMethod of an operand, so there should be no assumption it is the last operand to be decoded.

However, thanks for confirming there is currently not a good way how to implement this cleanly without modifying TableGen parts. I will have a look what could be done to support this case but it will probably take me some time to get to it.

Petr

petpav01 updated this revision to Diff 29065.Jul 6 2015, 12:38 AM
petpav01 updated this object.
petpav01 edited edge metadata.
petpav01 removed rL LLVM as the repository for this revision.

Improve decoding options for non-orthogonal instructions

When FixedLenDecoder matches an input bitpattern of form [01]+ with an
instruction bitpattern of form [01?]+ (where 0/1 are static bits and ? are
mixed/variable bits) it passes the input bitpattern to a specific instruction
decoder method which then makes a final decision whether the bitpattern is a
valid instruction or not. This means the decoder must handle all possible
values of the variable bits which sometimes leads to opcode rewrites in the
decoder method when the instructions are not fully orthogonal.

The patch provides a way for the decoder method to say that when it returns
Fail it does not necessarily mean the bitpattern is invalid, but rather that
the bitpattern is definitely not an instruction that is recognized by the
decoder method. The decoder can then try to match the input bitpattern with
other possible instruction bitpatterns.

This allows to solve a situation on AArch64 where the MSR (immediate)
instruction has form:
1101 0101 0000 0??? 0100 ???? ???1 1111
but not all values of the ? bits are allowed. The rejected values should be
handled by the extended MSR (register) instruction:
1101 0101 000? ???? ???? ???? ???? ????

The decoder will first try to decode an input bitpattern that matches both
bitpatterns as MSR (immediate) but currently this puts the decoder method of
MSR (immediate) into a situation when it must be able to decode all possible
values of the ? bits, i.e. it would need to rewrite the instruction to `MSR
(register) when it is not MSR (immediate)`.

The patch allows to specify that the decoder method cannot determine if the
instruction is valid for all variable values. The decoder method can simply
return Fail when it knows it is definitely not MSR (immediate). The decoder
will then backtrack the decoding and find that it can match the input
bitpattern with the more generic MSR (register) bitpattern too.

This behaviour could be enabled for all instruction decoders but new TryDecode
opcode is larger by 2B than the Decode opcode. Using TryDecode for all
instruction on AArch64 increased size of the decoder table from 44kB to
49kB. The expectation is that most encodings are orthogonal so this should be
enabled only in exceptional cases to prevent any slowdown with additional
matching+decoding and size bloat.

jmolloy accepted this revision.Jul 6 2015, 2:36 AM
jmolloy edited edge metadata.

Hi Petr,

This generally looks good to me. Apart from the nits in the comments below, generally:

  • The tablegen+tests changes need to be split from the ARM changes and applied in two separate patches.
  • The spelling correction needs to be split out and committed as its own (trivial, needs no code review) patch.

Cheers,

James

utils/TableGen/FixedLenDecoderEmitter.cpp
1084 ↗(On Diff #29065)

Merge lines 1084 and 1085 together.

1899 ↗(On Diff #29065)

You can use dyn_cast_or_null here:

BitInit *BI = dyn_cast_or_null<BitInit>(TypeRecord->getValue("hasCompleteDecoder")); 
bool HasCompleteDecoder = BI ? BI->getValue() : true;
1976 ↗(On Diff #29065)

Same comment as in line 1899.

2179 ↗(On Diff #29065)

Why change SUCCESSFUL to COMPLETE here?

This revision is now accepted and ready to land.Jul 6 2015, 2:36 AM

Hi James,

Thank you for the review. I will split the changes into three separate patches as you suggested before committing (typo fix, TableGen, AArch64 part).
Other comments are inlined.

Thanks,
Petr

utils/TableGen/FixedLenDecoderEmitter.cpp
1899 ↗(On Diff #29065)

This would need

BitInit *BI = dyn_cast_or_null<BitInit>(TypeRecord->getValue("hasCompleteDecoder")->getValue());

but TypeRecord->getValue("hasCompleteDecoder") can be nullptr so that cannot be used.

2179 ↗(On Diff #29065)

I believe the original intention of this message was to give information that the "filter decoding" part in FixedLenDecoder was successful. This is ok but the bitpattern then gets passed by FixedLenDecoder to a specific decoder method, it can reject the bitpattern and so the decoding as a whole fails.

This created a bit confusing output from llvm-mc -disassemble -debug when the input bitpattern was invalid. It would contain 'DECODE SUCCESSFUL' followed by 'invalid instruction encoding':

$ echo "0x1f 0x40 0x00 0xd5" | llvm-mc -triple=aarch64 -show-inst -show-encoding -disassemble -debug
[...]
31211: OPC_CheckField(19, 1, 0, 5): FieldValue = 0, ExpectedValue = 0: PASS
31217: OPC_Decode: opcode 1283, using decoder 220
----- DECODE SUCCESSFUL -----
<stdin>:1:1: warning: invalid instruction encoding
0x1f 0x40 0x00 0xd5
^

The patch changed this to printing 'DECODE COMPLETE' when FixedLenDecoder fully finished the decoding (including running any decoding methods).

However, it might be better to simply remove this message because it does not provide much information (OPC_Decode implies end of decoding similarly to OPC_Fail) and additionally extend the OPC_Decode debug message with a result returned from the decoding method. I will update the patch.

petpav01 updated this revision to Diff 29088.Jul 6 2015, 7:49 AM
petpav01 edited edge metadata.

Updated patch fixes formatting problems and corrects the debugging output.

I think this is reasonable. Please go ahead and commit.

This revision was automatically updated to reflect the committed changes.