This is an archive of the discontinued LLVM Phabricator instance.

Differential D120958

[TableGen] Add support for variable length instruction in decoder generator
Closed · Public

Authored by 0x59616e on Mar 3 2022, 6:30 PM.

Details

Reviewers

myhsu
ricky26
RKSimon
craig.topper
jrtc27

Commits

rG28e850a8da51: [TableGen] Add support for variable length instruction in decoder generator

Summary

To support variable length instructions, I treat them as fixed length instructions of the "maximum length". For example, if there are three instructions of 2, 6 and 9 bytes, we can fit them into the algorithm by treating them all as 9 bytes.

Also, since we can't know the length of the instruction in advance, a function object of type void(APInt &, uint64_t) is added to the parameter lists of decodeInstruction and fieldFromInstruction. We can use it to supply the additional bits the decoder needs once we know the opcode of the instruction.

Finally, InstrLenTable is added to let the decoder know the length of each instruction.

See D120960 for its usage.
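As a rough illustration of the mechanism described in the summary, the following self-contained sketch shows a callback of the same shape widening an instruction buffer on demand. Everything here is an assumption made for illustration: `InsnBits` is a hypothetical stand-in for `APInt`, the byte-granular refill policy, the 5-bit selector at bits [3, 8), and the table values 27 and 43 are not taken from the patch's generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for llvm::APInt: a growable buffer, bit 0 first.
struct InsnBits {
  std::vector<bool> Bits;
  uint64_t width() const { return Bits.size(); }
};

// Same shape as the callback in the summary: after the call, at least
// NeededBits bits of the instruction must be present in the buffer.
using MakeUpFn = std::function<void(InsnBits &, uint64_t)>;

// Extract NumBits bits starting at StartBit, pulling in more input on demand.
uint64_t fieldFromInstruction(InsnBits &Insn, unsigned StartBit,
                              unsigned NumBits, const MakeUpFn &MakeUp) {
  MakeUp(Insn, uint64_t(StartBit) + NumBits);
  uint64_t Field = 0;
  for (unsigned I = 0; I < NumBits; ++I)
    if (Insn.Bits[StartBit + I])
      Field |= uint64_t(1) << I;
  return Field;
}

// Illustrative per-instruction lengths in bits, in the spirit of InstrLenTable.
static const unsigned InstrLenTable[] = {27, 43};

// Read a 5-bit selector at bits [3, 8), look up the length, then widen the
// buffer to the full instruction. Returns the number of buffered bits, or 0
// if the selector is not recognized.
uint64_t bufferedBitsAfterDecode(const std::vector<uint8_t> &Stream) {
  std::size_t NextByte = 0;
  // Appends whole bytes; pads with zero bytes if the stream runs out.
  MakeUpFn MakeUp = [&](InsnBits &Insn, uint64_t NeededBits) {
    while (Insn.width() < NeededBits) {
      uint8_t Byte = NextByte < Stream.size() ? Stream[NextByte++] : 0;
      for (unsigned I = 0; I < 8; ++I)
        Insn.Bits.push_back((Byte >> I) & 1);
    }
  };

  InsnBits Insn;
  uint64_t Selector = fieldFromInstruction(Insn, 3, 5, MakeUp);
  if (Selector != 8 && Selector != 9)
    return 0;
  unsigned Len = InstrLenTable[Selector - 8];
  MakeUp(Insn, Len); // the full length is only known at this point
  return Insn.width();
}
```

The point of the sketch is the ordering: a field can be read before the length is known, and the buffer is widened to the full instruction only after the table lookup.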

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

0x59616e created this revision. Mar 3 2022, 6:30 PM

Herald added a project: Restricted Project. Mar 3 2022, 6:30 PM

0x59616e requested review of this revision. Mar 3 2022, 6:30 PM

Herald added a subscriber: llvm-commits.

0x59616e mentioned this in D120960: [M68k][Disassembler] Adopt the new variable length decoder. Mar 3 2022, 6:38 PM

0x59616e edited the summary of this revision.

0x59616e edited the summary of this revision. Mar 3 2022, 6:41 PM

0x59616e added a child revision: D120960: [M68k][Disassembler] Adopt the new variable length decoder.

0x59616e added reviewers: myhsu, ricky26, RKSimon, craig.topper, resistor, jrtc27. Mar 3 2022, 6:55 PM

Harbormaster completed remote builds in B152506: Diff 412884. Mar 3 2022, 7:06 PM

0x59616e edited the summary of this revision. Mar 10 2022, 3:38 PM

Thank you for the patch.
A high level question: it seems like generating a decoder requires two phases -- first, generate a list of OperandInfo, which is currently done by populateInstruction, then emit the real (C++) code according to these OperandInfo instances. My question is, can we use our own way -- possibly put in a separate file -- to generate these OperandInfo before supplying them to the second phase, instead of trying to piggyback everything onto the existing FixedLenDecoder framework?
Because, to be honest, I'm not a big fan of overloading the BitsInit (to carry operand info). Using a BitsInit might fit well for fixed-length instructions, but I feel like there are more elegant ways to handle var-length instructions. For instance, using CGIOperandList::OperandInfo::ParseOperandName to parse suboperands rather than traversing every single in/out operand in the original instruction definition.

Also, please double check all your modifications and make sure to follow the LLVM coding style, especially the naming convention of local variables.

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
228	https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
265	"unknown Init kind" would probably be better.
459	why do you want to change these three lines? I don't think `RV` is used anywhere else.
1347	ditto
2175–2181	The `OI` argument can be `OperandInfo &` to avoid copy.
2292	please use `llvm::function_ref` here
2295	why do we want to enlarge `insn` (on-demand) on every field extraction? Can we resize `insn` ahead of time and do it only once?
2329	ditto

In D120958#3380177, @myhsu wrote:

Thank you for the patch.
A high level question: it seems like generating a decoder requires two phases -- first, generate a list of OperandInfo, which is currently done by populateInstruction, then emit the real (C++) code according to these OperandInfo instances. My question is, can we use our own way -- possibly put in a separate file -- to generate these OperandInfo before supplying them to the second phase, instead of trying to piggyback everything onto the existing FixedLenDecoder framework?
Because, to be honest, I'm not a big fan of overloading the BitsInit (to carry operand info). Using a BitsInit might fit well for fixed-length instructions, but I feel like there are more elegant ways to handle var-length instructions. For instance, using CGIOperandList::OperandInfo::ParseOperandName to parse suboperands rather than traversing every single in/out operand in the original instruction definition.

Sorry, I haven't had time to implement my ideas yet, so I'll just run through them quickly (I will do it maybe this Saturday or Sunday):

We can traverse the Segments of VarLenInst and use CGIOperandList::OperandInfo::ParseOperandName to get the operand number, and then add each Segment to the corresponding OpInfo according to the operand number.

For example, if we find "dst.reg" and use CGIOperandList::OperandInfo::ParseOperandName and know "Oh, this is the third operand", then we can add this into the third OpInfo.

In this way we can avoid overloading BitsInit completely.
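The grouping step proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Segment`, `FieldSlice`, and the name-to-operand-number map are hypothetical stand-ins (the map plays the role of the lookup that `ParseOperandName` would provide), and the segment layout in the usage note is made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one segment of a variable-length encoding,
// listed from the lowest bit upward.
struct Segment {
  std::string OperandName; // e.g. "dst" or "src.offset"; empty for fixed bits
  unsigned BitWidth;
};

// Where one piece of an operand lives inside the encoded instruction.
struct FieldSlice {
  unsigned StartBit;
  unsigned Width;
};

// Hypothetical replacement for an operand-name lookup: name -> operand number.
using OperandIndex = std::map<std::string, unsigned>;

// Walk the segments once, tracking the running bit offset, and file every
// operand-carrying segment under its operand number.
std::vector<std::vector<FieldSlice>>
groupSegmentsByOperand(const std::vector<Segment> &Segments,
                       const OperandIndex &Index, unsigned NumOperands) {
  std::vector<std::vector<FieldSlice>> PerOperand(NumOperands);
  unsigned BitOffset = 0;
  for (const Segment &S : Segments) {
    if (!S.OperandName.empty()) {
      unsigned OpNo = Index.at(S.OperandName); // throws if the name is unknown
      PerOperand[OpNo].push_back({BitOffset, S.BitWidth});
    }
    BitOffset += S.BitWidth;
  }
  return PerOperand;
}
```

With segments `src.reg` (3 bits), five fixed bits, `dst` (3 bits) and two 16-bit pieces of `src.offset`, the third operand ends up with two slices, at bits 11 and 27, without any operand information having to be packed into a bit vector.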

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
459	My understanding is: `SFBits` could be a null pointer (take a look at the if statement below), which means `SoftFail` is optional. But `getValueAsBitsInit` will trigger an assertion if `SoftFail` cannot be found.

0x59616e added inline comments. Mar 15 2022, 5:50 AM

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
1347	Same reason as above: `SFBits` is optional (otherwise the if statement below would be useless), but `getValueAsBitsInit` doesn't allow that.
2295	In some cases, for example `MCD::OPC_CheckField`, the decoder needs to extract a bit field before we know the length of the instruction. Only two cases need this IIRC: `OPC_ExtractField` and `OPC_CheckField`. I did this in `fieldFromInstruction` out of laziness; I will fix it in the next diff. As for resizing `insn` ahead of time and doing it only once: I don't think we can. We have to know the length to resize `insn`, so that can only be done in case `MCD::OPC_Decode`.

In D120958#3382237, @0x59616e wrote:

We can traverse the Segments of VarLenInst and use CGIOperandList::OperandInfo::ParseOperandName to get the operand number, and then add each Segment to the corresponding OpInfo according to the operand number.

For example, if we find "dst.reg" and use CGIOperandList::OperandInfo::ParseOperandName and know "Oh, this is the third operand", then we can add this into the third OpInfo.

In this way we can avoid overloading BitsInit completely.

This approach sounds reasonable to me

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
459	The use of `getValueAsBitsInit` might be wrong, but it's unrelated to this patch. I think it's better to put this change into another patch.

In D120958#3383624, @myhsu wrote:

This approach sounds reasonable to me

Great ! I'll materialize it this weekend.

address feedback

Herald added a subscriber: StephenFan. Mar 24 2022, 3:38 PM

0x59616e added a comment. Mar 24 2022, 3:39 PM

This comment was removed by 0x59616e.

Harbormaster completed remote builds in B156154: Diff 418057. Mar 24 2022, 4:32 PM

fix a minor problem

Harbormaster completed remote builds in B156231: Diff 418152. Mar 25 2022, 1:44 AM

The logic looks cleaner now.
Now a bigger question is: should we still call it FixedLenDecoderEmitter?

llvm/test/TableGen/VarLenEncoder.td
3 ↗	(On Diff #418152)	I still prefer that the decoder's tests either go under the FixedLenDecoderEmitter's folder or at least be put into a separate file.
51 ↗	(On Diff #418152)	I don't quite understand what this `!cond` is trying to do
llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
224	local variables should start with uppercase
1868	I'm not sure a non-power-of-two number of (inlined) elements for SmallVector is recommended. If you're not sure what number to use, just use SmallVector<int> as recommended here.
1875	braces here can be removed
1900	please use `!OpName.empty()` or `OpName.size()`
1956	Did you use `clang-format-diff.py` rather than clang-format? I wonder why this line changed...
2360	why change to uint64_t? If you're worried about InsnType being APInt, APInt has `operator=(uint64_t)`
2387	ditto
2497	ditto
2502	"!= 0" in these two lines is redundant.
2541	why remove '&'?
2675	Could we add some comments here?
llvm/utils/TableGen/VarLenCodeEmitterGen.h
20–21	I think these forward declarations are obsolete now

address some feedbacks

move test into another file

In D120958#3415717, @myhsu wrote:

The logic looks cleaner now.
Now a bigger question is: should we still call it FixedLenDecoderEmitter?

What about DecoderEmitter.cpp ?

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
224	Just wondering: some variables in the original code start with lowercase, for example the parameter "&def" in this function. Is that a mistake, or is there a special naming rule that I don't know?
228	Naming is hard.
1956	I've tried, and it remains the same. BTW I use "git clang-format". Is there any difference between these two?
2360	But it yells "conversion from ‘uint64_t’ {aka ‘long unsigned int’} to non-scalar type ‘llvm::APInt’ requested"
2502	g++ screams "no match for ‘operator||’ (operand types are ‘llvm::APInt’ and ‘llvm::APInt’)"

address feedback

Harbormaster completed remote builds in B156939: Diff 419119. Mar 30 2022, 5:18 PM

In D120958#3416426, @0x59616e wrote:

What about DecoderEmitter.cpp ?

Sounds good to me, but I would suggest sending an RFC to the forum to ask for consensus,
especially to see if there are any comments from existing FixedLenDecoder users (e.g. AArch64 and ARM). Also, in the RFC, please briefly explain why you want to build this feature on top of FixedLenDecoder rather than writing a separate disassembler.

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
224	It's likely that that code was committed before the LLVM coding style was solidified.
1956	Re "I've tried, and it remains the same": that's fine, just a minor issue. Re "is there any difference between these two?": I don't think so.
2360	You're right and I was wrong about `APInt::operator=(uint64_t)`: the operator won't be used in the case of initialization (a constructor will be used instead, but APInt doesn't have `APInt(uint64_t)`).
2387	why did you use another PtrLen variable here?
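The initialization-versus-assignment point made for line 2360 above can be reproduced in isolation. In this sketch `WideInt` is a hypothetical stand-in that only mimics the relevant shape of `APInt` (a two-argument constructor and an `operator=(uint64_t)`); it is not the real class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: no single-argument constructor from uint64_t,
// but an assignment operator that accepts one.
class WideInt {
  unsigned NumBits;
  uint64_t Value;

public:
  WideInt(unsigned NumBits, uint64_t Value) : NumBits(NumBits), Value(Value) {}
  WideInt &operator=(uint64_t RHS) {
    Value = RHS;
    return *this;
  }
  uint64_t value() const { return Value; }
  unsigned bitWidth() const { return NumBits; }
};

uint64_t assignAfterConstruction() {
  // WideInt Bad = uint64_t(5); // would not compile: this is initialization,
  //                            // which needs a converting constructor
  WideInt Good(64, 0); // construct first...
  Good = 5;            // ...then operator=(uint64_t) applies
  return Good.value();
}
```

Uncommenting the `Bad` line yields a conversion error of the same kind as the one quoted in the thread, because `=` in a declaration selects a constructor rather than `operator=`.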

Also, please make sure all existing disassembler tests, especially targets that use FixedLenDecoder, are passing.

myhsu added inline comments. Mar 31 2022, 10:05 AM

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
1875	I meant only braces around the for-loop, your current syntax might create ambiguity. Here is the guideline: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

address feedback

Harbormaster completed remote builds in B157293: Diff 419599. Mar 31 2022, 6:38 PM

In D120958#3419921, @myhsu wrote:

Sounds good to me, but I would suggest sending an RFC to the forum to ask for consensus,
especially to see if there are any comments from existing FixedLenDecoder users (e.g. AArch64 and ARM). Also, in the RFC, please briefly explain why you want to build this feature on top of FixedLenDecoder rather than writing a separate disassembler.

No problem.

In D120958#3419939, @myhsu wrote:

Also, please make sure all existing disassembler tests, especially targets that use FixedLenDecoder, are passing.

Sure.

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
224	Ah, that makes sense.
2360	C++ is hard.
2387	Quick answer: for debugging purposes. In the original code, `Len` is the length of the instruction field, but it is changed in the function `decodeULEB128`. That would break the debug message below.

Should we rename after this patch ? or before this patch ?

rebase

Harbormaster completed remote builds in B158082: Diff 420654. Apr 5 2022, 5:42 PM

0x59616e added a child revision: D123451: [NFC] Rename `FixedLenDecoderEmitter` as `DecoderEmitter`. Apr 9 2022, 8:13 AM

0x59616e removed a child revision: D120960: [M68k][Disassembler] Adopt the new variable length decoder.

I can't remember whether I've modified this patch recently.

So I'm updating it just to be safe.

Harbormaster completed remote builds in B158862: Diff 421736. Apr 9 2022, 9:04 AM

ping. Any more suggestions ?

@myhsu Any more comments?

lgtm. Thanks for the patch!

This revision is now accepted and ready to land. May 1 2022, 12:33 PM

It's taken me a very long time to find time to read through this patch (I'm very sorry about that). The approach seems nice and straightforward & the generated code looks good. Thanks a lot for finding the time to get this through. 😅

In D120958#3484756, @myhsu wrote:

lgtm. Thanks for the patch!

Thanks for your patience. This patch wouldn't be here if it were not for your help ;)

In D120958#3484851, @ricky26 wrote:

It's taken me a very long time to find time to read through this patch (I'm very sorry about that). The approach seems nice and straightforward & the generated code looks good. Thanks a lot for finding the time to get this through. 😅

My pleasure ;)

This revision was landed with ongoing or failed builds. May 2 2022, 12:37 PM

Closed by commit rG28e850a8da51: [TableGen] Add support for variable length instruction in decoder generator (authored by 0x59616e).

This revision was automatically updated to reflect the committed changes.

0x59616e added a commit: rG28e850a8da51: [TableGen] Add support for variable length instruction in decoder generator.

foad added a subscriber: foad. May 3 2022, 1:52 AM

foad added inline comments.

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp
2360	Hi, this is causing a slight problem for us downstream because we're using a custom InsnType (not APInt). Have you seen the big comment emitted at line 2255? It explicitly says that your InsnType needs to "be constructible from a uint64_t". So please either update the comment, or stop using plain APInt as your InsnType :)
2502	Downstream the compiler complained about using mismatched types for operator&, InsnType and uint64_t. So please either change this back or at least update the comment on line 2255.

For the record, to get this to build with my downstream target, I had to add:

MyInsnType MyInsnType::operator&(const uint64_t &RHS);
bool MyInsnType::operator!=(const int &RHS);
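For readers maintaining their own instruction type, here is a minimal sketch of what the two operators above might look like in context. `MyInsnType` below is a hypothetical class wrapping a single `uint64_t`, and the mask test in `anyMaskedBitSet` is only an assumed shape for the kind of expression the generated decoder evaluates; neither is taken from a real downstream target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical downstream instruction type; only the requirements mentioned
// in the thread are modeled.
class MyInsnType {
  uint64_t Bits;

public:
  // "constructible from a uint64_t", as the emitted comment requires
  MyInsnType(uint64_t Bits) : Bits(Bits) {}

  // Mask the instruction with a plain integer.
  MyInsnType operator&(const uint64_t &RHS) const {
    return MyInsnType(Bits & RHS);
  }

  // Compare against a plain int literal such as 0.
  bool operator!=(const int &RHS) const {
    return Bits != static_cast<uint64_t>(RHS);
  }
};

// Assumed shape of a generated check: are any of the masked bits set?
bool anyMaskedBitSet(const MyInsnType &Insn, uint64_t Mask) {
  return (Insn & Mask) != 0;
}
```

Without the two operators, the expression in `anyMaskedBitSet` fails to compile for a custom type, which matches the mismatched-types complaint reported above.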

Hello, We are maintaining a downstream version of the monorepo based on the LLVM main branch. In a recent attempt to merge the latest upstream commits
into our monorepo we came across the following test failures after your commit.
Any help would be greatly appreciated.
Thanks
Greg

FAIL: llvm_regressions :: LLVM/TableGen/VarLenDecoder.td
--------------------------------------------------------------------------------
Script:
--
: 'RUN: at line 1';   /scratch/gmiller/tools2/llvm_cgt/arm-llvm/RelWithAsserts/llvm/bin/llvm-tblgen -gen-disassembler -I /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/../../include /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td | /scratch/gmiller/tools2/llvm_cgt/arm-llvm/RelWithAsserts/llvm/bin/FileCheck /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td
--
Exit Code: 1

Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /scratch/gmiller/tools2/llvm_cgt/arm-llvm/RelWithAsserts/llvm/bin/llvm-tblgen -gen-disassembler -I /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/../../include /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td
+ /scratch/gmiller/tools2/llvm_cgt/arm-llvm/RelWithAsserts/llvm/bin/FileCheck /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td
/scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td:50:16: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: MCD::OPC_Decode, 244, 1, 0, // Opcode: FOO16
               ^
<stdin>:72:57: note: scanning from here
/* 3 */ MCD::OPC_FilterValue, 8, 4, 0, 0, // Skip to: 12
                                                        ^
<stdin>:73:9: note: possible intended match here
/* 8 */ MCD::OPC_Decode, 245, 1, 0, // Opcode: FOO16
        ^

Input file: <stdin>
Check file: /scratch/gmiller/tools2/llvm_cgt/llvm-project/llvm/test/TableGen/VarLenDecoder.td

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          67:  field.insertBits(bits, startBit, numBits); 
          68: } 
          69:  
          70: static const uint8_t DecoderTable43[] = { 
          71: /* 0 */ MCD::OPC_ExtractField, 3, 5, // Inst{7-3} ... 
          72: /* 3 */ MCD::OPC_FilterValue, 8, 4, 0, 0, // Skip to: 12 
next:50'0                                                             X error: no match found
          73: /* 8 */ MCD::OPC_Decode, 245, 1, 0, // Opcode: FOO16 
next:50'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:50'1             ?                                             possible intended match
          74: /* 12 */ MCD::OPC_FilterValue, 9, 4, 0, 0, // Skip to: 21 
next:50'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          75: /* 17 */ MCD::OPC_Decode, 246, 1, 1, // Opcode: FOO32 
next:50'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          76: /* 21 */ MCD::OPC_Fail, 
next:50'0     ~~~~~~~~~~~~~~~~~~~~~~~~
          77:  0 
next:50'0     ~~~
          78: }; 
next:50'0     ~~~
           .
           .
           .
>>>>>>

--

resistor removed a reviewer: resistor. May 3 2022, 9:59 AM

In D120958#3488799, @gregmiller wrote:

FAIL: llvm_regressions :: LLVM/TableGen/VarLenDecoder.td

Looks like the opcodes of FOO16 and FOO32 are off by one. I think the number of pseudo opcodes is different (from the upstream one). Are include/llvm/Support/TargetOpcodes.def or include/llvm/Target/Target.td in your downstream repo different from upstream?

In D120958#3487811, @foad wrote:
For the record, to get this to build with my downstream target, I had to add:
MyInsnType MyInsnType::operator&(const uint64_t &RHS);
bool MyInsnType::operator!=(const int &RHS);

Thanks for your information. I'll update the comment.

In D120958#3490184, @0x59616e wrote:
Thanks for your information. I'll update the comment.

@myhsu Should we define our own inst type that wraps APInt, or update the comment and maintain the status quo ?

skan added a subscriber: skan. May 3 2022, 11:47 PM

0x59616e mentioned this in D124987: Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`. May 4 2022, 11:58 PM

snidertm added a subscriber: snidertm. May 5 2022, 9:18 AM

snidertm added inline comments.

llvm/test/TableGen/VarLenDecoder.td
50	Is 244 the index into the instruction table where FOO16 is expected to reside? This check on OPC_Decode seems extremely brittle. If there are any additions either in the upstream Target.td or in a downstream Target.td, won't this check and the one on line 52 break?

I've committed a diff in the hope of fixing the test problem:

https://github.com/llvm/llvm-project/commit/9c2121b843ff7c9846a89305b4b73e3c480fe4e7

The comment is updated :
1284ce917b5a

In D120958#3502485, @0x59616e wrote:

The comment is updated :
1284ce917b5a

Thanks!

0x59616e mentioned this in rGcf0b6df6dbf5: [M68k][Disassembler] Adopt the new variable length decoder. May 14 2022, 5:45 PM

0x59616e mentioned this in rGc644488a8b8a: Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`.

Revision Contents

Path	Size
llvm/test/TableGen/VarLenDecoder.td	87 lines
llvm/utils/TableGen/FixedLenDecoderEmitter.cpp	652 lines
llvm/utils/TableGen/VarLenCodeEmitterGen.h	45 lines
llvm/utils/TableGen/VarLenCodeEmitterGen.cpp	48 lines

Diff 426489

llvm/test/TableGen/VarLenDecoder.td

This file was added.

				// RUN: llvm-tblgen -gen-disassembler -I %p/../../include %s | FileCheck %s

				include "llvm/Target/Target.td"

				def ArchInstrInfo : InstrInfo { }

				def Arch : Target {
				let InstructionSet = ArchInstrInfo;
				}

				def Reg : Register<"reg">;

				def RegClass : RegisterClass<"foo", [i64], 0, (add Reg)>;

				def GR64 : RegisterOperand<RegClass>;

				class MyMemOperand<dag sub_ops> : Operand<iPTR> {
				let MIOperandInfo = sub_ops;
				dag Base;
				dag Extension;
				}

				def MemOp16: MyMemOperand<(ops GR64:$reg, i16imm:$offset)>;

				def MemOp32: MyMemOperand<(ops GR64:$reg, i32imm:$offset)>;

				class MyVarInst<MyMemOperand memory_op> : Instruction {
				dag Inst;

				let OutOperandList = (outs GR64:$dst);
				let InOperandList = (ins memory_op:$src);
				}

				def FOO16 : MyVarInst<MemOp16> {
				let Inst = (ascend
				(descend (operand "$dst", 3), 0b01000, (operand "$src.reg", 3)),
				(slice "$src.offset", 15, 0)
				);
				}
				def FOO32 : MyVarInst<MemOp32> {
				let Inst = (ascend
				(descend (operand "$dst", 3), 0b01001, (operand "$src.reg", 3)),
				(slice "$src.offset", 31, 16),
				(slice "$src.offset", 15, 0)
				);
				}

				// CHECK: MCD::OPC_ExtractField, 3, 5, // Inst{7-3} ...
				// CHECK-NEXT: MCD::OPC_FilterValue, 8, 4, 0, 0, // Skip to: 12
				// CHECK-NEXT: MCD::OPC_Decode, 244, 1, 0, // Opcode: FOO16
				// CHECK-NEXT: MCD::OPC_FilterValue, 9, 4, 0, 0, // Skip to: 21
				// CHECK-NEXT: MCD::OPC_Decode, 245, 1, 1, // Opcode: FOO32
				// CHECK-NEXT: MCD::OPC_Fail,

				// Instruction length table
				// CHECK: 27,
				// CHECK-NEXT: 43,
				// CHECK-NEXT: };

				// CHECK: case 0:
				// CHECK-NEXT: tmp = fieldFromInstruction(insn, 8, 3);
				// CHECK-NEXT: if (DecodeRegClassRegisterClass(MI, tmp, Address, Decoder) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
				// CHECK-NEXT: tmp = fieldFromInstruction(insn, 0, 3);
				// CHECK-NEXT: if (DecodeRegClassRegisterClass(MI, tmp, Address, Decoder) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
				// CHECK-NEXT: tmp = fieldFromInstruction(insn, 11, 16);
				// CHECK-NEXT: MI.addOperand(MCOperand::createImm(tmp));
				// CHECK-NEXT: return S;
				// CHECK-NEXT: case 1:
				// CHECK-NEXT: tmp = fieldFromInstruction(insn, 8, 3);
				// CHECK-NEXT: if (DecodeRegClassRegisterClass(MI, tmp, Address, Decoder) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
				// CHECK-NEXT: tmp = fieldFromInstruction(insn, 0, 3);
				// CHECK-NEXT: if (DecodeRegClassRegisterClass(MI, tmp, Address, Decoder) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
				// CHECK-NEXT: tmp = 0x0;
				// CHECK-NEXT: insertBits(tmp, fieldFromInstruction(insn, 11, 16), 16, 16);
				// CHECK-NEXT: insertBits(tmp, fieldFromInstruction(insn, 27, 16), 0, 16);
				// CHECK-NEXT: MI.addOperand(MCOperand::createImm(tmp));
				// CHECK-NEXT: return S;

				// CHECK-LABEL: case MCD::OPC_ExtractField: {
				// CHECK: makeUp(insn, Start + Len);

				// CHECK-LABEL: case MCD::OPC_CheckField: {
				// CHECK: makeUp(insn, Start + Len);

				// CHECK-LABEL: case MCD::OPC_Decode: {
				// CHECK: Len = InstrLenTable[Opc];
				// CHECK-NEXT: makeUp(insn, Len);
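The `case 1:` checks in the test above expect FOO32's 32-bit `$src.offset` to be rebuilt from two 16-bit slices of the encoding. The sketch below mirrors those two `insertBits` calls on a plain `uint64_t`. The helper bodies are assumptions written for this example (the generated helpers are templated on an instruction type and may differ), and 43 bits happen to fit in 64 here.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in: extract NumBits bits starting at StartBit from a 64-bit word.
static uint64_t fieldFromInstruction(uint64_t Insn, unsigned StartBit,
                                     unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 64 && StartBit + NumBits <= 64);
  return (Insn >> StartBit) & ((uint64_t(1) << NumBits) - 1);
}

// Stand-in: overwrite NumBits bits of Field, starting at StartBit, with Bits.
static void insertBits(uint64_t &Field, uint64_t Bits, unsigned StartBit,
                       unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 64 && StartBit + NumBits <= 64);
  uint64_t Mask = ((uint64_t(1) << NumBits) - 1) << StartBit;
  Field = (Field & ~Mask) | ((Bits << StartBit) & Mask);
}

// Same slice positions as the CHECK lines for FOO32: the high half of the
// offset sits at bits [11, 27) and the low half at bits [27, 43).
uint64_t decodeFoo32Offset(uint64_t Insn) {
  uint64_t Tmp = 0;
  insertBits(Tmp, fieldFromInstruction(Insn, 11, 16), 16, 16);
  insertBits(Tmp, fieldFromInstruction(Insn, 27, 16), 0, 16);
  return Tmp;
}
```

The high half comes first in the encoding because the `ascend` list in the test places `(slice "$src.offset", 31, 16)` before `(slice "$src.offset", 15, 0)`.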

llvm/utils/TableGen/FixedLenDecoderEmitter.cpp

//===------------ FixedLenDecoderEmitter.cpp - Decoder Generator ----------===// //===------------ FixedLenDecoderEmitter.cpp - Decoder Generator ----------===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

// //

// It contains the tablegen backend that emits the decoder functions for // It contains the tablegen backend that emits the decoder functions for

// targets with fixed length instruction set. // targets with fixed length instruction set.

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "CodeGenInstruction.h" #include "CodeGenInstruction.h"

#include "CodeGenTarget.h" #include "CodeGenTarget.h"

#include "InfoByHwMode.h" #include "InfoByHwMode.h"

#include "VarLenCodeEmitterGen.h"

#include "llvm/ADT/APInt.h" #include "llvm/ADT/APInt.h"

#include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/ArrayRef.h"

#include "llvm/ADT/CachedHashString.h" #include "llvm/ADT/CachedHashString.h"

#include "llvm/ADT/STLExtras.h" #include "llvm/ADT/STLExtras.h"

#include "llvm/ADT/SetVector.h" #include "llvm/ADT/SetVector.h"

#include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallString.h"

#include "llvm/ADT/Statistic.h" #include "llvm/ADT/Statistic.h"

#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringExtras.h"

▲ Show 20 Lines • Show All 113 Lines • ▼ Show 20 Lines FixedLenDecoderEmitter(RecordKeeper &R, std::string PredicateNamespace,

GuardPrefix(std::move(GPrefix)), GuardPostfix(std::move(GPostfix)), GuardPrefix(std::move(GPrefix)), GuardPostfix(std::move(GPostfix)),

ReturnOK(std::move(ROK)), ReturnFail(std::move(RFail)), ReturnOK(std::move(ROK)), ReturnFail(std::move(RFail)),

Locals(std::move(L)) {} Locals(std::move(L)) {}

// Emit the decoder state machine table. // Emit the decoder state machine table.

void emitTable(formatted_raw_ostream &o, DecoderTable &Table, void emitTable(formatted_raw_ostream &o, DecoderTable &Table,

unsigned Indentation, unsigned BitWidth, unsigned Indentation, unsigned BitWidth,

StringRef Namespace) const; StringRef Namespace) const;

void emitInstrLenTable(formatted_raw_ostream &OS,

std::vector<unsigned> &InstrLen) const;

void emitPredicateFunction(formatted_raw_ostream &OS, void emitPredicateFunction(formatted_raw_ostream &OS,

PredicateSet &Predicates, PredicateSet &Predicates,

unsigned Indentation) const; unsigned Indentation) const;

void emitDecoderFunction(formatted_raw_ostream &OS, void emitDecoderFunction(formatted_raw_ostream &OS,

DecoderSet &Decoders, DecoderSet &Decoders,

unsigned Indentation) const; unsigned Indentation) const;

// run - Output the code emitter // run - Output the code emitter

▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines case BIT_UNSET:

break; break;

default: default:

llvm_unreachable("unexpected return value from bitFromBits"); llvm_unreachable("unexpected return value from bitFromBits");

} }

static BitsInit &getBitsField(const Record &def, StringRef str) { static BitsInit &getBitsField(const Record &def, StringRef str) {

BitsInit *bits = def.getValueAsBitsInit(str); const RecordVal *RV = def.getValue(str);

return *bits; if (BitsInit *Bits = dyn_cast<BitsInit>(RV->getValue()))

myhsuUnsubmitted

Done

const RecordVal *RV = def.getValue(str);

- if (BitsInit *bits = dyn_cast<BitsInit>(RV->getValue()))

+ if (BitsInit *Bits = dyn_cast<BitsInit>(RV->getValue()))

return *bits;

local variables should start with uppercase

myhsu: local variables should start with uppercase

0x59616e (Author)

Done

Just wondering: some variables in the original code start with a lowercase letter, for example the parameter "&def" in this function.

Is that a mistake, or is there a special naming rule that I don't know about?

myhsu

Done

It's likely that this code was committed before the LLVM coding style was solidified.

0x59616e (Author)

Done

Ah, that makes sense.

return *Bits;

// variable length instruction

VarLenInst VLI = VarLenInst(cast<DagInit>(RV->getValue()), RV);

myhsu

Done

https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

0x59616e (Author)

Done

Naming is hard.

SmallVector<Init *, 16> Bits;

for (auto &SI : VLI) {

if (const BitsInit *BI = dyn_cast<BitsInit>(SI.Value)) {

for (unsigned Idx = 0U; Idx < BI->getNumBits(); ++Idx) {

Bits.push_back(BI->getBit(Idx));

}

} else if (const BitInit *BI = dyn_cast<BitInit>(SI.Value)) {

Bits.push_back(const_cast<BitInit *>(BI));

} else {

for (unsigned Idx = 0U; Idx < SI.BitWidth; ++Idx)

Bits.push_back(UnsetInit::get());

}

return *BitsInit::get(Bits);

} }
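The variable-length path added to getBitsField above flattens an encoding into a single bit list: literal bits are copied through, and each operand segment contributes as many unset bits as it is wide. A minimal sketch of that idea follows. `Bit`, `Segment`, and `flattenSegments` are hypothetical stand-ins invented for illustration and are not TableGen's `Init` classes or the patch's `VarLenInst`.

```cpp
#include <string>
#include <vector>

// Hypothetical three-state bit, loosely modelled on the emitter's bit_value_t.
enum class Bit { Zero, One, Unset };

// Hypothetical encoding segment: either literal bits or an operand slot.
struct Segment {
  std::string LiteralBits; // e.g. "10"; empty for an operand segment
  unsigned BitWidth;       // width of the segment in bits
};

// Concatenate all segments into one bit pattern. Operand segments are
// unknown until decode time, so they are filled with Bit::Unset.
std::vector<Bit> flattenSegments(const std::vector<Segment> &Segments) {
  std::vector<Bit> Bits;
  for (const Segment &S : Segments) {
    if (!S.LiteralBits.empty()) {
      for (char C : S.LiteralBits)
        Bits.push_back(C == '1' ? Bit::One : Bit::Zero);
    } else {
      for (unsigned Idx = 0; Idx < S.BitWidth; ++Idx)
        Bits.push_back(Bit::Unset);
    }
  }
  return Bits;
}
```

With a two-bit literal followed by a three-bit operand, the result has five entries, of which only the first two are fixed.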

// Representation of the instruction to work on. // Representation of the instruction to work on.

typedef std::vector<bit_value_t> insn_t; typedef std::vector<bit_value_t> insn_t;

namespace { namespace {

static const uint64_t NO_FIXED_SEGMENTS_SENTINEL = -1ULL; static const uint64_t NO_FIXED_SEGMENTS_SENTINEL = -1ULL;

class FilterChooser; class FilterChooser;

/// Filter - Filter works with FilterChooser to produce the decoding tree for /// Filter - Filter works with FilterChooser to produce the decoding tree for

/// the ISA. /// the ISA.

/// ///

/// It is useful to think of a Filter as governing the switch stmts of the /// It is useful to think of a Filter as governing the switch stmts of the

/// decoding tree in a certain level. Each case stmt delegates to an inferior /// decoding tree in a certain level. Each case stmt delegates to an inferior

/// FilterChooser to decide what further decoding logic to employ, or in another /// FilterChooser to decide what further decoding logic to employ, or in another

/// words, what other remaining bits to look at. The FilterChooser eventually /// words, what other remaining bits to look at. The FilterChooser eventually

/// chooses a best Filter to do its job. /// chooses a best Filter to do its job.

/// ///

/// This recursive scheme ends when the number of Opcodes assigned to the /// This recursive scheme ends when the number of Opcodes assigned to the

myhsu

Done

"unknown Init kind" would probably be better.

/// FilterChooser becomes 1 or if there is a conflict. A conflict happens when /// FilterChooser becomes 1 or if there is a conflict. A conflict happens when

/// the Filter/FilterChooser combo does not know how to distinguish among the /// the Filter/FilterChooser combo does not know how to distinguish among the

/// Opcodes assigned. /// Opcodes assigned.

/// ///

/// An example of a conflict is /// An example of a conflict is

/// ///

/// Conflict: /// Conflict:

/// 111101000.00........00010000.... /// 111101000.00........00010000....

(165 lines not shown) public:

void operator=(const FilterChooser &) = delete; void operator=(const FilterChooser &) = delete;

unsigned getBitWidth() const { return BitWidth; } unsigned getBitWidth() const { return BitWidth; }

protected: protected:

// Populates the insn given the uid. // Populates the insn given the uid.

void insnWithID(insn_t &Insn, unsigned Opcode) const { void insnWithID(insn_t &Insn, unsigned Opcode) const {

BitsInit &Bits = getBitsField(*AllInstructions[Opcode].EncodingDef, "Inst"); BitsInit &Bits = getBitsField(*AllInstructions[Opcode].EncodingDef, "Inst");

Insn.resize(BitWidth > Bits.getNumBits() ? BitWidth : Bits.getNumBits(),

BIT_UNSET);

// We may have a SoftFail bitmask, which specifies a mask where an encoding // We may have a SoftFail bitmask, which specifies a mask where an encoding

// may differ from the value in "Inst" and yet still be valid, but the // may differ from the value in "Inst" and yet still be valid, but the

// disassembler should return SoftFail instead of Success. // disassembler should return SoftFail instead of Success.

// //

// This is used for marking UNPREDICTABLE instructions in the ARM world. // This is used for marking UNPREDICTABLE instructions in the ARM world.

const RecordVal *RV = const RecordVal *RV =

AllInstructions[Opcode].EncodingDef->getValue("SoftFail"); AllInstructions[Opcode].EncodingDef->getValue("SoftFail");

const BitsInit *SFBits = RV ? dyn_cast<BitsInit>(RV->getValue()) : nullptr; const BitsInit *SFBits = RV ? dyn_cast<BitsInit>(RV->getValue()) : nullptr;

for (unsigned i = 0; i < BitWidth; ++i) { for (unsigned i = 0; i < Bits.getNumBits(); ++i) {

if (SFBits && bitFromBits(*SFBits, i) == BIT_TRUE) if (SFBits && bitFromBits(*SFBits, i) == BIT_TRUE)

Insn.push_back(BIT_UNSET); Insn[i] = BIT_UNSET;

myhsu

Done

Why do you want to change these three lines? I don't think `RV` is used anywhere else.

0x59616e (Author)

Done

My understanding is that `SFBits` can be a null pointer (take a look at the if statement below), which means SoftFail is optional. But using `getValueAsBitsInit` will trigger an assertion if SoftFail cannot be found.

myhsu

Done

The use of `getValueAsBitsInit` might be wrong, but it's irrelevant to this patch. I think it's better to put this change into a separate patch.
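The distinction discussed in this thread can be sketched without TableGen. The code below is a simplified stand-in, not the real `Record` API: `MiniRecord`, `findBits`, `getBitsOrDie`, and `countSoftFailBits` are hypothetical names. It assumes, as the author describes, that one style of lookup reports a missing field by returning null while the other treats a missing field as a fatal error.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a record: field name -> bit values.
struct MiniRecord {
  std::map<std::string, std::vector<bool>> Fields;

  // Optional lookup: returns nullptr when the field is absent.
  const std::vector<bool> *findBits(const std::string &Name) const {
    auto It = Fields.find(Name);
    return It == Fields.end() ? nullptr : &It->second;
  }

  // Mandatory lookup: a missing field is treated as a programming error.
  const std::vector<bool> &getBitsOrDie(const std::string &Name) const {
    const std::vector<bool> *Bits = findBits(Name);
    assert(Bits && "field not found");
    return *Bits;
  }
};

// Counts the soft-fail bits of an encoding. Encodings without a
// "SoftFail" field are valid and simply have none.
unsigned countSoftFailBits(const MiniRecord &Def) {
  const std::vector<bool> *SFBits = Def.findBits("SoftFail");
  if (!SFBits)
    return 0;
  unsigned Count = 0;
  for (bool Bit : *SFBits)
    Count += Bit ? 1 : 0;
  return Count;
}
```

Under these assumptions, a null check after the lookup is only meaningful with the optional style; with the mandatory style the check could never fire, because the lookup would already have failed.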

else else

Insn.push_back(bitFromBits(Bits, i)); Insn[i] = bitFromBits(Bits, i);

} }

// Emit the name of the encoding/instruction pair. // Emit the name of the encoding/instruction pair.

void emitNameWithID(raw_ostream &OS, unsigned Opcode) const { void emitNameWithID(raw_ostream &OS, unsigned Opcode) const {

const Record *EncodingDef = AllInstructions[Opcode].EncodingDef; const Record *EncodingDef = AllInstructions[Opcode].EncodingDef;

const Record *InstDef = AllInstructions[Opcode].Inst->TheDef; const Record *InstDef = AllInstructions[Opcode].Inst->TheDef;

if (EncodingDef != InstDef) if (EncodingDef != InstDef)

(485 lines not shown) void FixedLenDecoderEmitter::emitTable(formatted_raw_ostream &OS,

} }

OS.indent(Indentation) << "0\n"; OS.indent(Indentation) << "0\n";

Indentation -= 2; Indentation -= 2;

OS.indent(Indentation) << "};\n\n"; OS.indent(Indentation) << "};\n\n";

} }

void FixedLenDecoderEmitter::emitInstrLenTable(

formatted_raw_ostream &OS, std::vector<unsigned> &InstrLen) const {

OS << "static const uint8_t InstrLenTable[] = {\n";

for (unsigned &Len : InstrLen) {

OS << Len << ",\n";

}

OS << "};\n\n";

}

void FixedLenDecoderEmitter:: void FixedLenDecoderEmitter::

emitPredicateFunction(formatted_raw_ostream &OS, PredicateSet &Predicates, emitPredicateFunction(formatted_raw_ostream &OS, PredicateSet &Predicates,

unsigned Indentation) const { unsigned Indentation) const {

// The predicate function is just a big switch statement based on the // The predicate function is just a big switch statement based on the

// input predicate index. // input predicate index.

OS.indent(Indentation) << "static bool checkDecoderPredicate(unsigned Idx, " OS.indent(Indentation) << "static bool checkDecoderPredicate(unsigned Idx, "

<< "const FeatureBitset &Bits) {\n"; << "const FeatureBitset &Bits) {\n";

Indentation += 2; Indentation += 2;

(359 lines not shown) void FilterChooser::emitPredicateTableEntry(DecoderTableInfo &TableInfo,

TableInfo.Table.push_back(0); TableInfo.Table.push_back(0);

} }

void FilterChooser::emitSoftFailTableEntry(DecoderTableInfo &TableInfo, void FilterChooser::emitSoftFailTableEntry(DecoderTableInfo &TableInfo,

unsigned Opc) const { unsigned Opc) const {

const RecordVal *RV = AllInstructions[Opc].EncodingDef->getValue("SoftFail"); const RecordVal *RV = AllInstructions[Opc].EncodingDef->getValue("SoftFail");

BitsInit *SFBits = RV ? dyn_cast<BitsInit>(RV->getValue()) : nullptr; BitsInit *SFBits = RV ? dyn_cast<BitsInit>(RV->getValue()) : nullptr;

myhsu

Done

ditto

0x59616e (Author)

Done

Same reason as above: `SFBits` is optional (otherwise the if statement below would be useless), but `getValueAsBitsInit` does not allow that.

if (!SFBits) return; if (!SFBits) return;

BitsInit *InstBits = BitsInit *InstBits =

AllInstructions[Opc].EncodingDef->getValueAsBitsInit("Inst"); AllInstructions[Opc].EncodingDef->getValueAsBitsInit("Inst");

APInt PositiveMask(BitWidth, 0ULL); APInt PositiveMask(BitWidth, 0ULL);

APInt NegativeMask(BitWidth, 0ULL); APInt NegativeMask(BitWidth, 0ULL);

for (unsigned i = 0; i < BitWidth; ++i) { for (unsigned i = 0; i < BitWidth; ++i) {

(459 lines not shown) for (auto Opcode : Opcodes) {

errs() << " "; errs() << " ";

dumpBits( dumpBits(

errs(), errs(),

getBitsField(*AllInstructions[Opcode.EncodingID].EncodingDef, "Inst")); getBitsField(*AllInstructions[Opcode.EncodingID].EncodingDef, "Inst"));

errs() << '\n'; errs() << '\n';

} }

static std::string findOperandDecoderMethod(TypedInit *TI) { static std::string findOperandDecoderMethod(Record *Record) {

std::string Decoder; std::string Decoder;

Record *Record = cast<DefInit>(TI)->getDef();

RecordVal *DecoderString = Record->getValue("DecoderMethod"); RecordVal *DecoderString = Record->getValue("DecoderMethod");

StringInit *String = DecoderString ? StringInit *String = DecoderString ?

dyn_cast<StringInit>(DecoderString->getValue()) : nullptr; dyn_cast<StringInit>(DecoderString->getValue()) : nullptr;

if (String) { if (String) {

Decoder = std::string(String->getValue()); Decoder = std::string(String->getValue());

if (!Decoder.empty()) if (!Decoder.empty())

return Decoder; return Decoder;

} }

if (Record->isSubClassOf("RegisterOperand")) if (Record->isSubClassOf("RegisterOperand"))

Record = Record->getValueAsDef("RegClass"); Record = Record->getValueAsDef("RegClass");

if (Record->isSubClassOf("RegisterClass")) { if (Record->isSubClassOf("RegisterClass")) {

Decoder = "Decode" + Record->getName().str() + "RegisterClass"; Decoder = "Decode" + Record->getName().str() + "RegisterClass";

} else if (Record->isSubClassOf("PointerLikeRegClass")) { } else if (Record->isSubClassOf("PointerLikeRegClass")) {

Decoder = "DecodePointerLikeRegClass" + Decoder = "DecodePointerLikeRegClass" +

utostr(Record->getValueAsInt("RegClassKind")); utostr(Record->getValueAsInt("RegClassKind"));

} }

return Decoder; return Decoder;

} }

static bool OperandInfo getOpInfo(Record *TypeRecord) {

std::string Decoder = findOperandDecoderMethod(TypeRecord);

RecordVal *HasCompleteDecoderVal = TypeRecord->getValue("hasCompleteDecoder");

BitInit *HasCompleteDecoderBit =

HasCompleteDecoderVal

? dyn_cast<BitInit>(HasCompleteDecoderVal->getValue())

: nullptr;

bool HasCompleteDecoder =

HasCompleteDecoderBit ? HasCompleteDecoderBit->getValue() : true;

return OperandInfo(Decoder, HasCompleteDecoder);

}

void parseVarLenInstOperand(const Record &Def,

std::vector<OperandInfo> &Operands,

const CodeGenInstruction &CGI) {

const RecordVal *RV = Def.getValue("Inst");

VarLenInst VLI(cast<DagInit>(RV->getValue()), RV);

SmallVector<int> TiedTo;

myhsu

Done

I'm not sure a non-power-of-two number of (inlined) elements for SmallVector is recommended. If you're not sure what number to put, just use `SmallVector<int>` as recommended here.

for (unsigned Idx = 0; Idx < CGI.Operands.size(); ++Idx) {

auto &Op = CGI.Operands[Idx];

if (Op.MIOperandInfo && Op.MIOperandInfo->getNumArgs() > 0)

for (auto *Arg : Op.MIOperandInfo->getArgs())

Operands.push_back(getOpInfo(cast<DefInit>(Arg)->getDef()));

else

myhsu

Done

Braces here can be removed.

myhsu

Done

I meant only the braces around the for-loop; your current syntax might create ambiguity. Here is the guideline: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
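One way to avoid the ambiguity mentioned in this thread is sketched below. It is an illustration only, not the patch's code: `Operand`, `SubOps`, and `flattenOperands` are hypothetical names. The single-statement inner loop stays unbraced, while the enclosing if/else is braced so that it is obvious which statement the else belongs to.

```cpp
#include <string>
#include <vector>

// Hypothetical operand: either a plain operand or a group of sub-operands.
struct Operand {
  std::string Name;
  std::vector<std::string> SubOps;
};

// Expands grouped operands into their sub-operands, keeping plain ones as-is.
std::vector<std::string> flattenOperands(const std::vector<Operand> &Ops) {
  std::vector<std::string> Flat;
  for (const Operand &Op : Ops) {
    // The if-body is itself a loop, so the if/else pair is braced to make
    // the pairing of the else explicit.
    if (!Op.SubOps.empty()) {
      for (const std::string &Sub : Op.SubOps)
        Flat.push_back(Sub);
    } else {
      Flat.push_back(Op.Name);
    }
  }
  return Flat;
}
```

For a plain operand followed by a two-part operand, the result lists the plain operand's name and then the two sub-operand names in order.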

Operands.push_back(getOpInfo(Op.Rec));

int TiedReg = Op.getTiedRegister();

TiedTo.push_back(-1);

if (TiedReg != -1) {

TiedTo[Idx] = TiedReg;

TiedTo[TiedReg] = Idx;

}

unsigned CurrBitPos = 0;

for (auto &EncodingSegment : VLI) {

unsigned Offset = 0;

StringRef OpName;

if (const StringInit *SI = dyn_cast<StringInit>(EncodingSegment.Value)) {

OpName = SI->getValue();

} else if (const DagInit *DI = dyn_cast<DagInit>(EncodingSegment.Value)) {

OpName = cast<StringInit>(DI->getArg(0))->getValue();

Offset = cast<IntInit>(DI->getArg(2))->getValue();

}

if (!OpName.empty()) {

auto OpSubOpPair =

const_cast<CodeGenInstruction &>(CGI).Operands.ParseOperandName(

myhsu

Done

Please use `!OpName.empty()` or `OpName.size()`.

OpName);

unsigned OpIdx = CGI.Operands.getFlattenedOperandNumber(OpSubOpPair);

Operands[OpIdx].addField(CurrBitPos, EncodingSegment.BitWidth, Offset);

int TiedReg = TiedTo[OpSubOpPair.first];

if (TiedReg != -1) {

unsigned OpIdx = CGI.Operands.getFlattenedOperandNumber(

std::make_pair(TiedReg, OpSubOpPair.second));

Operands[OpIdx].addField(CurrBitPos, EncodingSegment.BitWidth, Offset);

}

CurrBitPos += EncodingSegment.BitWidth;

}

static unsigned

populateInstruction(CodeGenTarget &Target, const Record &EncodingDef, populateInstruction(CodeGenTarget &Target, const Record &EncodingDef,

const CodeGenInstruction &CGI, unsigned Opc, const CodeGenInstruction &CGI, unsigned Opc,

std::map<unsigned, std::vector<OperandInfo>> &Operands) { std::map<unsigned, std::vector<OperandInfo>> &Operands,

bool IsVarLenInst) {

const Record &Def = *CGI.TheDef; const Record &Def = *CGI.TheDef;

// If all the bit positions are not specified; do not decode this instruction. // If all the bit positions are not specified; do not decode this instruction.

// We are bound to fail! For proper disassembly, the well-known encoding bits // We are bound to fail! For proper disassembly, the well-known encoding bits

// of the instruction must be fully specified. // of the instruction must be fully specified.

BitsInit &Bits = getBitsField(EncodingDef, "Inst"); BitsInit &Bits = getBitsField(EncodingDef, "Inst");

if (Bits.allInComplete()) return false; if (Bits.allInComplete())

return 0;

std::vector<OperandInfo> InsnOperands; std::vector<OperandInfo> InsnOperands;

// If the instruction has specified a custom decoding hook, use that instead // If the instruction has specified a custom decoding hook, use that instead

// of trying to auto-generate the decoder. // of trying to auto-generate the decoder.

StringRef InstDecoder = EncodingDef.getValueAsString("DecoderMethod"); StringRef InstDecoder = EncodingDef.getValueAsString("DecoderMethod");

if (InstDecoder != "") { if (InstDecoder != "") {

bool HasCompleteInstDecoder = EncodingDef.getValueAsBit("hasCompleteDecoder"); bool HasCompleteInstDecoder = EncodingDef.getValueAsBit("hasCompleteDecoder");

InsnOperands.push_back( InsnOperands.push_back(

OperandInfo(std::string(InstDecoder), HasCompleteInstDecoder)); OperandInfo(std::string(InstDecoder), HasCompleteInstDecoder));

Operands[Opc] = InsnOperands; Operands[Opc] = InsnOperands;

return true; return Bits.getNumBits();

} }

// Generate a description of the operand of the instruction that we know // Generate a description of the operand of the instruction that we know

// how to decode automatically. // how to decode automatically.

// FIXME: We'll need to have a way to manually override this as needed. // FIXME: We'll need to have a way to manually override this as needed.

// Gather the outputs/inputs of the instruction, so we can find their // Gather the outputs/inputs of the instruction, so we can find their

// positions in the encoding. This assumes for now that they appear in the // positions in the encoding. This assumes for now that they appear in the

// MCInst in the order that they're listed. // MCInst in the order that they're listed.

std::vector<std::pair<Init*, StringRef>> InOutOperands; std::vector<std::pair<Init*, StringRef>> InOutOperands;

DagInit *Out = Def.getValueAsDag("OutOperandList"); DagInit *Out = Def.getValueAsDag("OutOperandList");

DagInit *In = Def.getValueAsDag("InOperandList"); DagInit *In = Def.getValueAsDag("InOperandList");

for (unsigned i = 0; i < Out->getNumArgs(); ++i) for (unsigned i = 0; i < Out->getNumArgs(); ++i)

InOutOperands.push_back(std::make_pair(Out->getArg(i), InOutOperands.push_back(

Out->getArgNameStr(i))); std::make_pair(Out->getArg(i), Out->getArgNameStr(i)));

myhsu

Done

Did you use `clang-format-diff.py` rather than clang-format? I wonder why this line changed...

0x59616e (Author)

Done

I've tried, and it remains the same.

BTW I use "git clang-format". Is there any difference between these two?

myhsu

Not Done

> I've tried, and it remains the same.

That's fine, just a minor issue.

> BTW I use "git clang-format". is there any difference between these two ?

I don't think so.

for (unsigned i = 0; i < In->getNumArgs(); ++i) for (unsigned i = 0; i < In->getNumArgs(); ++i)

InOutOperands.push_back(std::make_pair(In->getArg(i), InOutOperands.push_back(

In->getArgNameStr(i))); std::make_pair(In->getArg(i), In->getArgNameStr(i)));

// Search for tied operands, so that we can correctly instantiate // Search for tied operands, so that we can correctly instantiate

// operands that are not explicitly represented in the encoding. // operands that are not explicitly represented in the encoding.

std::map<std::string, std::string> TiedNames; std::map<std::string, std::string> TiedNames;

for (unsigned i = 0; i < CGI.Operands.size(); ++i) { for (unsigned i = 0; i < CGI.Operands.size(); ++i) {

int tiedTo = CGI.Operands[i].getTiedRegister(); int tiedTo = CGI.Operands[i].getTiedRegister();

if (tiedTo != -1) { if (tiedTo != -1) {

std::pair<unsigned, unsigned> SO = std::pair<unsigned, unsigned> SO =

CGI.Operands.getSubOperandNumber(tiedTo); CGI.Operands.getSubOperandNumber(tiedTo);

TiedNames[std::string(InOutOperands[i].second)] = TiedNames[std::string(InOutOperands[i].second)] =

std::string(InOutOperands[SO.first].second); std::string(InOutOperands[SO.first].second);

TiedNames[std::string(InOutOperands[SO.first].second)] = TiedNames[std::string(InOutOperands[SO.first].second)] =

std::string(InOutOperands[i].second); std::string(InOutOperands[i].second);

} }

if (IsVarLenInst) {

parseVarLenInstOperand(EncodingDef, InsnOperands, CGI);

} else {

std::map<std::string, std::vector<OperandInfo>> NumberedInsnOperands; std::map<std::string, std::vector<OperandInfo>> NumberedInsnOperands;

std::set<std::string> NumberedInsnOperandsNoTie; std::set<std::string> NumberedInsnOperandsNoTie;

if (Target.getInstructionSet()-> if (Target.getInstructionSet()->getValueAsBit(

getValueAsBit("decodePositionallyEncodedOperands")) { "decodePositionallyEncodedOperands")) {

const std::vector<RecordVal> &Vals = Def.getValues(); const std::vector<RecordVal> &Vals = Def.getValues();

unsigned NumberedOp = 0; unsigned NumberedOp = 0;

std::set<unsigned> NamedOpIndices; std::set<unsigned> NamedOpIndices;

if (Target.getInstructionSet()-> if (Target.getInstructionSet()->getValueAsBit(

getValueAsBit("noNamedPositionallyEncodedOperands")) "noNamedPositionallyEncodedOperands"))

// Collect the set of operand indices that might correspond to named // Collect the set of operand indices that might correspond to named

// operand, and skip these when assigning operands based on position. // operand, and skip these when assigning operands based on position.

for (unsigned i = 0, e = Vals.size(); i != e; ++i) { for (unsigned i = 0, e = Vals.size(); i != e; ++i) {

unsigned OpIdx; unsigned OpIdx;

if (!CGI.Operands.hasOperandNamed(Vals[i].getName(), OpIdx)) if (!CGI.Operands.hasOperandNamed(Vals[i].getName(), OpIdx))

continue; continue;

NamedOpIndices.insert(OpIdx); NamedOpIndices.insert(OpIdx);

} }

for (unsigned i = 0, e = Vals.size(); i != e; ++i) { for (unsigned i = 0, e = Vals.size(); i != e; ++i) {

// Ignore fixed fields in the record, we're looking for values like: // Ignore fixed fields in the record, we're looking for values like:

// bits<5> RST = { ?, ?, ?, ?, ? }; // bits<5> RST = { ?, ?, ?, ?, ? };

if (Vals[i].isNonconcreteOK() || Vals[i].getValue()->isComplete()) if (Vals[i].isNonconcreteOK() || Vals[i].getValue()->isComplete())

continue; continue;

// Determine if Vals[i] actually contributes to the Inst encoding. // Determine if Vals[i] actually contributes to the Inst encoding.

unsigned bi = 0; unsigned bi = 0;

for (; bi < Bits.getNumBits(); ++bi) { for (; bi < Bits.getNumBits(); ++bi) {

VarInit *Var = nullptr; VarInit *Var = nullptr;

VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi)); VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi));

if (BI) if (BI)

Var = dyn_cast<VarInit>(BI->getBitVar()); Var = dyn_cast<VarInit>(BI->getBitVar());

else else

Var = dyn_cast<VarInit>(Bits.getBit(bi)); Var = dyn_cast<VarInit>(Bits.getBit(bi));

if (Var && Var->getName() == Vals[i].getName()) if (Var && Var->getName() == Vals[i].getName())

break; break;

} }

if (bi == Bits.getNumBits()) if (bi == Bits.getNumBits())

continue; continue;

// Skip variables that correspond to explicitly-named operands. // Skip variables that correspond to explicitly-named operands.

unsigned OpIdx; unsigned OpIdx;

if (CGI.Operands.hasOperandNamed(Vals[i].getName(), OpIdx)) if (CGI.Operands.hasOperandNamed(Vals[i].getName(), OpIdx))

continue; continue;

// Get the bit range for this operand: // Get the bit range for this operand:

unsigned bitStart = bi++, bitWidth = 1; unsigned bitStart = bi++, bitWidth = 1;

for (; bi < Bits.getNumBits(); ++bi) { for (; bi < Bits.getNumBits(); ++bi) {

VarInit *Var = nullptr; VarInit *Var = nullptr;

VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi)); VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi));

if (BI) if (BI)

Var = dyn_cast<VarInit>(BI->getBitVar()); Var = dyn_cast<VarInit>(BI->getBitVar());

else else

Var = dyn_cast<VarInit>(Bits.getBit(bi)); Var = dyn_cast<VarInit>(Bits.getBit(bi));

if (!Var) if (!Var)

break; break;

if (Var->getName() != Vals[i].getName()) if (Var->getName() != Vals[i].getName())

break; break;

++bitWidth; ++bitWidth;

} }

unsigned NumberOps = CGI.Operands.size(); unsigned NumberOps = CGI.Operands.size();

while (NumberedOp < NumberOps && while (NumberedOp < NumberOps &&

(CGI.Operands.isFlatOperandNotEmitted(NumberedOp) || (CGI.Operands.isFlatOperandNotEmitted(NumberedOp) ||

(!NamedOpIndices.empty() && NamedOpIndices.count( (!NamedOpIndices.empty() &&

NamedOpIndices.count(

CGI.Operands.getSubOperandNumber(NumberedOp).first)))) CGI.Operands.getSubOperandNumber(NumberedOp).first))))

++NumberedOp; ++NumberedOp;

OpIdx = NumberedOp++; OpIdx = NumberedOp++;

// OpIdx now holds the ordered operand number of Vals[i]. // OpIdx now holds the ordered operand number of Vals[i].

std::pair<unsigned, unsigned> SO = std::pair<unsigned, unsigned> SO =

CGI.Operands.getSubOperandNumber(OpIdx); CGI.Operands.getSubOperandNumber(OpIdx);

const std::string &Name = CGI.Operands[SO.first].Name; const std::string &Name = CGI.Operands[SO.first].Name;

LLVM_DEBUG(dbgs() << "Numbered operand mapping for " << Def.getName() LLVM_DEBUG(dbgs() << "Numbered operand mapping for " << Def.getName()

<< ": " << Name << "(" << SO.first << ", " << SO.second << ": " << Name << "(" << SO.first << ", "

<< ") => " << Vals[i].getName() << "\n"); << SO.second << ") => " << Vals[i].getName() << "\n");

std::string Decoder; std::string Decoder;

Record *TypeRecord = CGI.Operands[SO.first].Rec; Record *TypeRecord = CGI.Operands[SO.first].Rec;

RecordVal *DecoderString = TypeRecord->getValue("DecoderMethod"); RecordVal *DecoderString = TypeRecord->getValue("DecoderMethod");

StringInit *String = DecoderString ? StringInit *String =

dyn_cast<StringInit>(DecoderString->getValue()) : nullptr; DecoderString ? dyn_cast<StringInit>(DecoderString->getValue())

: nullptr;

if (String && String->getValue() != "") if (String && String->getValue() != "")

Decoder = std::string(String->getValue()); Decoder = std::string(String->getValue());

if (Decoder == "" && if (Decoder == "" && CGI.Operands[SO.first].MIOperandInfo &&

CGI.Operands[SO.first].MIOperandInfo &&

CGI.Operands[SO.first].MIOperandInfo->getNumArgs()) { CGI.Operands[SO.first].MIOperandInfo->getNumArgs()) {

Init *Arg = CGI.Operands[SO.first].MIOperandInfo-> Init *Arg = CGI.Operands[SO.first].MIOperandInfo->getArg(SO.second);

getArg(SO.second);

if (DefInit *DI = cast<DefInit>(Arg)) if (DefInit *DI = cast<DefInit>(Arg))

TypeRecord = DI->getDef(); TypeRecord = DI->getDef();

} }

bool isReg = false; bool isReg = false;

if (TypeRecord->isSubClassOf("RegisterOperand")) if (TypeRecord->isSubClassOf("RegisterOperand"))

TypeRecord = TypeRecord->getValueAsDef("RegClass"); TypeRecord = TypeRecord->getValueAsDef("RegClass");

if (TypeRecord->isSubClassOf("RegisterClass")) { if (TypeRecord->isSubClassOf("RegisterClass")) {

Decoder = "Decode" + TypeRecord->getName().str() + "RegisterClass"; Decoder = "Decode" + TypeRecord->getName().str() + "RegisterClass";

isReg = true; isReg = true;

} else if (TypeRecord->isSubClassOf("PointerLikeRegClass")) { } else if (TypeRecord->isSubClassOf("PointerLikeRegClass")) {

Decoder = "DecodePointerLikeRegClass" + Decoder = "DecodePointerLikeRegClass" +

utostr(TypeRecord->getValueAsInt("RegClassKind")); utostr(TypeRecord->getValueAsInt("RegClassKind"));

isReg = true; isReg = true;

} }

DecoderString = TypeRecord->getValue("DecoderMethod"); DecoderString = TypeRecord->getValue("DecoderMethod");

String = DecoderString ? String = DecoderString ? dyn_cast<StringInit>(DecoderString->getValue())

dyn_cast<StringInit>(DecoderString->getValue()) : nullptr; : nullptr;

if (!isReg && String && String->getValue() != "") if (!isReg && String && String->getValue() != "")

Decoder = std::string(String->getValue()); Decoder = std::string(String->getValue());

RecordVal *HasCompleteDecoderVal = RecordVal *HasCompleteDecoderVal =

TypeRecord->getValue("hasCompleteDecoder"); TypeRecord->getValue("hasCompleteDecoder");

BitInit *HasCompleteDecoderBit = HasCompleteDecoderVal ? BitInit *HasCompleteDecoderBit =

dyn_cast<BitInit>(HasCompleteDecoderVal->getValue()) : nullptr; HasCompleteDecoderVal

bool HasCompleteDecoder = HasCompleteDecoderBit ? ? dyn_cast<BitInit>(HasCompleteDecoderVal->getValue())

HasCompleteDecoderBit->getValue() : true; : nullptr;

bool HasCompleteDecoder =

HasCompleteDecoderBit ? HasCompleteDecoderBit->getValue() : true;

OperandInfo OpInfo(Decoder, HasCompleteDecoder); OperandInfo OpInfo(Decoder, HasCompleteDecoder);

OpInfo.addField(bitStart, bitWidth, 0); OpInfo.addField(bitStart, bitWidth, 0);

NumberedInsnOperands[Name].push_back(OpInfo); NumberedInsnOperands[Name].push_back(OpInfo);

// FIXME: For complex operands with custom decoders we can't handle tied // FIXME: For complex operands with custom decoders we can't handle tied

// sub-operands automatically. Skip those here and assume that this is // sub-operands automatically. Skip those here and assume that this is

// fixed up elsewhere. // fixed up elsewhere.

if (CGI.Operands[SO.first].MIOperandInfo && if (CGI.Operands[SO.first].MIOperandInfo &&

CGI.Operands[SO.first].MIOperandInfo->getNumArgs() > 1 && CGI.Operands[SO.first].MIOperandInfo->getNumArgs() > 1 && String &&

String && String->getValue() != "") String->getValue() != "")

NumberedInsnOperandsNoTie.insert(Name); NumberedInsnOperandsNoTie.insert(Name);

} }

// For each operand, see if we can figure out where it is encoded. // For each operand, see if we can figure out where it is encoded.

for (const auto &Op : InOutOperands) { for (const auto &Op : InOutOperands) {

if (!NumberedInsnOperands[std::string(Op.second)].empty()) { if (!NumberedInsnOperands[std::string(Op.second)].empty()) {

llvm::append_range(InsnOperands, llvm::append_range(InsnOperands,

NumberedInsnOperands[std::string(Op.second)]); NumberedInsnOperands[std::string(Op.second)]);

continue; continue;

} }

if (!NumberedInsnOperands[TiedNames[std::string(Op.second)]].empty()) { if (!NumberedInsnOperands[TiedNames[std::string(Op.second)]].empty()) {

if (!NumberedInsnOperandsNoTie.count(TiedNames[std::string(Op.second)])) { if (!NumberedInsnOperandsNoTie.count(

TiedNames[std::string(Op.second)])) {

// Figure out to which (sub)operand we're tied. // Figure out to which (sub)operand we're tied.

unsigned i = unsigned i =

CGI.Operands.getOperandNamed(TiedNames[std::string(Op.second)]); CGI.Operands.getOperandNamed(TiedNames[std::string(Op.second)]);

int tiedTo = CGI.Operands[i].getTiedRegister(); int tiedTo = CGI.Operands[i].getTiedRegister();

if (tiedTo == -1) { if (tiedTo == -1) {

i = CGI.Operands.getOperandNamed(Op.second); i = CGI.Operands.getOperandNamed(Op.second);

tiedTo = CGI.Operands[i].getTiedRegister(); tiedTo = CGI.Operands[i].getTiedRegister();

} }

if (tiedTo != -1) { if (tiedTo != -1) {

std::pair<unsigned, unsigned> SO = std::pair<unsigned, unsigned> SO =

CGI.Operands.getSubOperandNumber(tiedTo); CGI.Operands.getSubOperandNumber(tiedTo);

InsnOperands.push_back( InsnOperands.push_back(

NumberedInsnOperands[TiedNames[std::string(Op.second)]] NumberedInsnOperands[TiedNames[std::string(Op.second)]]

[SO.second]); [SO.second]);

} }

continue; continue;

} }

TypedInit *TI = cast<TypedInit>(Op.first);

// At this point, we can locate the decoder field, but we need to know how // At this point, we can locate the decoder field, but we need to know how

// to interpret it. As a first step, require the target to provide // to interpret it. As a first step, require the target to provide

// callbacks for decoding register classes. // callbacks for decoding register classes.

std::string Decoder = findOperandDecoderMethod(TI);

Record *TypeRecord = cast<DefInit>(TI)->getDef();

RecordVal *HasCompleteDecoderVal =

TypeRecord->getValue("hasCompleteDecoder");

BitInit *HasCompleteDecoderBit = HasCompleteDecoderVal ?

dyn_cast<BitInit>(HasCompleteDecoderVal->getValue()) : nullptr;

bool HasCompleteDecoder = HasCompleteDecoderBit ?

HasCompleteDecoderBit->getValue() : true;

OperandInfo OpInfo(Decoder, HasCompleteDecoder); OperandInfo OpInfo = getOpInfo(cast<DefInit>(Op.first)->getDef());

// Some bits of the operand may be required to be 1 depending on the // Some bits of the operand may be required to be 1 depending on the

// instruction's encoding. Collect those bits. // instruction's encoding. Collect those bits.

if (const RecordVal *EncodedValue = EncodingDef.getValue(Op.second)) if (const RecordVal *EncodedValue = EncodingDef.getValue(Op.second))

if (const BitsInit *OpBits = dyn_cast<BitsInit>(EncodedValue->getValue())) if (const BitsInit *OpBits =

dyn_cast<BitsInit>(EncodedValue->getValue()))

for (unsigned I = 0; I < OpBits->getNumBits(); ++I) for (unsigned I = 0; I < OpBits->getNumBits(); ++I)

if (const BitInit *OpBit = dyn_cast<BitInit>(OpBits->getBit(I))) if (const BitInit *OpBit = dyn_cast<BitInit>(OpBits->getBit(I)))

if (OpBit->getValue()) if (OpBit->getValue())

OpInfo.InitValue |= 1ULL << I; OpInfo.InitValue |= 1ULL << I;

unsigned Base = ~0U; unsigned Base = ~0U;

unsigned Width = 0; unsigned Width = 0;

unsigned Offset = 0; unsigned Offset = 0;

for (unsigned bi = 0; bi < Bits.getNumBits(); ++bi) { for (unsigned bi = 0; bi < Bits.getNumBits(); ++bi) {

VarInit *Var = nullptr; VarInit *Var = nullptr;

VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi)); VarBitInit *BI = dyn_cast<VarBitInit>(Bits.getBit(bi));

if (BI) if (BI)

Var = dyn_cast<VarInit>(BI->getBitVar()); Var = dyn_cast<VarInit>(BI->getBitVar());

else else

Var = dyn_cast<VarInit>(Bits.getBit(bi)); Var = dyn_cast<VarInit>(Bits.getBit(bi));

myhsuUnsubmitted

Done

The `OI` argument can be `OperandInfo &` to avoid a copy.

if (!Var) { if (!Var) {

if (Base != ~0U) { if (Base != ~0U) {

OpInfo.addField(Base, Width, Offset); OpInfo.addField(Base, Width, Offset);

Base = ~0U; Base = ~0U;

Width = 0; Width = 0;

Offset = 0; Offset = 0;

} }

continue; continue;

} }

if (Var->getName() != Op.second && if ((Var->getName() != Op.second &&

Var->getName() != TiedNames[std::string(Op.second)]) { Var->getName() != TiedNames[std::string(Op.second)])) {

if (Base != ~0U) { if (Base != ~0U) {

OpInfo.addField(Base, Width, Offset); OpInfo.addField(Base, Width, Offset);

Base = ~0U; Base = ~0U;

Width = 0; Width = 0;

Offset = 0; Offset = 0;

} }

continue; continue;

} }

if (Base == ~0U) { if (Base == ~0U) {

Base = bi; Base = bi;

Width = 1; Width = 1;

Offset = BI ? BI->getBitNum() : 0; Offset = BI ? BI->getBitNum() : 0;

} else if (BI && BI->getBitNum() != Offset + Width) { } else if (BI && BI->getBitNum() != Offset + Width) {

OpInfo.addField(Base, Width, Offset); OpInfo.addField(Base, Width, Offset);

Base = bi; Base = bi;

Width = 1; Width = 1;

Offset = BI->getBitNum(); Offset = BI->getBitNum();

} else { } else {

++Width; ++Width;

} }

if (Base != ~0U) if (Base != ~0U)

OpInfo.addField(Base, Width, Offset); OpInfo.addField(Base, Width, Offset);

if (OpInfo.numFields() > 0) if (OpInfo.numFields() > 0)

InsnOperands.push_back(OpInfo); InsnOperands.push_back(OpInfo);

} }

}

Operands[Opc] = InsnOperands; Operands[Opc] = InsnOperands;

#if 0 #if 0

LLVM_DEBUG({ LLVM_DEBUG({

// Dumps the instruction encoding bits. // Dumps the instruction encoding bits.

dumpBits(errs(), Bits); dumpBits(errs(), Bits);

errs() << '\n'; errs() << '\n';

// Dumps the list of operand info. // Dumps the list of operand info.

for (unsigned i = 0, e = CGI.Operands.size(); i != e; ++i) { for (unsigned i = 0, e = CGI.Operands.size(); i != e; ++i) {

const CGIOperandList::OperandInfo &Info = CGI.Operands[i]; const CGIOperandList::OperandInfo &Info = CGI.Operands[i];

const std::string &OperandName = Info.Name; const std::string &OperandName = Info.Name;

const Record &OperandDef = *Info.Rec; const Record &OperandDef = *Info.Rec;

errs() << "\t" << OperandName << " (" << OperandDef.getName() << ")\n"; errs() << "\t" << OperandName << " (" << OperandDef.getName() << ")\n";

} }

}); });

#endif #endif

return true; return Bits.getNumBits();

} }

// emitFieldFromInstruction - Emit the templated helper function // emitFieldFromInstruction - Emit the templated helper function

// fieldFromInstruction(). // fieldFromInstruction().

// On Windows we make sure that this function is not inlined when // On Windows we make sure that this function is not inlined when

// using the VS compiler. It has a bug which causes the function // using the VS compiler. It has a bug which causes the function

// to be optimized out in some circustances. See llvm.org/pr38292 // to be optimized out in some circustances. See llvm.org/pr38292

static void emitFieldFromInstruction(formatted_raw_ostream &OS) { static void emitFieldFromInstruction(formatted_raw_ostream &OS) {

Show All 29 Lines OS << "// Helper functions for extracting fields from encoded instructions.\n"

<< "}\n" << "}\n"

<< "\n" << "\n"

<< "template <typename InsnType>\n" << "template <typename InsnType>\n"

<< "static std::enable_if_t<!std::is_integral<InsnType>::value, " << "static std::enable_if_t<!std::is_integral<InsnType>::value, "

"uint64_t>\n" "uint64_t>\n"

<< "fieldFromInstruction(const InsnType &insn, unsigned startBit,\n" << "fieldFromInstruction(const InsnType &insn, unsigned startBit,\n"

<< " unsigned numBits) {\n" << " unsigned numBits) {\n"

<< " return insn.extractBitsAsZExtValue(numBits, startBit);\n" << " return insn.extractBitsAsZExtValue(numBits, startBit);\n"

<< "}\n\n"; << "}\n\n";

myhsuUnsubmitted

Done

Please use `llvm::function_ref` here.

} }

// emitInsertBits - Emit the templated helper function insertBits(). // emitInsertBits - Emit the templated helper function insertBits().

myhsuUnsubmitted

Done

Why do we want to enlarge `insn` (on demand) on every field extraction? Can we resize `insn` ahead of time and do it only once?

0x59616eAuthorUnsubmitted

Done

In some cases, for example `case MCD::OPC_CheckField`, the decoder needs to extract a bit field before we know the length of the instruction.

IIRC only two cases need this: `OPC_ExtractField` and `OPC_CheckField`. I did this in `fieldFromInstruction` only because I'm lazy. I will fix this in the next diff.

As for whether we can resize `insn` ahead of time and do it only once: I think the answer is no. We have to know the length to resize `insn`, so we can do this only in `case MCD::OPC_Decode`.
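To illustrate the on-demand extension discussed in this thread, here is a minimal, self-contained sketch. `BitBuffer` and `makeFiller` are hypothetical stand-ins (the patch itself uses `APInt` and a `makeUp` callback of type `void(APInt &, uint64_t)`), and zero-padding past the end of the byte stream is an assumption of this sketch rather than something the patch specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for llvm::APInt: a growable, little-endian bit buffer.
struct BitBuffer {
  std::vector<bool> Bits;

  // Extract Len bits starting at bit Start, zero-extended to 64 bits.
  uint64_t extract(unsigned Start, unsigned Len) const {
    uint64_t V = 0;
    for (unsigned I = 0; I < Len; ++I)
      if (Bits[Start + I])
        V |= uint64_t(1) << I;
    return V;
  }
};

// Build a callback in the spirit of makeUp: it appends whole bytes from the
// stream until the buffer holds at least NeededBits bits. Bytes must outlive
// the returned callback. Reads past the end of the stream are padded with
// zeros (an assumption made only for this sketch).
inline std::function<void(BitBuffer &, uint64_t)>
makeFiller(const std::vector<uint8_t> &Bytes) {
  return [&Bytes](BitBuffer &Insn, uint64_t NeededBits) {
    while (Insn.Bits.size() < NeededBits) {
      std::size_t ByteIdx = Insn.Bits.size() / 8;
      uint8_t B = ByteIdx < Bytes.size() ? Bytes[ByteIdx] : 0;
      for (unsigned I = 0; I < 8; ++I)
        Insn.Bits.push_back((B >> I) & 1);
    }
  };
}
```

Calling the filler before each field extraction (as `OPC_ExtractField` and `OPC_CheckField` do in the patch) reads only as many bytes as the current field needs; a later call with the full instruction length fetches the rest.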

static void emitInsertBits(formatted_raw_ostream &OS) { static void emitInsertBits(formatted_raw_ostream &OS) {

OS << "// Helper function for inserting bits extracted from an encoded " OS << "// Helper function for inserting bits extracted from an encoded "

"instruction into\n" "instruction into\n"

<< "// a field.\n" << "// a field.\n"

<< "template <typename InsnType>\n" << "template <typename InsnType>\n"

<< "static std::enable_if_t<std::is_integral<InsnType>::value>\n" << "static std::enable_if_t<std::is_integral<InsnType>::value>\n"

<< "insertBits(InsnType &field, InsnType bits, unsigned startBit, " << "insertBits(InsnType &field, InsnType bits, unsigned startBit, "

"unsigned numBits) {\n" "unsigned numBits) {\n"

<< " assert(startBit + numBits <= sizeof field * 8);\n" << " assert(startBit + numBits <= sizeof field * 8);\n"

<< " field |= (InsnType)bits << startBit;\n" << " field |= (InsnType)bits << startBit;\n"

<< "}\n" << "}\n"

<< "\n" << "\n"

<< "template <typename InsnType>\n" << "template <typename InsnType>\n"

<< "static std::enable_if_t<!std::is_integral<InsnType>::value>\n" << "static std::enable_if_t<!std::is_integral<InsnType>::value>\n"

<< "insertBits(InsnType &field, uint64_t bits, unsigned startBit, " << "insertBits(InsnType &field, uint64_t bits, unsigned startBit, "

"unsigned numBits) {\n" "unsigned numBits) {\n"

<< " field.insertBits(bits, startBit, numBits);\n" << " field.insertBits(bits, startBit, numBits);\n"

<< "}\n\n"; << "}\n\n";

} }

// emitDecodeInstruction - Emit the templated helper function // emitDecodeInstruction - Emit the templated helper function

// decodeInstruction(). // decodeInstruction().

static void emitDecodeInstruction(formatted_raw_ostream &OS) { static void emitDecodeInstruction(formatted_raw_ostream &OS,

bool IsVarLenInst) {

OS << "template <typename InsnType>\n" OS << "template <typename InsnType>\n"

<< "static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], " << "static DecodeStatus decodeInstruction(const uint8_t DecodeTable[], "

"MCInst &MI,\n" "MCInst &MI,\n"

<< " InsnType insn, uint64_t " << " InsnType insn, uint64_t "

"Address,\n" "Address,\n"

<< " const MCDisassembler *DisAsm,\n" << " const MCDisassembler *DisAsm,\n"

<< " const MCSubtargetInfo &STI) {\n" << " const MCSubtargetInfo &STI";

if (IsVarLenInst) {

OS << ",\n"

<< " llvm::function_ref<void(APInt "

myhsuUnsubmitted

Done

ditto


"&,"

<< " uint64_t)> makeUp";

}

OS << ") {\n"

<< " const FeatureBitset &Bits = STI.getFeatureBits();\n" << " const FeatureBitset &Bits = STI.getFeatureBits();\n"

<< "\n" << "\n"

<< " const uint8_t *Ptr = DecodeTable;\n" << " const uint8_t *Ptr = DecodeTable;\n"

<< " InsnType CurFieldValue = 0;\n" << " uint64_t CurFieldValue = 0;\n"

<< " DecodeStatus S = MCDisassembler::Success;\n" << " DecodeStatus S = MCDisassembler::Success;\n"

<< " while (true) {\n" << " while (true) {\n"

<< " ptrdiff_t Loc = Ptr - DecodeTable;\n" << " ptrdiff_t Loc = Ptr - DecodeTable;\n"

<< " switch (*Ptr) {\n" << " switch (*Ptr) {\n"

<< " default:\n" << " default:\n"

<< " errs() << Loc << \": Unexpected decode table opcode!\\n\";\n" << " errs() << Loc << \": Unexpected decode table opcode!\\n\";\n"

<< " return MCDisassembler::Fail;\n" << " return MCDisassembler::Fail;\n"

<< " case MCD::OPC_ExtractField: {\n" << " case MCD::OPC_ExtractField: {\n"

<< " unsigned Start = *++Ptr;\n" << " unsigned Start = *++Ptr;\n"

<< " unsigned Len = *++Ptr;\n" << " unsigned Len = *++Ptr;\n"

<< " ++Ptr;\n" << " ++Ptr;\n";

<< " CurFieldValue = fieldFromInstruction(insn, Start, Len);\n" if (IsVarLenInst)

OS << " makeUp(insn, Start + Len);\n";

OS << " CurFieldValue = fieldFromInstruction(insn, Start, Len);\n"

<< " LLVM_DEBUG(dbgs() << Loc << \": OPC_ExtractField(\" << Start << " << " LLVM_DEBUG(dbgs() << Loc << \": OPC_ExtractField(\" << Start << "

"\", \"\n" "\", \"\n"

<< " << Len << \"): \" << CurFieldValue << \"\\n\");\n" << " << Len << \"): \" << CurFieldValue << \"\\n\");\n"

<< " break;\n" << " break;\n"

<< " }\n" << " }\n"

<< " case MCD::OPC_FilterValue: {\n" << " case MCD::OPC_FilterValue: {\n"

<< " // Decode the field value.\n" << " // Decode the field value.\n"

<< " unsigned Len;\n" << " unsigned Len;\n"

<< " InsnType Val = decodeULEB128(++Ptr, &Len);\n" << " uint64_t Val = decodeULEB128(++Ptr, &Len);\n"

myhsuUnsubmitted

Done

Why change to `uint64_t`? If you're worried about `InsnType` being `APInt`, `APInt` has `operator=(uint64_t)`.

0x59616eAuthorUnsubmitted

Done

But it yells "conversion from ‘uint64_t’ {aka ‘long unsigned int’} to non-scalar type ‘llvm::APInt’ requested"


myhsuUnsubmitted

Done

You're right, and I was wrong about `APInt::operator=(uint64_t)`: the operator isn't used for initialization (a constructor is used instead, but `APInt` doesn't have `APInt(uint64_t)`).

0x59616eAuthorUnsubmitted

Done

C++ is hard.


foadUnsubmitted

Not Done

Hi, this is causing a slight problem for us downstream because we're using a custom InsnType (not APInt). Have you seen the big comment emitted at line 2255? It explicitly says that your InsnType needs to "be constructible from a uint64_t". So please either update the comment, or stop using plain APInt as your InsnType :)

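The initialization-versus-assignment point in this thread can be reproduced without LLVM. The sketch below uses a hypothetical `WideInt` type that, like the `APInt` situation described above, offers `operator=(uint64_t)` but no constructor taking a single `uint64_t`; the names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical wide-integer type: assignable from uint64_t, but with no
// converting constructor from uint64_t.
struct WideInt {
  uint64_t Low = 0;
  WideInt() = default;
  WideInt &operator=(uint64_t V) {
    Low = V;
    return *this;
  }
};

// `WideInt W = someUint64;` is copy-initialization, which needs a converting
// constructor, so it would not compile for this type.
static_assert(!std::is_convertible<uint64_t, WideInt>::value,
              "no implicit conversion from uint64_t");

// Assigning to an already-constructed object does use operator=(uint64_t).
static_assert(std::is_assignable<WideInt &, uint64_t>::value,
              "assignment from uint64_t is available");

inline uint64_t assignThenRead(uint64_t V) {
  WideInt W; // default-construct first
  W = V;     // then assign: this is where operator= applies
  return W.Low;
}
```

This is why declaring the decoded values as plain `uint64_t` sidesteps the error for `APInt`, and also why a custom `InsnType` that was written to be constructible from `uint64_t` is affected by the change.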

<< " Ptr += Len;\n" << " Ptr += Len;\n"

<< " // NumToSkip is a plain 24-bit integer.\n" << " // NumToSkip is a plain 24-bit integer.\n"

<< " unsigned NumToSkip = *Ptr++;\n" << " unsigned NumToSkip = *Ptr++;\n"

<< " NumToSkip |= (*Ptr++) << 8;\n" << " NumToSkip |= (*Ptr++) << 8;\n"

<< " NumToSkip |= (*Ptr++) << 16;\n" << " NumToSkip |= (*Ptr++) << 16;\n"

<< "\n" << "\n"

<< " // Perform the filter operation.\n" << " // Perform the filter operation.\n"

<< " if (Val != CurFieldValue)\n" << " if (Val != CurFieldValue)\n"

<< " Ptr += NumToSkip;\n" << " Ptr += NumToSkip;\n"

<< " LLVM_DEBUG(dbgs() << Loc << \": OPC_FilterValue(\" << Val << " << " LLVM_DEBUG(dbgs() << Loc << \": OPC_FilterValue(\" << Val << "

"\", \" << NumToSkip\n" "\", \" << NumToSkip\n"

<< " << \"): \" << ((Val != CurFieldValue) ? \"FAIL:\" " << " << \"): \" << ((Val != CurFieldValue) ? \"FAIL:\" "

": \"PASS:\")\n" ": \"PASS:\")\n"

<< " << \" continuing at \" << (Ptr - DecodeTable) << " << " << \" continuing at \" << (Ptr - DecodeTable) << "

"\"\\n\");\n" "\"\\n\");\n"

<< "\n" << "\n"

<< " break;\n" << " break;\n"

<< " }\n" << " }\n"

<< " case MCD::OPC_CheckField: {\n" << " case MCD::OPC_CheckField: {\n"

<< " unsigned Start = *++Ptr;\n" << " unsigned Start = *++Ptr;\n"

<< " unsigned Len = *++Ptr;\n" << " unsigned Len = *++Ptr;\n";

<< " InsnType FieldValue = fieldFromInstruction(insn, Start, Len);\n" if (IsVarLenInst)

OS << " makeUp(insn, Start + Len);\n";

OS << " uint64_t FieldValue = fieldFromInstruction(insn, Start, Len);\n"

<< " // Decode the field value.\n" << " // Decode the field value.\n"

<< " InsnType ExpectedValue = decodeULEB128(++Ptr, &Len);\n" << " unsigned PtrLen = 0;\n"

<< " Ptr += Len;\n" << " uint64_t ExpectedValue = decodeULEB128(++Ptr, &PtrLen);\n"

myhsuUnsubmitted

Done

ditto


myhsuUnsubmitted

Done

Why did you use another `PtrLen` variable here?

0x59616eAuthorUnsubmitted

Done

Quick answer: for debugging purposes.

In the original code, `Len` is the length of the instruction field, but it is overwritten by `decodeULEB128`. That would break the debug message below.
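A small sketch of why a separate length variable helps: a ULEB128 decoder reports the number of bytes it consumed through its out-parameter, so passing `&Len` would clobber the field width stored there. `decodeULEB128Sketch` below is a simplified stand-in written for this illustration (no bounds or overflow checks), not LLVM's `decodeULEB128`.

```cpp
#include <cassert>
#include <cstdint>

// Simplified ULEB128 decoder: each byte contributes its low 7 bits, and a set
// high bit means another byte follows. The number of bytes consumed is written
// to *N when N is non-null.
inline uint64_t decodeULEB128Sketch(const uint8_t *P, unsigned *N) {
  const uint8_t *Start = P;
  uint64_t Value = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    Byte = *P++;
    Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  if (N)
    *N = unsigned(P - Start);
  return Value;
}
```

With a dedicated `PtrLen` receiving the byte count, the field width in `Len` is still intact when the debug message prints it.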

<< " Ptr += PtrLen;\n"

<< " // NumToSkip is a plain 24-bit integer.\n" << " // NumToSkip is a plain 24-bit integer.\n"

<< " unsigned NumToSkip = *Ptr++;\n" << " unsigned NumToSkip = *Ptr++;\n"

<< " NumToSkip |= (*Ptr++) << 8;\n" << " NumToSkip |= (*Ptr++) << 8;\n"

<< " NumToSkip |= (*Ptr++) << 16;\n" << " NumToSkip |= (*Ptr++) << 16;\n"

<< "\n" << "\n"

<< " // If the actual and expected values don't match, skip.\n" << " // If the actual and expected values don't match, skip.\n"

<< " if (ExpectedValue != FieldValue)\n" << " if (ExpectedValue != FieldValue)\n"

<< " Ptr += NumToSkip;\n" << " Ptr += NumToSkip;\n"

Show All 33 Lines OS << " uint64_t FieldValue = fieldFromInstruction(insn, Start, Len);\n"

<< " // Decode the Opcode value.\n" << " // Decode the Opcode value.\n"

<< " unsigned Opc = decodeULEB128(++Ptr, &Len);\n" << " unsigned Opc = decodeULEB128(++Ptr, &Len);\n"

<< " Ptr += Len;\n" << " Ptr += Len;\n"

<< " unsigned DecodeIdx = decodeULEB128(Ptr, &Len);\n" << " unsigned DecodeIdx = decodeULEB128(Ptr, &Len);\n"

<< " Ptr += Len;\n" << " Ptr += Len;\n"

<< "\n" << "\n"

<< " MI.clear();\n" << " MI.clear();\n"

<< " MI.setOpcode(Opc);\n" << " MI.setOpcode(Opc);\n"

<< " bool DecodeComplete;\n" << " bool DecodeComplete;\n";

<< " S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, " if (IsVarLenInst) {

OS << " Len = InstrLenTable[Opc];\n"

<< " makeUp(insn, Len);\n";

}

OS << " S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, "

"DecodeComplete);\n" "DecodeComplete);\n"

<< " assert(DecodeComplete);\n" << " assert(DecodeComplete);\n"

<< "\n" << "\n"

<< " LLVM_DEBUG(dbgs() << Loc << \": OPC_Decode: opcode \" << Opc\n" << " LLVM_DEBUG(dbgs() << Loc << \": OPC_Decode: opcode \" << Opc\n"

<< " << \", using decoder \" << DecodeIdx << \": \"\n" << " << \", using decoder \" << DecodeIdx << \": \"\n"

<< " << (S != MCDisassembler::Fail ? \"PASS\" : " << " << (S != MCDisassembler::Fail ? \"PASS\" : "

"\"FAIL\") << \"\\n\");\n" "\"FAIL\") << \"\\n\");\n"

<< " return S;\n" << " return S;\n"

Show All 37 Lines OS << " S = decodeToMCInst(S, DecodeIdx, insn, MI, Address, DisAsm, "

<< " // set before the decode attempt.\n" << " // set before the decode attempt.\n"

<< " S = MCDisassembler::Success;\n" << " S = MCDisassembler::Success;\n"

<< " }\n" << " }\n"

<< " break;\n" << " break;\n"

<< " }\n" << " }\n"

<< " case MCD::OPC_SoftFail: {\n" << " case MCD::OPC_SoftFail: {\n"

<< " // Decode the mask values.\n" << " // Decode the mask values.\n"

<< " unsigned Len;\n" << " unsigned Len;\n"

<< " InsnType PositiveMask = decodeULEB128(++Ptr, &Len);\n" << " uint64_t PositiveMask = decodeULEB128(++Ptr, &Len);\n"

myhsuUnsubmitted

Done

ditto


<< " Ptr += Len;\n" << " Ptr += Len;\n"

<< " InsnType NegativeMask = decodeULEB128(Ptr, &Len);\n" << " uint64_t NegativeMask = decodeULEB128(Ptr, &Len);\n"

<< " Ptr += Len;\n" << " Ptr += Len;\n"

<< " bool Fail = (insn & PositiveMask) || (~insn & NegativeMask);\n" << " bool Fail = (insn & PositiveMask) != 0 || (~insn & "

"NegativeMask) != 0;\n"

myhsuUnsubmitted

Done

The `!= 0` in these two lines is redundant.

0x59616eAuthorUnsubmitted

Done

g++ screams "no match for ‘operator||’ (operand types are ‘llvm::APInt’ and ‘llvm::APInt’)"


foadUnsubmitted

Not Done

Downstream, the compiler complained about mismatched operand types for `operator&` (`InsnType` and `uint64_t`). So please either change this back or at least update the comment on line 2255.
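The two compiler complaints in this thread come from the same expression being instantiated for different `InsnType`s. The sketch below mirrors the emitted soft-fail check with a hypothetical class type `Word` that has no conversion to `bool`; `Word` and `softFail` are names invented for this illustration, and the sketch assumes the type supplies `operator&` against `uint64_t`, which is exactly what a downstream type may lack.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class-type instruction word with no conversion to bool.
struct Word {
  uint64_t V;
  friend Word operator&(const Word &A, uint64_t M) { return Word{A.V & M}; }
  friend Word operator~(const Word &A) { return Word{~A.V}; }
  friend bool operator!=(const Word &A, uint64_t R) { return A.V != R; }
};

// Same shape as the emitted check. The explicit `!= 0` turns each operand of
// `||` into a bool, so the expression works for integers and for class types
// that only provide the operators above.
template <typename InsnType>
bool softFail(const InsnType &Insn, uint64_t PositiveMask,
              uint64_t NegativeMask) {
  return (Insn & PositiveMask) != 0 || (~Insn & NegativeMask) != 0;
}
```

For a built-in integer `InsnType` the `!= 0` is indeed redundant; it only matters once the operands of `||` are class objects.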

<< " if (Fail)\n" << " if (Fail)\n"

<< " S = MCDisassembler::SoftFail;\n" << " S = MCDisassembler::SoftFail;\n"

<< " LLVM_DEBUG(dbgs() << Loc << \": OPC_SoftFail: \" << (Fail ? " << " LLVM_DEBUG(dbgs() << Loc << \": OPC_SoftFail: \" << (Fail ? "

"\"FAIL\\n\" : \"PASS\\n\"));\n" "\"FAIL\\n\" : \"PASS\\n\"));\n"

<< " break;\n" << " break;\n"

<< " }\n" << " }\n"

<< " case MCD::OPC_Fail: {\n" << " case MCD::OPC_Fail: {\n"

<< " LLVM_DEBUG(dbgs() << Loc << \": OPC_Fail\\n\");\n" << " LLVM_DEBUG(dbgs() << Loc << \": OPC_Fail\\n\");\n"

Show All 22 Lines void FixedLenDecoderEmitter::run(raw_ostream &o) {

emitFieldFromInstruction(OS); emitFieldFromInstruction(OS);

emitInsertBits(OS); emitInsertBits(OS);

Target.reverseBitsForLittleEndianEncoding(); Target.reverseBitsForLittleEndianEncoding();

// Parameterize the decoders based on namespace and instruction width. // Parameterize the decoders based on namespace and instruction width.

std::set<StringRef> HwModeNames; std::set<StringRef> HwModeNames;

const auto &NumberedInstructions = Target.getInstructionsByEnumValue(); const auto &NumberedInstructions = Target.getInstructionsByEnumValue();

myhsuUnsubmitted

Done

Why remove the `&`?

NumberedEncodings.reserve(NumberedInstructions.size()); NumberedEncodings.reserve(NumberedInstructions.size());

DenseMap<Record *, unsigned> IndexOfInstruction; DenseMap<Record *, unsigned> IndexOfInstruction;

// First, collect all HwModes referenced by the target. // First, collect all HwModes referenced by the target.

for (const auto &NumberedInstruction : NumberedInstructions) { for (const auto &NumberedInstruction : NumberedInstructions) {

IndexOfInstruction[NumberedInstruction->TheDef] = NumberedEncodings.size(); IndexOfInstruction[NumberedInstruction->TheDef] = NumberedEncodings.size();

if (const RecordVal *RV = if (const RecordVal *RV =

NumberedInstruction->TheDef->getValue("EncodingInfos")) { NumberedInstruction->TheDef->getValue("EncodingInfos")) {

Show All 35 Lines void FixedLenDecoderEmitter::run(raw_ostream &o) {

for (const auto &NumberedAlias : RK.getAllDerivedDefinitions("AdditionalEncoding")) for (const auto &NumberedAlias : RK.getAllDerivedDefinitions("AdditionalEncoding"))

NumberedEncodings.emplace_back( NumberedEncodings.emplace_back(

NumberedAlias, NumberedAlias,

&Target.getInstruction(NumberedAlias->getValueAsDef("AliasOf"))); &Target.getInstruction(NumberedAlias->getValueAsDef("AliasOf")));

std::map<std::pair<std::string, unsigned>, std::vector<EncodingIDAndOpcode>> std::map<std::pair<std::string, unsigned>, std::vector<EncodingIDAndOpcode>>

OpcMap; OpcMap;

std::map<unsigned, std::vector<OperandInfo>> Operands; std::map<unsigned, std::vector<OperandInfo>> Operands;

std::vector<unsigned> InstrLen;

bool IsVarLenInst =

any_of(NumberedInstructions, [](const CodeGenInstruction *CGI) {

RecordVal *RV = CGI->TheDef->getValue("Inst");

return RV && isa<DagInit>(RV->getValue());

});

unsigned MaxInstLen = 0;

for (unsigned i = 0; i < NumberedEncodings.size(); ++i) { for (unsigned i = 0; i < NumberedEncodings.size(); ++i) {

const Record *EncodingDef = NumberedEncodings[i].EncodingDef; const Record *EncodingDef = NumberedEncodings[i].EncodingDef;

const CodeGenInstruction *Inst = NumberedEncodings[i].Inst; const CodeGenInstruction *Inst = NumberedEncodings[i].Inst;

const Record *Def = Inst->TheDef; const Record *Def = Inst->TheDef;

unsigned Size = EncodingDef->getValueAsInt("Size"); unsigned Size = EncodingDef->getValueAsInt("Size");

if (Def->getValueAsString("Namespace") == "TargetOpcode" || if (Def->getValueAsString("Namespace") == "TargetOpcode" ||

Def->getValueAsBit("isPseudo") || Def->getValueAsBit("isPseudo") ||

Def->getValueAsBit("isAsmParserOnly") || Def->getValueAsBit("isAsmParserOnly") ||

Def->getValueAsBit("isCodeGenOnly")) { Def->getValueAsBit("isCodeGenOnly")) {

NumEncodingsLackingDisasm++; NumEncodingsLackingDisasm++;

continue; continue;

} }

if (i < NumberedInstructions.size()) if (i < NumberedInstructions.size())

NumInstructions++; NumInstructions++;

NumEncodings++; NumEncodings++;

if (!Size) if (!Size && !IsVarLenInst)

continue; continue;

if (populateInstruction(Target, *EncodingDef, *Inst, i, Operands)) { if (IsVarLenInst)

InstrLen.resize(NumberedInstructions.size(), 0);

if (unsigned Len = populateInstruction(Target, *EncodingDef, *Inst, i,

Operands, IsVarLenInst)) {

if (IsVarLenInst) {

MaxInstLen = std::max(MaxInstLen, Len);

InstrLen[i] = Len;

}

std::string DecoderNamespace = std::string DecoderNamespace =

std::string(EncodingDef->getValueAsString("DecoderNamespace")); std::string(EncodingDef->getValueAsString("DecoderNamespace"));

if (!NumberedEncodings[i].HwModeName.empty()) if (!NumberedEncodings[i].HwModeName.empty())

DecoderNamespace += DecoderNamespace +=

std::string("_") + NumberedEncodings[i].HwModeName.str(); std::string("_") + NumberedEncodings[i].HwModeName.str();

OpcMap[std::make_pair(DecoderNamespace, Size)].emplace_back( OpcMap[std::make_pair(DecoderNamespace, Size)].emplace_back(

i, IndexOfInstruction.find(Def)->second); i, IndexOfInstruction.find(Def)->second);

} else { } else {

NumEncodingsOmitted++; NumEncodingsOmitted++;

} }

DecoderTableInfo TableInfo; DecoderTableInfo TableInfo;

for (const auto &Opc : OpcMap) { for (const auto &Opc : OpcMap) {

// Emit the decoder for this namespace+width combination. // Emit the decoder for this namespace+width combination.

ArrayRef<EncodingAndInst> NumberedEncodingsRef( ArrayRef<EncodingAndInst> NumberedEncodingsRef(

NumberedEncodings.data(), NumberedEncodings.size()); NumberedEncodings.data(), NumberedEncodings.size());

FilterChooser FC(NumberedEncodingsRef, Opc.second, Operands, FilterChooser FC(NumberedEncodingsRef, Opc.second, Operands,

8 * Opc.first.second, this); IsVarLenInst ? MaxInstLen : 8 * Opc.first.second, this);

// The decode table is cleared for each top level decoder function. The // The decode table is cleared for each top level decoder function. The

// predicates and decoders themselves, however, are shared across all // predicates and decoders themselves, however, are shared across all

// decoders to give more opportunities for uniqueing. // decoders to give more opportunities for uniqueing.

TableInfo.Table.clear(); TableInfo.Table.clear();

TableInfo.FixupStack.clear(); TableInfo.FixupStack.clear();

TableInfo.Table.reserve(16384); TableInfo.Table.reserve(16384);

TableInfo.FixupStack.emplace_back(); TableInfo.FixupStack.emplace_back();

FC.emitTableEntries(TableInfo); FC.emitTableEntries(TableInfo);

// Any NumToSkip fixups in the top level scope can resolve to the // Any NumToSkip fixups in the top level scope can resolve to the

// OPC_Fail at the end of the table. // OPC_Fail at the end of the table.

assert(TableInfo.FixupStack.size() == 1 && "fixup stack phasing error!"); assert(TableInfo.FixupStack.size() == 1 && "fixup stack phasing error!");

// Resolve any NumToSkip fixups in the current scope. // Resolve any NumToSkip fixups in the current scope.

resolveTableFixups(TableInfo.Table, TableInfo.FixupStack.back(), resolveTableFixups(TableInfo.Table, TableInfo.FixupStack.back(),

TableInfo.Table.size()); TableInfo.Table.size());

TableInfo.FixupStack.clear(); TableInfo.FixupStack.clear();

TableInfo.Table.push_back(MCD::OPC_Fail); TableInfo.Table.push_back(MCD::OPC_Fail);

// Print the table to the output stream. // Print the table to the output stream.

emitTable(OS, TableInfo.Table, 0, FC.getBitWidth(), Opc.first.first); emitTable(OS, TableInfo.Table, 0, FC.getBitWidth(), Opc.first.first);

OS.flush(); OS.flush();

} }

// For variable instruction, we emit a instruction length table

// to let the decoder know how long the instructions are.

myhsuUnsubmitted

Done

Could we add some comments here?

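As a rough illustration of what the length table is for: the emitted decoder looks up `InstrLenTable[Opc]` once the opcode is known and then calls `makeUp(insn, Len)` to fetch the remaining bits. The sketch below uses a hypothetical three-entry table (16, 48 and 72 bits, matching the 2-, 6- and 9-byte example in the summary); the table name, element type and helper functions are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Hypothetical length table indexed by opcode; entries are lengths in bits.
static const unsigned InstrLenTableSketch[] = {16, 48, 72};

// The filter logic treats every encoding as if it had the maximum length.
inline unsigned maxInstrLenBits() {
  return *std::max_element(std::begin(InstrLenTableSketch),
                           std::end(InstrLenTableSketch));
}

// Once the opcode is known, the real length tells the caller how many bytes
// the instruction actually occupies.
inline unsigned instrLenBytes(unsigned Opc) {
  return (InstrLenTableSketch[Opc] + 7) / 8;
}
```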

// You can see example usage in M68k's disassembler.

if (IsVarLenInst)

emitInstrLenTable(OS, InstrLen);

// Emit the predicate function. // Emit the predicate function.

emitPredicateFunction(OS, TableInfo.Predicates, 0); emitPredicateFunction(OS, TableInfo.Predicates, 0);

// Emit the decoder function. // Emit the decoder function.

emitDecoderFunction(OS, TableInfo.Decoders, 0); emitDecoderFunction(OS, TableInfo.Decoders, 0);

// Emit the main entry point for the decoder, decodeInstruction(). // Emit the main entry point for the decoder, decodeInstruction().

emitDecodeInstruction(OS); emitDecodeInstruction(OS, IsVarLenInst);

OS << "\n} // end namespace llvm\n"; OS << "\n} // end namespace llvm\n";

} }

namespace llvm { namespace llvm {

void EmitFixedLenDecoder(RecordKeeper &RK, raw_ostream &OS, void EmitFixedLenDecoder(RecordKeeper &RK, raw_ostream &OS,

const std::string &PredicateNamespace, const std::string &PredicateNamespace,

const std::string &GPrefix, const std::string &GPrefix,

const std::string &GPostfix, const std::string &ROK, const std::string &GPostfix, const std::string &ROK,

const std::string &RFail, const std::string &L) { const std::string &RFail, const std::string &L) {

FixedLenDecoderEmitter(RK, PredicateNamespace, GPrefix, GPostfix, FixedLenDecoderEmitter(RK, PredicateNamespace, GPrefix, GPostfix,

ROK, RFail, L).run(OS); ROK, RFail, L).run(OS);

} }

} // end namespace llvm } // end namespace llvm

llvm/utils/TableGen/VarLenCodeEmitterGen.h

	//===- VarLenCodeEmitterGen.h - CEG for variable-length insts ---- C++ --===//			//===- VarLenCodeEmitterGen.h - CEG for variable-length insts ---- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file declare the CodeEmitterGen component for variable-length			// This file declare the CodeEmitterGen component for variable-length
	// instructions. See the .cpp file for more details.			// instructions. See the .cpp file for more details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_UTILS_TABLEGEN_VARLENCODEEMITTERGEN_H			#ifndef LLVM_UTILS_TABLEGEN_VARLENCODEEMITTERGEN_H
	#define LLVM_UTILS_TABLEGEN_VARLENCODEEMITTERGEN_H			#define LLVM_UTILS_TABLEGEN_VARLENCODEEMITTERGEN_H

				#include "llvm/TableGen/Record.h"

	namespace llvm {			namespace llvm {

	class RecordKeeper;			struct EncodingSegment {
				myhsu: I think these forward declarations are obsolete now
	class raw_ostream;			unsigned BitWidth;
				const Init *Value;
				StringRef CustomEncoder = "";
				};

				class VarLenInst {
				const RecordVal *TheDef;
				size_t NumBits;

				// Set if any of the segment is not fixed value.
				bool HasDynamicSegment;

				SmallVector<EncodingSegment, 4> Segments;

				void buildRec(const DagInit *DI);

				StringRef getCustomEncoderName(const Init *EI) const {
				if (const auto *DI = dyn_cast<DagInit>(EI)) {
				if (DI->getNumArgs() && isa<StringInit>(DI->getArg(0)))
				return cast<StringInit>(DI->getArg(0))->getValue();
				}
				return "";
				}

				public:
				VarLenInst() : TheDef(nullptr), NumBits(0U), HasDynamicSegment(false) {}

				explicit VarLenInst(const DagInit *DI, const RecordVal *TheDef);

				/// Number of bits
				size_t size() const { return NumBits; }

				using const_iterator = decltype(Segments)::const_iterator;

				const_iterator begin() const { return Segments.begin(); }
				const_iterator end() const { return Segments.end(); }
				size_t getNumSegments() const { return Segments.size(); }

				bool isFixedValueOnly() const { return !HasDynamicSegment; }
				};

	void emitVarLenCodeEmitter(RecordKeeper &R, raw_ostream &OS);			void emitVarLenCodeEmitter(RecordKeeper &R, raw_ostream &OS);

	} // end namespace llvm			} // end namespace llvm
	#endif			#endif

llvm/utils/TableGen/VarLenCodeEmitterGen.cpp

	Show First 20 Lines • Show All 52 Lines • ▼ Show 20 Lines
	#include "CodeGenHwModes.h"			#include "CodeGenHwModes.h"
	#include "CodeGenInstruction.h"			#include "CodeGenInstruction.h"
	#include "CodeGenTarget.h"			#include "CodeGenTarget.h"
	#include "InfoByHwMode.h"			#include "InfoByHwMode.h"
	#include "llvm/ADT/ArrayRef.h"			#include "llvm/ADT/ArrayRef.h"
	#include "llvm/ADT/DenseMap.h"			#include "llvm/ADT/DenseMap.h"
	#include "llvm/Support/raw_ostream.h"			#include "llvm/Support/raw_ostream.h"
	#include "llvm/TableGen/Error.h"			#include "llvm/TableGen/Error.h"
	#include "llvm/TableGen/Record.h"

	using namespace llvm;			using namespace llvm;

	namespace {			namespace {

	class VarLenCodeEmitterGen {			class VarLenCodeEmitterGen {
	RecordKeeper &Records;			RecordKeeper &Records;

	struct EncodingSegment {
	unsigned BitWidth;
	const Init *Value;
	StringRef CustomEncoder = "";
	};

	class VarLenInst {
	RecordVal *TheDef;
	size_t NumBits;

	// Set if any of the segment is not fixed value.
	bool HasDynamicSegment;

	SmallVector<EncodingSegment, 4> Segments;

	void buildRec(const DagInit *DI);

	StringRef getCustomEncoderName(const Init *EI) const {
	if (const auto *DI = dyn_cast<DagInit>(EI)) {
	if (DI->getNumArgs() && isa<StringInit>(DI->getArg(0)))
	return cast<StringInit>(DI->getArg(0))->getValue();
	}
	return "";
	}

	public:
	VarLenInst() : TheDef(nullptr), NumBits(0U), HasDynamicSegment(false) {}

	explicit VarLenInst(const DagInit *DI, RecordVal *TheDef);

	/// Number of bits
	size_t size() const { return NumBits; }

	using const_iterator = decltype(Segments)::const_iterator;

	const_iterator begin() const { return Segments.begin(); }
	const_iterator end() const { return Segments.end(); }
	size_t getNumSegments() const { return Segments.size(); }

	bool isFixedValueOnly() const { return !HasDynamicSegment; }
	};

	DenseMap<Record *, VarLenInst> VarLenInsts;			DenseMap<Record *, VarLenInst> VarLenInsts;

	// Emit based values (i.e. fixed bits in the encoded instructions)			// Emit based values (i.e. fixed bits in the encoded instructions)
	void emitInstructionBaseValues(			void emitInstructionBaseValues(
	raw_ostream &OS,			raw_ostream &OS,
	ArrayRef<const CodeGenInstruction *> NumberedInstructions,			ArrayRef<const CodeGenInstruction *> NumberedInstructions,
	CodeGenTarget &Target, int HwMode = -1);			CodeGenTarget &Target, int HwMode = -1);

	std::string getInstructionCase(Record *R, CodeGenTarget &Target);			std::string getInstructionCase(Record *R, CodeGenTarget &Target);
	std::string getInstructionCaseForEncoding(Record *R, Record *EncodingDef,			std::string getInstructionCaseForEncoding(Record *R, Record *EncodingDef,
	CodeGenTarget &Target);			CodeGenTarget &Target);

	public:			public:
	explicit VarLenCodeEmitterGen(RecordKeeper &R) : Records(R) {}			explicit VarLenCodeEmitterGen(RecordKeeper &R) : Records(R) {}

	void run(raw_ostream &OS);			void run(raw_ostream &OS);
	};			};

	} // end anonymous namespace			} // end anonymous namespace

	VarLenCodeEmitterGen::VarLenInst::VarLenInst(const DagInit *DI,			VarLenInst::VarLenInst(const DagInit *DI, const RecordVal *TheDef)
	RecordVal *TheDef)
	: TheDef(TheDef), NumBits(0U) {			: TheDef(TheDef), NumBits(0U) {
	buildRec(DI);			buildRec(DI);
	for (const auto &S : Segments)			for (const auto &S : Segments)
	NumBits += S.BitWidth;			NumBits += S.BitWidth;
	}			}

	void VarLenCodeEmitterGen::VarLenInst::buildRec(const DagInit *DI) {			void VarLenInst::buildRec(const DagInit *DI) {
	assert(TheDef && "The def record is nullptr ?");			assert(TheDef && "The def record is nullptr ?");

	std::string Op = DI->getOperator()->getAsString();			std::string Op = DI->getOperator()->getAsString();

	if (Op == "ascend" || Op == "descend") {			if (Op == "ascend" || Op == "descend") {
	bool Reverse = Op == "descend";			bool Reverse = Op == "descend";
	int i = Reverse ? DI->getNumArgs() - 1 : 0;			int i = Reverse ? DI->getNumArgs() - 1 : 0;
	int e = Reverse ? -1 : DI->getNumArgs();			int e = Reverse ? -1 : DI->getNumArgs();
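
The visible part of `buildRec` sets up a start index `i` and an end sentinel `e` so that one loop can walk the DAG arguments forwards for `ascend` and backwards for `descend`; the constructor then sums the segment widths into `NumBits`. The following is a minimal standalone sketch of that pattern, not the code from this revision: `Segment`, `orderSegments`, and `totalBits` are hypothetical names, the step variable `s` is an assumption about the hidden remainder of the loop, and plain `std::vector` stands in for the TableGen `DagInit` and `SmallVector` types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for EncodingSegment: just a bit width and a label.
struct Segment {
  unsigned BitWidth;
  std::string Name;
};

// Return the arguments in visiting order: front to back for "ascend",
// back to front for "descend". The i/e setup mirrors the lines shown in
// buildRec; the step `s` is assumed, since the loop body is not visible.
std::vector<Segment> orderSegments(const std::string &Op,
                                   const std::vector<Segment> &Args) {
  std::vector<Segment> Out;
  bool Reverse = Op == "descend";
  int NumArgs = static_cast<int>(Args.size());
  int i = Reverse ? NumArgs - 1 : 0;
  int e = Reverse ? -1 : NumArgs;
  int s = Reverse ? -1 : 1;
  for (; i != e; i += s)
    Out.push_back(Args[i]);
  return Out;
}

// Total width in bits, computed the same way the VarLenInst constructor
// accumulates NumBits over its segments.
std::size_t totalBits(const std::vector<Segment> &Segs) {
  std::size_t NumBits = 0;
  for (const auto &S : Segs)
    NumBits += S.BitWidth;
  return NumBits;
}
```

Using signed `int` indices is what allows `-1` to serve as the end sentinel for the reverse walk; with an unsigned index the same comparison would rely on wraparound.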

Diff 426489