This is an archive of the discontinued LLVM Phabricator instance.

Differential D105575

[AArch64][SME] Add zero instruction
ClosedPublic

Authored by c-rhodes on Jul 7 2021, 11:38 AM.

Details

Reviewers

sdesmalen
david-arm
CarolineConcatto
kmclaughlin
dmgreen
ostannard

Commits

rG2e27c4e1f187: [AArch64][SME] Add zero instruction

Summary

This patch adds the zero instruction for zeroing a list of 64-bit
element ZA tiles. The instruction takes a list of up to eight tiles
ZA0.D-ZA7.D, which must be in order, e.g.

zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d}
zero {za1.d,za3.d,za5.d,za7.d}

The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which
are mapped to corresponding 64-bit element tiles in accordance with the
architecturally defined mapping between different element size tiles,
e.g.

Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing all eight 64-bit element tiles ZA0.D to ZA7.D.
Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D.
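
The mapping described above can be sketched as a small standalone function. This is an illustration only, not code from the patch: `tileToDMask` is a hypothetical name, and the stride rule is inferred from the examples in this summary (ZA0.S covers ZA0.D and ZA4.D, ZA0.B covers all eight tiles).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from the patch: returns the 8-bit ZERO mask
// (bit i set means ZAi.D is zeroed) for the tile ZA<Index>.<B|H|S|D>.
// EltBits is the element size in bits: 8, 16, 32 or 64.
inline uint8_t tileToDMask(unsigned EltBits, unsigned Index) {
  assert((EltBits == 8 || EltBits == 16 || EltBits == 32 || EltBits == 64) &&
         "unsupported element size");
  // There are EltBits / 8 tiles of a given element size:
  // 1 x .B, 2 x .H, 4 x .S and 8 x .D.
  unsigned NumTiles = EltBits / 8;
  assert(Index < NumTiles && "tile index out of range");
  // ZA<Index>.<T> overlaps every NumTiles-th 64-bit tile, starting at Index.
  unsigned Mask = 0;
  for (unsigned D = Index; D < 8; D += NumTiles)
    Mask |= 1u << D;
  return static_cast<uint8_t>(Mask);
}
```

For example, `tileToDMask(32, 0)` sets bits 0 and 4, which corresponds to ZA0.D and ZA4.D.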

The preferred disassembly of this instruction uses the shortest list of
tile names that represents the encoded immediate mask, e.g.

An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and ZA5.D is disassembled as {ZA0.S, ZA1.S}.
An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and ZA6.D is disassembled as {ZA0.H}.
An all-ones immediate is disassembled as {ZA}.
An all-zeros immediate is disassembled as an empty list {}.
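
One way to picture the "shortest list" rule is the sketch below. It is not how the patch does it (the patch defines the preferred forms as aliases); `preferredTileList` is a hypothetical name, and the sketch assumes that every tile in a printed list shares one element type, which is what the examples above show.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: choose a tile list for an 8-bit ZERO mask, assuming
// all tiles in a list share a single element type.
inline std::vector<std::string> preferredTileList(uint8_t Mask) {
  if (Mask == 0)
    return {};       // printed as the empty list {}
  if (Mask == 0xFF)
    return {"za"};   // the whole array
  struct Kind {
    char Suffix;
    unsigned NumTiles;
  };
  // Try the widest-covering tiles first, so that the first element type that
  // covers the mask exactly also gives the shortest single-type list.
  const Kind Kinds[] = {{'h', 2}, {'s', 4}, {'d', 8}};
  for (const Kind &K : Kinds) {
    std::vector<std::string> List;
    unsigned Covered = 0;
    for (unsigned Index = 0; Index < K.NumTiles; ++Index) {
      // 64-bit tiles overlapped by ZA<Index>.<Suffix>.
      unsigned TileMask = 0;
      for (unsigned D = Index; D < 8; D += K.NumTiles)
        TileMask |= 1u << D;
      if ((Mask & TileMask) == TileMask) {
        Covered |= TileMask;
        List.push_back("za" + std::to_string(Index) + "." + K.Suffix);
      }
    }
    if (Covered == Mask)
      return List;
  }
  return {}; // not reached: .d tiles can represent any mask
}
```

With this sketch, a mask with bits 0, 1, 4 and 5 set yields `za0.s, za1.s`, and a mask with bits 0, 2, 4 and 6 set yields `za0.h`.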

This patch adds the MatrixTileList asm operand and related parsing to support
this.

Depends on D105570.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Diff Detail

Event Timeline

c-rhodes created this revision. Jul 7 2021, 11:38 AM

Herald added subscribers: danielkiss, hiraditya, kristof.beyls. Jul 7 2021, 11:38 AM

c-rhodes requested review of this revision. Jul 7 2021, 11:38 AM

Herald added a project: Restricted Project. Jul 7 2021, 11:38 AM

c-rhodes added a parent revision: D105574: [AArch64][SME] Add mova instructions. Jul 7 2021, 11:38 AM

Harbormaster completed remote builds in B112830: Diff 357023. Jul 7 2021, 11:38 AM

c-rhodes added a child revision: D105576: [AArch64][SME] Add system registers and related instructions. Jul 7 2021, 11:40 AM

Matt added a subscriber: Matt. Jul 7 2021, 3:02 PM

tschuett added a subscriber: tschuett. Jul 7 2021, 11:59 PM

Changes:

Updated sed line in tests to fix Windows precommit.
Fixed bug in parsing of tile list that permitted matrix operand with row or column indicator, e.g. zero {za0h.b}.

Harbormaster completed remote builds in B113982: Diff 358605. Jul 14 2021, 7:53 AM

c-rhodes edited the summary of this revision. Jul 14 2021, 7:55 AM

c-rhodes added a parent revision: D105570: [AArch64][SME] Add matrix register definitions and parsing support.

c-rhodes removed a parent revision: D105574: [AArch64][SME] Add mova instructions.

c-rhodes removed a child revision: D105576: [AArch64][SME] Add system registers and related instructions. Jul 15 2021, 5:57 AM

c-rhodes edited the summary of this revision. Jul 19 2021, 3:27 AM

Hi @c-rhodes, I'm only part-way through the patch, but here are some minor comments I have so far!

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1357	This seems to be unused - can we delete the argument?
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
2066	What if the pair doesn't make sense? For example, {16, AArch64::ZAD3}. In this case OutRegs will be returned empty.
2285	Given that properties of the maximum are now hard-coded twice (once above with RegMask <= 0xFF and here with MaxBits = 8), is it worth declaring a constant somewhere that you can refer to? For example, MaxBits could be a constant in MatrixTileListOp, with RegMask = (1 << MaxBits) - 1.
2287	Do we need a separator here, like ' '?
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
708	Again, another hard-coded value here.
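
The suggestion about a shared constant could look roughly like the following. This is only a sketch of the idea raised in the comments: the struct is a simplified stand-in for the one in the patch, and `FullMask` and `isValidTileMask` are hypothetical names.

```cpp
// Hypothetical stand-in for the operand struct in the patch; only the
// constants are the point here.
struct MatrixTileListOp {
  // One bit per 64-bit element tile, ZA0.D to ZA7.D.
  static constexpr unsigned MaxBits = 8;
  // Largest legal mask, derived from MaxBits instead of a literal 0xFF.
  static constexpr unsigned FullMask = (1u << MaxBits) - 1;
  unsigned RegMask = 0;
};

static_assert(MatrixTileListOp::FullMask == 0xFF, "eight tiles, eight bits");

// Checks that would otherwise compare against a literal can use the constant.
inline bool isValidTileMask(unsigned RegMask) {
  return RegMask <= MatrixTileListOp::FullMask;
}
```

Deriving the mask from `MaxBits` keeps the assembler, printer and disassembler checks in step if the number of tiles is ever referenced in more places.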

david-arm added inline comments. Jul 19 2021, 6:01 AM

llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3798	Is 'RegNum' guaranteed to always be > 0 for a valid register?
3803	To be honest it's quite hard to review this because I don't know what parseVectorKind returns. Can we specify the type instead of 'auto'?
llvm/lib/Target/AArch64/SMEInstrFormats.td
688	Are we missing ones for {za1.s,za3.s} and {za0.s,za2.s}?

Hi @c-rhodes, we had done some design work for the ZERO instruction, and it is interesting to see your implementation. I have some questions about the code, based on my understanding of the ISA.

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1375	Does this restrict a matrix tile list to contain only tiles of the same element type? The ISA documentation doesn't impose such a restriction, so I think it would be legal to write something like `zero { za0.s, za5.d }`. Did you consider supporting matrix tile lists of mixed element types?
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3862	Is this specified in the ISA documentation? I thought matrix tile lists of mixed element types were allowed.
3868	Is this specified in the ISA documentation? It isn't obvious that the tiles in the list must be ordered.
llvm/lib/Target/AArch64/SMEInstrFormats.td
661	Wouldn't it be better for the instruction to accept an output `RegisterOperand` that covers all possible tile types, which would allow register allocation to assign the appropriate tile registers? Doing so would also allow us to distinguish between `zero { za0.d, za4.d }` and `zero { za0.s }` when we need to parse assembly code back into machine IR with correct register semantics.
688	Why are these implemented as `InstAlias`? Can they not be parsed by `AArch64AsmParser::tryParseMatrixTileList`?

In D105575#2886914, @david-arm wrote:

Hi @c-rhodes, I'm only part-way through the patch, but here are some minor comments I have so far!

Thanks for the comments Dave, I've not responded to them all yet but I'll update the patch shortly to address them.

In D105575#2887920, @bryanpkc wrote:

Hi @c-rhodes, we had done some design work for the ZERO instruction, and it is interesting to see your implementation. I have some questions about the code, based on my understanding of the ISA.

Hi @bryanpkc, thanks for the comments. Do you also have an implementation for this?

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1375	> Does this restrict a matrix tile list to contain only tiles of the same element type? The ISA documentation doesn't impose such a restriction, so I think it would be legal to write something like `zero { za0.s, za5.d }`. Did you consider supporting matrix tile lists of mixed element types?
	Yeah the element types must be the same. The ISA docs don't explicitly impose that restriction, but the actual instruction takes a list of 64-bit element types, the other types are really aliases.
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3868	> Is this specified in the ISA documentation? It isn't obvious that the tiles in the list must be ordered.
	No, but I'm not sure there's a good reason for the tiles to not be in order?
llvm/lib/Target/AArch64/SMEInstrFormats.td
661	> Wouldn't it be better for the instruction to accept an output `RegisterOperand` that covers all possible tile types, which would allow register allocation to assign the appropriate tile registers? Doing so would also allow us to distinguish between `zero { za0.d, za4.d }` and `zero { za0.s }` when we need to parse assembly code back into machine IR with correct register semantics.
	I've not given that a great deal of thought yet to be honest, our focus is on MC layer support at the moment, may revisit this in the future.
688	> Are we missing ones for {za1.s,za3.s} and {za0.s,za2.s}?
	These aliases are for the preferred disassembly; the constraint is that the instruction uses the shortest list of tile names that represents the encoded immediate mask. The parsed tile list gets converted to 64-bit tiles (not necessary if the input is .D tiles) and then encoded as an 8-bit mask, one bit for each tile (za0.d-za7.d). For `{za1.s,za3.s}`, the mapping is:
	za1.s -> {za1.d, za5.d}
	za3.s -> {za3.d, za7.d}
	      -> {za1.d, za3.d, za5.d, za7.d} (mask: 10101010)
	The shortest possible tile list for this mask is `{za1.h}`, and that's defined above. The same principle applies for `{za0.s,za2.s}`. For reference, the following table describes all possible aliases:
	    mask        in                                                 preferred
	 0  0b11111111  zero {za}                                          zero {za}
	 1  0b11111111  zero {za0.b}                                       zero {za}
	 2  0b01010101  zero {za0.h}                                       zero {za0.h}
	 3  0b10101010  zero {za1.h}                                       zero {za1.h}
	 4  0b11111111  zero {za0.h,za1.h}                                 zero {za}
	 5  0b00010001  zero {za0.s}                                       zero {za0.s}
	 6  0b00100010  zero {za1.s}                                       zero {za1.s}
	 7  0b01000100  zero {za2.s}                                       zero {za2.s}
	 8  0b10001000  zero {za3.s}                                       zero {za3.s}
	 9  0b00110011  zero {za0.s,za1.s}                                 zero {za0.s,za1.s}
	10  0b01010101  zero {za0.s,za2.s}                                 zero {za0.h}
	11  0b10011001  zero {za0.s,za3.s}                                 zero {za0.s,za3.s}
	12  0b01100110  zero {za1.s,za2.s}                                 zero {za1.s,za2.s}
	13  0b10101010  zero {za1.s,za3.s}                                 zero {za1.h}
	14  0b11001100  zero {za2.s,za3.s}                                 zero {za2.s,za3.s}
	15  0b01110111  zero {za0.s,za1.s,za2.s}                           zero {za0.s,za1.s,za2.s}
	16  0b10111011  zero {za0.s,za1.s,za3.s}                           zero {za0.s,za1.s,za3.s}
	17  0b11011101  zero {za0.s,za2.s,za3.s}                           zero {za0.s,za2.s,za3.s}
	18  0b11101110  zero {za1.s,za2.s,za3.s}                           zero {za1.s,za2.s,za3.s}
	19  0b11111111  zero {za0.s,za1.s,za2.s,za3.s}                     zero {za}
	20  0b11111111  zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6....  zero {za}
	21  0b01010101  zero {za0.d,za2.d,za4.d,za6.d}                     zero {za0.h}
	22  0b10101010  zero {za1.d,za3.d,za5.d,za7.d}                     zero {za1.h}
	23  0b00010001  zero {za0.d,za4.d}                                 zero {za0.s}
	24  0b00100010  zero {za1.d,za5.d}                                 zero {za1.s}
	25  0b01000100  zero {za2.d,za6.d}                                 zero {za2.s}
	26  0b10001000  zero {za3.d,za7.d}                                 zero {za3.s}
	27  0b00110011  zero {za0.d,za1.d,za4.d,za5.d}                     zero {za0.s,za1.s}
	28  0b10011001  zero {za0.d,za3.d,za4.d,za7.d}                     zero {za0.s,za3.s}
	29  0b01100110  zero {za1.d,za2.d,za5.d,za6.d}                     zero {za1.s,za2.s}
	30  0b11001100  zero {za2.d,za3.d,za6.d,za7.d}                     zero {za2.s,za3.s}
	31  0b01110111  zero {za0.d,za1.d,za2.d,za4.d,za5.d,za6.d}         zero {za0.s,za1.s,za2.s}
	32  0b10111011  zero {za0.d,za1.d,za3.d,za4.d,za5.d,za7.d}         zero {za0.s,za1.s,za3.s}
	33  0b11011101  zero {za0.d,za2.d,za3.d,za4.d,za6.d,za7.d}         zero {za0.s,za2.s,za3.s}
	34  0b11101110  zero {za1.d,za2.d,za3.d,za5.d,za6.d,za7.d}         zero {za1.s,za2.s,za3.s}
688	> Why are these implemented as `InstAlias`? Can they not be parsed by `AArch64AsmParser::tryParseMatrixTileList`?
	See my comment below regarding aliases. To expand on the parsing a bit, `tryParseMatrixTileList` will parse tile lists with 8/16/32 or 64-bit element types, the non 64-bit types are treated as aliases and get converted to .D tiles, then encoded as an 8-bit mask.

Hi @bryanpkc, thanks for the comments. Do you also have an implementation for this?

Yes, we implemented the SME instructions internally. Most of our code is very similar to the patches you have upstreamed, but ZERO was tricky and our approach differs quite a bit.

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1375	I understand that the bits in the immediate operand refer to .d tiles, but it seems useful to allow a programmer to keep referring to the register operands as .h and .d (for example), if that's how the registers are used after they are zeroed.
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3868	IMO the assembler should be more forgiving here, e.g. when the user programmatically generates the assembly code and forgets to sort the registers. We may also want to be consistent with the ARM backend, which produces a warning, not an error, in a similar situation:
	.text
	foo.s:1:14: warning: register list not in ascending order
	stm sp!, {r5,r3,r0}
	             ^
	foo.s:1:17: warning: register list not in ascending order
	stm sp!, {r5,r3,r0}
	                ^
	stm sp!, {r0, r3, r5}
llvm/lib/Target/AArch64/SMEInstrFormats.td
661	We also went with an immediate operand at first, but eventually replaced it with a register operand, mainly to allow register allocation to work.
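
The "warn, don't reject" behaviour asked for above can be sketched in isolation as follows. This is not the parser code from the patch: `TileListResult` and `buildTileMask` are hypothetical names, the input is assumed to be already converted to 64-bit tile indices, and the warning text simply mirrors the ARM backend message quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: the encoded mask plus any diagnostics produced.
struct TileListResult {
  uint8_t Mask = 0;
  std::vector<std::string> Warnings;
};

// Hypothetical sketch: DTiles holds ZAn.D indices (0-7) in source order.
// An out-of-order tile produces a warning but is still accepted, because the
// encoded mask does not depend on the order in which tiles were written.
inline TileListResult buildTileMask(const std::vector<unsigned> &DTiles) {
  TileListResult R;
  bool HavePrev = false;
  unsigned Prev = 0;
  for (unsigned Tile : DTiles) {
    assert(Tile < 8 && "only ZA0.D to ZA7.D exist");
    if (HavePrev && Tile < Prev)
      R.Warnings.push_back("tile list not in ascending order");
    R.Mask |= static_cast<uint8_t>(1u << Tile);
    Prev = Tile;
    HavePrev = true;
  }
  return R;
}
```

For the input `{5, 3, 0}` this yields two warnings and the same mask as `{0, 3, 5}`, which matches the shape of the ARM example: diagnostics are emitted, but the instruction is still assembled.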

c-rhodes added inline comments. Jul 20 2021, 8:49 AM

llvm/lib/Target/AArch64/SMEInstrFormats.td
661	> We also went with an immediate operand at first, but eventually replaced it with a register operand, mainly to allow register allocation to work.
	Your approach sounds more complete, I'd be interested in taking a look, are you able to upstream it?

sdesmalen added inline comments. Jul 20 2021, 10:59 PM

llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3868	I agree that if the spec doesn't require the operands to be in a specific order, that the instruction should accept the operands in any order.
llvm/lib/Target/AArch64/SMEInstrFormats.td
661	@bryanpkc you make a good point and it would be interesting to see those patches! For this patch I think that unless the changes you're suggesting are trivial, it would make sense to have any changes that are not required for the assembler as follow-up patches. I'm a bit cautious about this otherwise holding up SME asm support into LLVM 13, since those changes aren't necessarily required for the assembler.

Address comments.
Replaced the MatrixTileList<EltSize> operands with a single operand. Since legal tiles (8/16/32-bit) are mapped to 64-bit tiles and then to a register mask, and the shortest possible tile lists are defined via aliases, individual operands for each element size aren't necessary and add complication.

c-rhodes marked 2 inline comments as done. Jul 21 2021, 8:43 AM

c-rhodes added inline comments.

llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
2066	> What if the pair doesn't make sense? For example, {16, AArch64::ZAD3}. In this case OutRegs will be returned empty.
	I've changed the match register name function (`matchMatrixTileListRegName`) to only match tiles that are valid for tile lists, so we shouldn't end up here now. I've also added an assert that the pair isn't empty.
2287	> Do we need a separator here, like ' '?
	I don't think so, this just emits the bits, e.g. `zero {za0.s, za2.s} -> <matrixlist 01010101`, although I realise now the closing `>` is missing, I'll fix that.
3798	> Is 'RegNum' guaranteed to always be > 0 for a valid register?
	Yeah 0 is NoRegister, defined in `llvm/include/llvm/MC/MCRegister.h`: `static constexpr unsigned NoRegister = 0u;` I suppose the match register functions could default on NoRegister to be more explicit, but not sure how useful that is.
3868	> IMO the assembler should be more forgiving here, e.g. when the user programmatically generates the assembly code and forgets to sort the registers. We may also want to be consistent with the ARM backend, which produces a warning, not an error, in a similar situation:
	> .text
	> foo.s:1:14: warning: register list not in ascending order
	> stm sp!, {r5,r3,r0}
	>              ^
	> foo.s:1:17: warning: register list not in ascending order
	> stm sp!, {r5,r3,r0}
	>                 ^
	> stm sp!, {r0, r3, r5}
	Fair point, I've changed it to a warning.
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
708	> Again, another hard-coded value here.
	Not really sure where I could put a constant here.
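
For reference, the debug output discussed at line 2287 (the mask printed as eight bits, with the closing `>` that was reported missing) can be sketched as a standalone function. `formatMatrixList` is a hypothetical name and the function returns a string instead of writing to a stream; it is only meant to show the expected shape of the output.

```cpp
#include <string>

// Hypothetical sketch: formats a tile-list mask the way the operand dump in
// this review is described, most significant bit (ZA7.D) first.
inline std::string formatMatrixList(unsigned RegMask) {
  const unsigned MaxBits = 8; // one bit per 64-bit element tile
  std::string Out = "<matrixlist ";
  for (unsigned I = MaxBits; I > 0; --I)
    Out += ((RegMask >> (I - 1)) & 1u) ? '1' : '0';
  Out += '>'; // closing bracket, as requested in the review
  return Out;
}
```

With the mask for `zero {za0.s, za2.s}` this produces `<matrixlist 01010101>`.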

Harbormaster completed remote builds in B115329: Diff 360471. Jul 21 2021, 9:54 AM

@bryanpkc you make a good point and it would be interesting to see those patches!

For this patch I think that unless the changes you're suggesting are trivial, it would make sense to have any changes that are not required for the assembler as follow-up patches. I'm a bit cautious about this otherwise holding up SME asm support into LLVM 13, since those changes aren't necessarily required for the assembler.

@c-rhodes @sdesmalen I'm working on getting approvals to open our code. I don't know exactly how long it will take, but hopefully it won't be too long. I agree that we can consider that issue in a follow-up patch.

In D105575#2895430, @bryanpkc wrote:

@bryanpkc you make a good point and it would be interesting to see those patches!

For this patch I think that unless the changes you're suggesting are trivial, it would make sense to have any changes that are not required for the assembler as follow-up patches. I'm a bit cautious about this otherwise holding up SME asm support into LLVM 13, since those changes aren't necessarily required for the assembler.

@c-rhodes @sdesmalen I'm working on getting approvals to open our code. I don't know exactly how long it will take, but hopefully it won't be too long. I agree that we can consider that issue in a follow-up patch.

Great, thanks @bryanpkc

LGTM! Thanks for dealing with all the comments @c-rhodes!

llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
2287	nit: I think it's still missing a closing '>'

This revision is now accepted and ready to land. Jul 26 2021, 7:05 AM

This revision was landed with ongoing or failed builds. Jul 27 2021, 1:36 AM

Closed by commit rG2e27c4e1f187: [AArch64][SME] Add zero instruction (authored by c-rhodes).

This revision was automatically updated to reflect the committed changes.

c-rhodes added a commit: rG2e27c4e1f187: [AArch64][SME] Add zero instruction.

Revision Contents

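Path	Size
llvm/lib/Target/AArch64/AArch64RegisterInfo.td	20 lines
llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td	6 lines
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp	197 lines
llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp	14 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.h	4 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp	38 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCCodeEmitter.cpp	11 lines
llvm/lib/Target/AArch64/SMEInstrFormats.td	39 lines
llvm/test/MC/AArch64/SME/zero-diagnostics.s	76 lines
llvm/test/MC/AArch64/SME/zero.s	250 lines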

Diff 357023

llvm/lib/Target/AArch64/AArch64RegisterInfo.td

	(... 1,348 lines not shown ...)

	class MatrixOperand<RegisterClass RC, int EltSize> : RegisterOperand<RC> {			class MatrixOperand<RegisterClass RC, int EltSize> : RegisterOperand<RC> {
	let ParserMatchClass = MatrixAsmOperand<!cast<string>(RC), EltSize>;			let ParserMatchClass = MatrixAsmOperand<!cast<string>(RC), EltSize>;
	let PrintMethod = "printMatrix<" # EltSize # ">";			let PrintMethod = "printMatrix<" # EltSize # ">";
	}			}

	def MatrixOp : MatrixOperand<MPR, 0>;			def MatrixOp : MatrixOperand<MPR, 0>;

				class MatrixTileListAsmOperand<string RC, int EltSize> : AsmOperandClass {
				let Name = "MatrixTileList" # EltSize;
				let ParserMethod = "tryParseMatrixTileList";
				let RenderMethod = "addMatrixTileListOperands";
				let PredicateMethod = "isMatrixTileListOperand<" # EltSize # ">";
				}

				class MatrixTileListOperand<string RC, int EltSize, dag ops> : Operand<i32> {
				let ParserMatchClass = MatrixTileListAsmOperand<RC, EltSize>;
				let DecoderMethod = "DecodeMatrixTileListRegisterClass";
				let EncoderMethod = "EncodeMatrixTileListRegisterClass";
				let PrintMethod = "printMatrixTileList<" # EltSize # ">";
				let MIOperandInfo = ops;
				}

				def MatrixTileList8 : MatrixTileListOperand<"MPR8", 8, (ops MPR8)>;
				def MatrixTileList16 : MatrixTileListOperand<"MPR16", 16, (ops MPR16)>;
				def MatrixTileList32 : MatrixTileListOperand<"MPR32", 32, (ops MPR32)>;
				def MatrixTileList64 : MatrixTileListOperand<"MPR64", 64, (ops MPR64)>;

	def MatrixIndexGPR32_12_15 : RegisterClass<"AArch64", [i32], 32, (sequence "W%u", 12, 15)>;			def MatrixIndexGPR32_12_15 : RegisterClass<"AArch64", [i32], 32, (sequence "W%u", 12, 15)>;
	def MatrixIndexGPR32Op12_15 : RegisterOperand<MatrixIndexGPR32_12_15> {			def MatrixIndexGPR32Op12_15 : RegisterOperand<MatrixIndexGPR32_12_15> {
	let EncoderMethod = "EncodeMatrixIndexGPR32";			let EncoderMethod = "EncodeMatrixIndexGPR32";
	}			}

llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td

	(... 82 lines not shown ...)

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Move instructions			// Move instructions
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm INSERT_MXIPZ : sme_vector_to_tile<"mova">;			defm INSERT_MXIPZ : sme_vector_to_tile<"mova">;
	defm EXTRACT_ZPMXI : sme_tile_to_vector<"mova">;			defm EXTRACT_ZPMXI : sme_tile_to_vector<"mova">;

				//===----------------------------------------------------------------------===//
				// Zero instruction
				//===----------------------------------------------------------------------===//

				defm ZERO_M : sme_zero<"zero">;

	} // End let Predicates = [HasSME]			} // End let Predicates = [HasSME]

llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp

(... 12 lines not shown ...)
#include "MCTargetDesc/AArch64TargetStreamer.h"		#include "MCTargetDesc/AArch64TargetStreamer.h"
#include "TargetInfo/AArch64TargetInfo.h"		#include "TargetInfo/AArch64TargetInfo.h"
#include "AArch64InstrInfo.h"		#include "AArch64InstrInfo.h"
#include "Utils/AArch64BaseInfo.h"		#include "Utils/AArch64BaseInfo.h"
#include "llvm/ADT/APFloat.h"		#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
(... 232 lines not shown ...)	template <bool ParseShiftExtend,
RegConstraintEqualityTy EqTy = RegConstraintEqualityTy::EqualsReg>		RegConstraintEqualityTy EqTy = RegConstraintEqualityTy::EqualsReg>
OperandMatchResultTy tryParseGPROperand(OperandVector &Operands);		OperandMatchResultTy tryParseGPROperand(OperandVector &Operands);
template <bool ParseShiftExtend, bool ParseSuffix>		template <bool ParseShiftExtend, bool ParseSuffix>
OperandMatchResultTy tryParseSVEDataVector(OperandVector &Operands);		OperandMatchResultTy tryParseSVEDataVector(OperandVector &Operands);
OperandMatchResultTy tryParseSVEPredicateVector(OperandVector &Operands);		OperandMatchResultTy tryParseSVEPredicateVector(OperandVector &Operands);
template <RegKind VectorKind>		template <RegKind VectorKind>
OperandMatchResultTy tryParseVectorList(OperandVector &Operands,		OperandMatchResultTy tryParseVectorList(OperandVector &Operands,
bool ExpectMatch = false);		bool ExpectMatch = false);
		OperandMatchResultTy tryParseMatrixTileList(OperandVector &Operands);
OperandMatchResultTy tryParseSVEPattern(OperandVector &Operands);		OperandMatchResultTy tryParseSVEPattern(OperandVector &Operands);
OperandMatchResultTy tryParseGPR64x8(OperandVector &Operands);		OperandMatchResultTy tryParseGPR64x8(OperandVector &Operands);

public:		public:
enum AArch64MatchResultTy {		enum AArch64MatchResultTy {
Match_InvalidSuffix = FIRST_TARGET_MATCH_RESULT_TY,		Match_InvalidSuffix = FIRST_TARGET_MATCH_RESULT_TY,
#define GET_OPERAND_DIAGNOSTIC_TYPES		#define GET_OPERAND_DIAGNOSTIC_TYPES
#include "AArch64GenAsmMatcher.inc"		#include "AArch64GenAsmMatcher.inc"
(... 44 lines not shown ...)
class AArch64Operand : public MCParsedAsmOperand {		class AArch64Operand : public MCParsedAsmOperand {
private:		private:
enum KindTy {		enum KindTy {
k_Immediate,		k_Immediate,
k_ShiftedImm,		k_ShiftedImm,
k_CondCode,		k_CondCode,
k_Register,		k_Register,
k_MatrixRegister,		k_MatrixRegister,
		k_MatrixTileList,
k_VectorList,		k_VectorList,
k_VectorIndex,		k_VectorIndex,
k_Token,		k_Token,
k_SysReg,		k_SysReg,
k_SysCR,		k_SysCR,
k_Prefetch,		k_Prefetch,
k_ShiftExtend,		k_ShiftExtend,
k_FPImm,		k_FPImm,
(... 44 lines not shown ...)	private:
};		};

struct MatrixRegOp {		struct MatrixRegOp {
unsigned RegNum;		unsigned RegNum;
int ElementWidth;		int ElementWidth;
MatrixKind Kind;		MatrixKind Kind;
};		};

		struct MatrixTileListOp {
		unsigned RegMask = 0;
		unsigned ElementWidth;
		};

struct VectorListOp {		struct VectorListOp {
unsigned RegNum;		unsigned RegNum;
unsigned Count;		unsigned Count;
unsigned NumElements;		unsigned NumElements;
unsigned ElementWidth;		unsigned ElementWidth;
RegKind RegisterKind;		RegKind RegisterKind;
};		};

(... 55 lines not shown ...)	struct BTIHintOp {
unsigned Length;		unsigned Length;
unsigned Val;		unsigned Val;
};		};

union {		union {
struct TokOp Tok;		struct TokOp Tok;
struct RegOp Reg;		struct RegOp Reg;
struct MatrixRegOp MatrixReg;		struct MatrixRegOp MatrixReg;
		struct MatrixTileListOp MatrixTileList;
struct VectorListOp VectorList;		struct VectorListOp VectorList;
struct VectorIndexOp VectorIndex;		struct VectorIndexOp VectorIndex;
struct ImmOp Imm;		struct ImmOp Imm;
struct ShiftedImmOp ShiftedImm;		struct ShiftedImmOp ShiftedImm;
struct CondCodeOp CondCode;		struct CondCodeOp CondCode;
struct FPImmOp FPImm;		struct FPImmOp FPImm;
struct BarrierOp Barrier;		struct BarrierOp Barrier;
struct SysRegOp SysReg;		struct SysRegOp SysReg;
(... 35 lines not shown ...)	case k_Barrier:
Barrier = o.Barrier;		Barrier = o.Barrier;
break;		break;
case k_Register:		case k_Register:
Reg = o.Reg;		Reg = o.Reg;
break;		break;
case k_MatrixRegister:		case k_MatrixRegister:
MatrixReg = o.MatrixReg;		MatrixReg = o.MatrixReg;
break;		break;
		case k_MatrixTileList:
		MatrixTileList = o.MatrixTileList;
		break;
case k_VectorList:		case k_VectorList:
VectorList = o.VectorList;		VectorList = o.VectorList;
break;		break;
case k_VectorIndex:		case k_VectorIndex:
VectorIndex = o.VectorIndex;		VectorIndex = o.VectorIndex;
break;		break;
case k_SysReg:		case k_SysReg:
SysReg = o.SysReg;		SysReg = o.SysReg;
(... 91 lines not shown ...)	unsigned getMatrixElementWidth() const {
return MatrixReg.ElementWidth;		return MatrixReg.ElementWidth;
}		}

MatrixKind getMatrixKind() const {		MatrixKind getMatrixKind() const {
assert(Kind == k_MatrixRegister && "Invalid access!");		assert(Kind == k_MatrixRegister && "Invalid access!");
return MatrixReg.Kind;		return MatrixReg.Kind;
}		}

		unsigned getMatrixTileListRegMask() const {
		assert(isMatrixTileList() && "Invalid access!");
		return MatrixTileList.RegMask;
		}

		unsigned getMatrixTileListElementWidth() const {
		assert(isMatrixTileList() && "Invalid access!");
		return MatrixTileList.ElementWidth;
		}

RegConstraintEqualityTy getRegEqualityTy() const {		RegConstraintEqualityTy getRegEqualityTy() const {
assert(Kind == k_Register && "Invalid access!");		assert(Kind == k_Register && "Invalid access!");
return Reg.EqualityTy;		return Reg.EqualityTy;
}		}

unsigned getVectorListStart() const {		unsigned getVectorListStart() const {
assert(Kind == k_VectorList && "Invalid access!");		assert(Kind == k_VectorList && "Invalid access!");
return VectorList.RegNum;		return VectorList.RegNum;
(... 494 lines not shown ...)	bool isNeonVectorRegLo() const {
return Kind == k_Register && Reg.Kind == RegKind::NeonVector &&		return Kind == k_Register && Reg.Kind == RegKind::NeonVector &&
(AArch64MCRegisterClasses[AArch64::FPR128_loRegClassID].contains(		(AArch64MCRegisterClasses[AArch64::FPR128_loRegClassID].contains(
Reg.RegNum) ||		Reg.RegNum) ||
AArch64MCRegisterClasses[AArch64::FPR64_loRegClassID].contains(		AArch64MCRegisterClasses[AArch64::FPR64_loRegClassID].contains(
Reg.RegNum));		Reg.RegNum));
}		}

bool isMatrix() const { return Kind == k_MatrixRegister; }		bool isMatrix() const { return Kind == k_MatrixRegister; }
		bool isMatrixTileList() const { return Kind == k_MatrixTileList; }

template <unsigned Class> bool isSVEVectorReg() const {		template <unsigned Class> bool isSVEVectorReg() const {
RegKind RK;		RegKind RK;
switch (Class) {		switch (Class) {
case AArch64::ZPRRegClassID:		case AArch64::ZPRRegClassID:
case AArch64::ZPR_3bRegClassID:		case AArch64::ZPR_3bRegClassID:
case AArch64::ZPR_4bRegClassID:		case AArch64::ZPR_4bRegClassID:
RK = RegKind::SVEDataVector;		RK = RegKind::SVEDataVector;
(... 366 lines not shown ...)	if (const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(Imm.Val)) {
int64_t Min = - (1LL << (21 - 1));		int64_t Min = - (1LL << (21 - 1));
int64_t Max = ((1LL << (21 - 1)) - 1);		int64_t Max = ((1LL << (21 - 1)) - 1);
return Val >= Min && Val <= Max;		return Val >= Min && Val <= Max;
}		}

return true;		return true;
}		}

		template <int EltSize> DiagnosticPredicate isMatrixTileListOperand() const {
		if (!isMatrixTileList())
		return DiagnosticPredicateTy::NoMatch;
		if (EltSize != getMatrixTileListElementWidth())
		return DiagnosticPredicateTy::NoMatch;
		return DiagnosticPredicateTy::Match;
		}

template <MatrixKind Kind, int EltSize, unsigned RegClass>		template <MatrixKind Kind, int EltSize, unsigned RegClass>
DiagnosticPredicate isMatrixRegOperand() const {		DiagnosticPredicate isMatrixRegOperand() const {
if (isMatrix() && getMatrixKind() == Kind &&		if (isMatrix() && getMatrixKind() == Kind &&
AArch64MCRegisterClasses[RegClass].contains(getMatrixReg()) &&		AArch64MCRegisterClasses[RegClass].contains(getMatrixReg()) &&
EltSize == getMatrixElementWidth())		EltSize == getMatrixElementWidth())
return DiagnosticPredicateTy::Match;		return DiagnosticPredicateTy::Match;
return DiagnosticPredicateTy::NoMatch;		return DiagnosticPredicateTy::NoMatch;
}		}
(... 100 lines not shown ...)	void addVectorListOperands(MCInst &Inst, unsigned N) const {
assert((RegTy != VecListIdx_ZReg || NumRegs <= 4) &&		assert((RegTy != VecListIdx_ZReg || NumRegs <= 4) &&
" NumRegs must be <= 4 for ZRegs");		" NumRegs must be <= 4 for ZRegs");

unsigned FirstReg = FirstRegs[(unsigned)RegTy][NumRegs];		unsigned FirstReg = FirstRegs[(unsigned)RegTy][NumRegs];
Inst.addOperand(MCOperand::createReg(FirstReg + getVectorListStart() -		Inst.addOperand(MCOperand::createReg(FirstReg + getVectorListStart() -
FirstRegs[(unsigned)RegTy][0]));		FirstRegs[(unsigned)RegTy][0]));
}		}

		void addMatrixTileListOperands(MCInst &Inst, unsigned N) const {
		assert(N == 1 && "Invalid number of operands!");
		unsigned RegMask = getMatrixTileListRegMask();
		assert(RegMask <= 0xFF && "Invalid mask!");
		Inst.addOperand(MCOperand::createImm(RegMask));
		}

void addVectorIndexOperands(MCInst &Inst, unsigned N) const {		void addVectorIndexOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");		assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createImm(getVectorIndex()));		Inst.addOperand(MCOperand::createImm(getVectorIndex()));
}		}

template <unsigned ImmIs0, unsigned ImmIs1>		template <unsigned ImmIs0, unsigned ImmIs1>
void addExactFPImmOperands(MCInst &Inst, unsigned N) const {		void addExactFPImmOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");		assert(N == 1 && "Invalid number of operands!");
[... 347 lines not shown ...]
CreateVectorIndex(int Idx, SMLoc S, SMLoc E, MCContext &Ctx) {		CreateVectorIndex(int Idx, SMLoc S, SMLoc E, MCContext &Ctx) {
auto Op = std::make_unique<AArch64Operand>(k_VectorIndex, Ctx);		auto Op = std::make_unique<AArch64Operand>(k_VectorIndex, Ctx);
Op->VectorIndex.Val = Idx;		Op->VectorIndex.Val = Idx;
Op->StartLoc = S;		Op->StartLoc = S;
Op->EndLoc = E;		Op->EndLoc = E;
return Op;		return Op;
}		}

		static std::unique_ptr<AArch64Operand>
		CreateMatrixTileList(SmallSet<unsigned, 8> DRegs, unsigned ElementWidth,
		SMLoc S, SMLoc E, MCContext &Ctx) {
		auto Op = std::make_unique<AArch64Operand>(k_MatrixTileList, Ctx);
		const MCRegisterInfo *RI = Ctx.getRegisterInfo();
		unsigned RegMask = 0;
		for (auto Reg : DRegs)
		RegMask |= 0x1 << (RI->getEncodingValue(Reg) -
		RI->getEncodingValue(AArch64::ZAD0));
		Op->MatrixTileList.RegMask = RegMask;
		Op->MatrixTileList.ElementWidth = ElementWidth;
		Op->StartLoc = S;
		Op->EndLoc = E;
		return Op;
		}

		static void ComputeRegsForAlias(unsigned Reg, SmallSet<unsigned, 8> &OutRegs,
		const unsigned ElementWidth) {
		static std::map<std::pair<unsigned, unsigned>, std::vector<unsigned>>
		RegMap = {
		{{0, AArch64::ZAB0},
		{AArch64::ZAD0, AArch64::ZAD1, AArch64::ZAD2, AArch64::ZAD3,
		AArch64::ZAD4, AArch64::ZAD5, AArch64::ZAD6, AArch64::ZAD7}},
		{{8, AArch64::ZAB0},
		{AArch64::ZAD0, AArch64::ZAD1, AArch64::ZAD2, AArch64::ZAD3,
		AArch64::ZAD4, AArch64::ZAD5, AArch64::ZAD6, AArch64::ZAD7}},
		{{16, AArch64::ZAH0},
		{AArch64::ZAD0, AArch64::ZAD2, AArch64::ZAD4, AArch64::ZAD6}},
		{{16, AArch64::ZAH1},
		{AArch64::ZAD1, AArch64::ZAD3, AArch64::ZAD5, AArch64::ZAD7}},
		{{32, AArch64::ZAS0}, {AArch64::ZAD0, AArch64::ZAD4}},
		{{32, AArch64::ZAS1}, {AArch64::ZAD1, AArch64::ZAD5}},
		{{32, AArch64::ZAS2}, {AArch64::ZAD2, AArch64::ZAD6}},
		{{32, AArch64::ZAS3}, {AArch64::ZAD3, AArch64::ZAD7}},
		};

		if (ElementWidth == 64)
		OutRegs.insert(Reg);
		else
		for (auto OutReg : RegMap[std::make_pair(ElementWidth, Reg)])
		david-arm: What if the pair doesn't make sense? For example, {16, AArch64::ZAD3}. In this case OutRegs will be returned empty.
		c-rhodes (Author, Done):
		> What if the pair doesn't make sense? For example, {16, AArch64::ZAD3}. In this case OutRegs will be returned empty.
		I've changed the register-name matching function (`matchMatrixTileListRegName`) to only match tiles that are valid for tile lists, so we shouldn't end up here. I've also added an assert that the pair isn't empty.
		OutRegs.insert(OutReg);
		}
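
The lookup table in `ComputeRegsForAlias` follows a regular pattern: a tile with N-bit elements aliases every (N/8)-th 64-bit tile, starting at its own index. The following is a minimal standalone sketch of that arithmetic, not the patch's implementation. It assumes tiles are identified by plain integer indices rather than LLVM register enums, and the name `dTileMaskForAlias` is hypothetical. It does not model the `{za}` alias, which the patch handles with an element width of 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): returns the 8-bit mask of
// 64-bit tiles aliased by tile `Index` with `ElementWidth`-bit elements.
// Bit i of the result corresponds to ZAi.D.
uint8_t dTileMaskForAlias(unsigned ElementWidth, unsigned Index) {
  assert((ElementWidth == 8 || ElementWidth == 16 || ElementWidth == 32 ||
          ElementWidth == 64) &&
         "unsupported element width");
  // There are ElementWidth / 8 tiles of a given element size, and that
  // count is also the stride between the 64-bit tiles each one aliases.
  unsigned Stride = ElementWidth / 8;
  assert(Index < Stride && "tile index out of range for element width");
  uint8_t Mask = 0;
  for (unsigned D = Index; D < 8; D += Stride)
    Mask |= static_cast<uint8_t>(1u << D);
  return Mask;
}
```

Under these assumptions, `dTileMaskForAlias(32, 0)` sets the bits for ZA0.D and ZA4.D, matching the `{32, AArch64::ZAS0}` entry in the table above.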

static std::unique_ptr<AArch64Operand> CreateImm(const MCExpr *Val, SMLoc S,		static std::unique_ptr<AArch64Operand> CreateImm(const MCExpr *Val, SMLoc S,
SMLoc E, MCContext &Ctx) {		SMLoc E, MCContext &Ctx) {
auto Op = std::make_unique<AArch64Operand>(k_Immediate, Ctx);		auto Op = std::make_unique<AArch64Operand>(k_Immediate, Ctx);
Op->Imm.Val = Val;		Op->Imm.Val = Val;
Op->StartLoc = S;		Op->StartLoc = S;
Op->EndLoc = E;		Op->EndLoc = E;
return Op;		return Op;
}		}
[... 196 lines not shown ...]	case k_PSBHint:
OS << getPSBHintName();		OS << getPSBHintName();
break;		break;
case k_BTIHint:		case k_BTIHint:
OS << getBTIHintName();		OS << getBTIHintName();
break;		break;
case k_MatrixRegister:		case k_MatrixRegister:
OS << "<matrix " << getMatrixReg() << ">";		OS << "<matrix " << getMatrixReg() << ">";
break;		break;
		case k_MatrixTileList: {
		OS << "<matrixlist ";
		unsigned RegMask = getMatrixTileListRegMask();
		unsigned MaxBits = 8;
		david-arm: Given that you've now hard-coded properties about the maximum twice (once above with RegMask <= 0xFF and here with MaxBits = 8), is it worth having a constant declared somewhere that you can refer to? For example, MaxBits could be a constant in MatrixTileListOp and then RegMask = (1 << MaxBits) - 1;
		for (unsigned I = MaxBits; I > 0; --I)
		OS << ((RegMask & (1 << (I - 1))) >> (I - 1));
		david-arm: Do we need a separator here, like ' '?
		c-rhodes (Author, Done):
		> Do we need a separator here, like ' '?
		I don't think so, this just emits the bits, e.g. `zero {za0.s, za2.s} -> <matrixlist 01010101`, although I realise now the closing `>` is missing. I'll fix that.
		david-arm: nit: I think it's still missing a closing '>'
		break;
		}
case k_Register:		case k_Register:
OS << "<register " << getReg() << ">";		OS << "<register " << getReg() << ">";
if (!getShiftExtendAmount() && !hasShiftExtendAmount())		if (!getShiftExtendAmount() && !hasShiftExtendAmount())
break;		break;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case k_ShiftExtend:		case k_ShiftExtend:
OS << "<" << AArch64_AM::getShiftExtendName(getShiftExtendType()) << " #"		OS << "<" << AArch64_AM::getShiftExtendName(getShiftExtendType()) << " #"
<< getShiftExtendAmount();		<< getShiftExtendAmount();
[... 1,478 lines not shown ...]	if (getParser().parseExpression(ImmVal))
return true;		return true;

if (HasELFModifier)		if (HasELFModifier)
ImmVal = AArch64MCExpr::create(ImmVal, RefKind, getContext());		ImmVal = AArch64MCExpr::create(ImmVal, RefKind, getContext());

return false;		return false;
}		}

		OperandMatchResultTy
		AArch64AsmParser::tryParseMatrixTileList(OperandVector &Operands) {
		MCAsmParser &Parser = getParser();

		if (!Parser.getTok().is(AsmToken::LCurly))
		return MatchOperand_NoMatch;

		auto ParseMatrixTile = [&Parser](unsigned &Reg, unsigned &ElementWidth) {
		std::string NameStr = Parser.getTok().getString().lower();
		StringRef Name = NameStr;
		size_t DotPosition = Name.find('.');
		if (DotPosition == StringRef::npos)
		return MatchOperand_NoMatch;

		unsigned RegNum = matchMatrixRegName(Name);
		david-arm: Is 'RegNum' guaranteed to always be > 0 for a valid register?
		c-rhodes (Author):
		> Is 'RegNum' guaranteed to always be > 0 for a valid register?
		Yeah, 0 is NoRegister, defined in `llvm/include/llvm/MC/MCRegister.h`: `static constexpr unsigned NoRegister = 0u;` I suppose the register-matching functions could default to NoRegister to be more explicit, but I'm not sure how useful that is.
		if (!RegNum)
		return MatchOperand_NoMatch;

		StringRef Tail = Name.drop_front(DotPosition);
		const auto &KindRes = parseVectorKind(Tail, RegKind::Matrix);
		david-arm (Done): To be honest it's quite hard to review this because I don't know what parseVectorKind returns. Can we specify the type instead of 'auto'?
		ElementWidth = KindRes->second;
		Reg = RegNum;
		Parser.Lex(); // Eat the register.
		return MatchOperand_Success;
		};

		SMLoc S = getLoc();
		auto LCurly = Parser.getTok();
		Parser.Lex(); // Eat left bracket token.

		SmallSet<unsigned, 8> DRegs;
		// Empty matrix list
		if (Parser.getTok().is(AsmToken::RCurly)) {
		Operands.push_back(AArch64Operand::CreateMatrixTileList(
		DRegs, /*ElementWidth=*/64, S, getLoc(), getContext()));
		Parser.Lex(); // Eat right bracket token.
		return MatchOperand_Success;
		}

		// Try parse {za} alias early
		auto RegTok = Parser.getTok();
		auto Reg = RegTok.getString().lower();
		if (Reg == "za") {
		Parser.Lex(); // Eat register

		if (parseToken(AsmToken::RCurly, "'}' expected"))
		return MatchOperand_ParseFail;

		AArch64Operand::ComputeRegsForAlias(AArch64::ZAB0, DRegs,
		/*ElementWidth=*/0);
		Operands.push_back(AArch64Operand::CreateMatrixTileList(
		DRegs, /*ElementWidth=*/64, S, getLoc(), getContext()));
		return MatchOperand_Success;
		}

		unsigned FirstReg, ElementWidth;
		auto ParseRes = ParseMatrixTile(FirstReg, ElementWidth);
		if (ParseRes != MatchOperand_Success) {
		Parser.getLexer().UnLex(LCurly);
		return ParseRes;
		}

		const MCRegisterInfo *RI = getContext().getRegisterInfo();

		unsigned PrevReg = FirstReg;
		unsigned Count = 1;

		AArch64Operand::ComputeRegsForAlias(FirstReg, DRegs, ElementWidth);

		while (parseOptionalToken(AsmToken::Comma)) {
		SMLoc Loc = getLoc();
		unsigned Reg, NextElementWidth;
		ParseRes = ParseMatrixTile(Reg, NextElementWidth);
		if (ParseRes != MatchOperand_Success)
		return ParseRes;

		// Element size must match on all regs in the list.
		if (ElementWidth != NextElementWidth) {
		Error(Loc, "mismatched register size suffix");
		bryanpkc: Is this specified in the ISA documentation? I thought matrix tile lists of mixed element types were allowed.
		return MatchOperand_ParseFail;
		}

		// Registers must be sequential
		if (RI->getEncodingValue(Reg) <= (RI->getEncodingValue(PrevReg))) {
		Error(Loc, "registers must be sequential");
		bryanpkc: Is this specified in the ISA documentation? It isn't obvious that the tiles in the list must be ordered.
		c-rhodes (Author):
		> Is this specified in the ISA documentation? It isn't obvious that the tiles in the list must be ordered.
		No, but I'm not sure there's a good reason for the tiles to not be in order?
		bryanpkc: IMO the assembler should be more forgiving here, e.g. when the user programmatically generates the assembly code and forgets to sort the registers. We may also want to be consistent with the ARM backend, which produces a warning, not an error, in a similar situation:
		    .text
		    foo.s:1:14: warning: register list not in ascending order
		    stm sp!, {r5,r3,r0}
		                 ^
		    foo.s:1:17: warning: register list not in ascending order
		    stm sp!, {r5,r3,r0}
		                    ^
		    stm sp!, {r0, r3, r5}
		sdesmalen: I agree that if the spec doesn't require the operands to be in a specific order, the instruction should accept them in any order.
		c-rhodes (Author, Done):
		> IMO the assembler should be more forgiving here, e.g. when the user programmatically…
		Fair point, I've changed it to a warning.
		return MatchOperand_ParseFail;
		}

		AArch64Operand::ComputeRegsForAlias(Reg, DRegs, ElementWidth);

		PrevReg = Reg;
		++Count;
		}

		if (parseToken(AsmToken::RCurly, "'}' expected"))
		return MatchOperand_ParseFail;

		if (Count > 8) {
		Error(S, "invalid number of matrix registers");
		return MatchOperand_ParseFail;
		}

		Operands.push_back(AArch64Operand::CreateMatrixTileList(
		DRegs, ElementWidth, S, getLoc(), getContext()));

		return MatchOperand_Success;
		}
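
The two list-level checks in `tryParseMatrixTileList` at this revision (matching element-width suffixes, then strictly ascending tile order) can be sketched in isolation. This is a simplified illustration, not the parser code: tiles are modelled as hypothetical `(element width, tile index)` pairs, and `checkTileList` and `ListCheck` are made-up names. Per the review discussion above, the ordering violation was later downgraded from an error to a warning, so a caller might treat `NotAscending` as non-fatal.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type; the real parser reports diagnostics instead.
enum class ListCheck { Ok, MismatchedSuffix, NotAscending };

// A tile is modelled as (element width in bits, tile index). This is an
// assumption for illustration; the patch works with LLVM register numbers.
using Tile = std::pair<unsigned, unsigned>;

ListCheck checkTileList(const std::vector<Tile> &Tiles) {
  for (const Tile &T : Tiles)
    assert(T.second < 8 && "tile index out of range");
  for (std::size_t I = 1; I < Tiles.size(); ++I) {
    // Every tile must use the same element width as the first one.
    if (Tiles[I].first != Tiles[0].first)
      return ListCheck::MismatchedSuffix;
    // Each tile must have a strictly greater index than its predecessor,
    // which also rejects duplicates.
    if (Tiles[I].second <= Tiles[I - 1].second)
      return ListCheck::NotAscending;
  }
  return ListCheck::Ok;
}
```

With this model, the list `{za0.d, za0.d}` from the diagnostics test is reported as `NotAscending`, and `{za0.b, za5.d}` as `MismatchedSuffix`.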

template <RegKind VectorKind>		template <RegKind VectorKind>
OperandMatchResultTy		OperandMatchResultTy
AArch64AsmParser::tryParseVectorList(OperandVector &Operands,		AArch64AsmParser::tryParseVectorList(OperandVector &Operands,
bool ExpectMatch) {		bool ExpectMatch) {
MCAsmParser &Parser = getParser();		MCAsmParser &Parser = getParser();
if (!Parser.getTok().is(AsmToken::LCurly))		if (!Parser.getTok().is(AsmToken::LCurly))
return MatchOperand_NoMatch;		return MatchOperand_NoMatch;

[... 2,934 lines not shown ...]

llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp

[... 112 lines not shown ...]	static DecodeStatus DecodeZPR3RegisterClass(MCInst &Inst, unsigned RegNo,
uint64_t Address,		uint64_t Address,
const void *Decoder);		const void *Decoder);
static DecodeStatus DecodeZPR4RegisterClass(MCInst &Inst, unsigned RegNo,		static DecodeStatus DecodeZPR4RegisterClass(MCInst &Inst, unsigned RegNo,
uint64_t Address,		uint64_t Address,
const void *Decoder);		const void *Decoder);
template <unsigned NumBitsForTile>		template <unsigned NumBitsForTile>
static DecodeStatus DecodeMatrixTile(MCInst &Inst, unsigned RegNo,		static DecodeStatus DecodeMatrixTile(MCInst &Inst, unsigned RegNo,
uint64_t Address, const void *Decoder);		uint64_t Address, const void *Decoder);
		static DecodeStatus DecodeMatrixTileListRegisterClass(MCInst &Inst,
		unsigned RegMask,
		uint64_t Address,
		const void *Decoder);
static DecodeStatus DecodePPRRegisterClass(MCInst &Inst, unsigned RegNo,		static DecodeStatus DecodePPRRegisterClass(MCInst &Inst, unsigned RegNo,
uint64_t Address,		uint64_t Address,
const void *Decoder);		const void *Decoder);
static DecodeStatus DecodePPR_3bRegisterClass(MCInst &Inst, unsigned RegNo,		static DecodeStatus DecodePPR_3bRegisterClass(MCInst &Inst, unsigned RegNo,
uint64_t Address,		uint64_t Address,
const void *Decoder);		const void *Decoder);

static DecodeStatus DecodeFixedPointScaleImm32(MCInst &Inst, unsigned Imm,		static DecodeStatus DecodeFixedPointScaleImm32(MCInst &Inst, unsigned Imm,
[... 563 lines not shown ...]	static DecodeStatus DecodeZPR4RegisterClass(MCInst &Inst, unsigned RegNo,
const void* Decoder) {		const void* Decoder) {
if (RegNo > 31)		if (RegNo > 31)
return Fail;		return Fail;
unsigned Register = ZZZZDecoderTable[RegNo];		unsigned Register = ZZZZDecoderTable[RegNo];
Inst.addOperand(MCOperand::createReg(Register));		Inst.addOperand(MCOperand::createReg(Register));
return Success;		return Success;
}		}

		static DecodeStatus DecodeMatrixTileListRegisterClass(MCInst &Inst,
		unsigned RegMask,
		uint64_t Address,
		const void *Decoder) {
		if (RegMask > 0xFF)
		david-arm: Again, another hard-coded value here.
		c-rhodes (Author):
		> Again, another hard-coded value here.
		Not really sure where I could put a constant here.
		return Fail;
		Inst.addOperand(MCOperand::createImm(RegMask));
		return Success;
		}

static const SmallVector<SmallVector<unsigned, 15>, 4> MatrixZATileDecoderTable = {		static const SmallVector<SmallVector<unsigned, 15>, 4> MatrixZATileDecoderTable = {
{AArch64::ZAB0},		{AArch64::ZAB0},
{AArch64::ZAH0, AArch64::ZAH1},		{AArch64::ZAH0, AArch64::ZAH1},
{AArch64::ZAS0, AArch64::ZAS1, AArch64::ZAS2, AArch64::ZAS3},		{AArch64::ZAS0, AArch64::ZAS1, AArch64::ZAS2, AArch64::ZAS3},
{AArch64::ZAD0, AArch64::ZAD1, AArch64::ZAD2, AArch64::ZAD3, AArch64::ZAD4,		{AArch64::ZAD0, AArch64::ZAD1, AArch64::ZAD2, AArch64::ZAD3, AArch64::ZAD4,
AArch64::ZAD5, AArch64::ZAD6, AArch64::ZAD7},		AArch64::ZAD5, AArch64::ZAD6, AArch64::ZAD7},
{AArch64::ZAQ0, AArch64::ZAQ1, AArch64::ZAQ2, AArch64::ZAQ3, AArch64::ZAQ4,		{AArch64::ZAQ0, AArch64::ZAQ1, AArch64::ZAQ2, AArch64::ZAQ3, AArch64::ZAQ4,
AArch64::ZAQ5, AArch64::ZAQ6, AArch64::ZAQ7, AArch64::ZAQ8, AArch64::ZAQ9,		AArch64::ZAQ5, AArch64::ZAQ6, AArch64::ZAQ7, AArch64::ZAQ8, AArch64::ZAQ9,
[... 1,303 lines not shown ...]

llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.h

[... 140 lines not shown ...]	protected:

void printFPImmOperand(const MCInst *MI, unsigned OpNum,		void printFPImmOperand(const MCInst *MI, unsigned OpNum,
const MCSubtargetInfo &STI, raw_ostream &O);		const MCSubtargetInfo &STI, raw_ostream &O);

void printVectorList(const MCInst *MI, unsigned OpNum,		void printVectorList(const MCInst *MI, unsigned OpNum,
const MCSubtargetInfo &STI, raw_ostream &O,		const MCSubtargetInfo &STI, raw_ostream &O,
StringRef LayoutSuffix);		StringRef LayoutSuffix);

		template <int EltSize>
		void printMatrixTileList(const MCInst *MI, unsigned OpNum,
		const MCSubtargetInfo &STI, raw_ostream &O);

/// Print a list of vector registers where the type suffix is implicit		/// Print a list of vector registers where the type suffix is implicit
/// (i.e. attached to the instruction rather than the registers).		/// (i.e. attached to the instruction rather than the registers).
void printImplicitlyTypedVectorList(const MCInst *MI, unsigned OpNum,		void printImplicitlyTypedVectorList(const MCInst *MI, unsigned OpNum,
const MCSubtargetInfo &STI,		const MCSubtargetInfo &STI,
raw_ostream &O);		raw_ostream &O);

template <unsigned NumLanes, char LaneKind>		template <unsigned NumLanes, char LaneKind>
void printTypedVectorList(const MCInst *MI, unsigned OpNum,		void printTypedVectorList(const MCInst *MI, unsigned OpNum,
[... 86 lines not shown ...]

llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp

[... 1,312 lines not shown ...]	void AArch64InstPrinter::printGPRSeqPairsClassOperand(const MCInst *MI,
unsigned Sube = (size == 32) ? AArch64::sube32 : AArch64::sube64;		unsigned Sube = (size == 32) ? AArch64::sube32 : AArch64::sube64;
unsigned Subo = (size == 32) ? AArch64::subo32 : AArch64::subo64;		unsigned Subo = (size == 32) ? AArch64::subo32 : AArch64::subo64;

unsigned Even = MRI.getSubReg(Reg, Sube);		unsigned Even = MRI.getSubReg(Reg, Sube);
unsigned Odd = MRI.getSubReg(Reg, Subo);		unsigned Odd = MRI.getSubReg(Reg, Subo);
O << getRegisterName(Even) << ", " << getRegisterName(Odd);		O << getRegisterName(Even) << ", " << getRegisterName(Odd);
}		}

		static const unsigned MatrixZADRegisterTable[] = {
		AArch64::ZAD0, AArch64::ZAD1, AArch64::ZAD2, AArch64::ZAD3,
		AArch64::ZAD4, AArch64::ZAD5, AArch64::ZAD6, AArch64::ZAD7
		};

		template <int EltSize>
		void AArch64InstPrinter::printMatrixTileList(const MCInst *MI, unsigned OpNum,
		const MCSubtargetInfo &STI,
		raw_ostream &O) {
		unsigned MaxRegs = 8;

		StringRef TypeSuffix;
		if (EltSize == 64)
		TypeSuffix = ".d";
		else
		llvm_unreachable("Unsupported element size");

		unsigned RegMask = MI->getOperand(OpNum).getImm();

		unsigned NumRegs = 0;
		for (unsigned I = 0; I < MaxRegs; ++I)
		if ((RegMask & (1 << I)) != 0)
		++NumRegs;

		O << "{";
		unsigned Printed = 0;
		for (unsigned I = 0; I < MaxRegs; ++I) {
		unsigned Reg = RegMask & (1 << I);
		if (Reg == 0)
		continue;
		O << getRegisterName(MatrixZADRegisterTable[I]);
		if (Printed + 1 != NumRegs)
		O << ", ";
		++Printed;
		}
		O << "}";
		}
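
`printMatrixTileList` counts the set bits first so that it knows when to stop emitting separators. An equivalent way to get the same output shape is to track whether anything has been printed yet. The sketch below is a standalone illustration that builds a `std::string` instead of writing to a `raw_ostream`; the function name `formatDTileList` is hypothetical, and the `za<N>.d` spelling is assumed to match what `getRegisterName` returns for the ZAD registers.

```cpp
#include <cassert>
#include <string>

// Hypothetical standalone formatter (not the patch's printer): renders an
// 8-bit tile mask as a list of 64-bit tile names, bit i mapping to za<i>.d.
std::string formatDTileList(unsigned RegMask) {
  assert(RegMask <= 0xFF && "mask must fit in 8 bits");
  std::string Out = "{";
  bool First = true;
  for (unsigned I = 0; I < 8; ++I) {
    if ((RegMask & (1u << I)) == 0)
      continue;
    // Emit the separator before every tile except the first one printed.
    if (!First)
      Out += ", ";
    Out += "za" + std::to_string(I) + ".d";
    First = false;
  }
  Out += "}";
  return Out;
}
```

An all-zeros mask yields the empty list `{}`, consistent with the summary's description of the all-zeros immediate.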

void AArch64InstPrinter::printVectorList(const MCInst *MI, unsigned OpNum,		void AArch64InstPrinter::printVectorList(const MCInst *MI, unsigned OpNum,
const MCSubtargetInfo &STI,		const MCSubtargetInfo &STI,
raw_ostream &O,		raw_ostream &O,
StringRef LayoutSuffix) {		StringRef LayoutSuffix) {
unsigned Reg = MI->getOperand(OpNum).getReg();		unsigned Reg = MI->getOperand(OpNum).getReg();

O << "{ ";		O << "{ ";

[... 375 lines not shown ...]

llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCCodeEmitter.cpp

[... 180 lines not shown ...]	public:

template<int hasRs, int hasRt2> unsigned		template<int hasRs, int hasRt2> unsigned
fixLoadStoreExclusive(const MCInst &MI, unsigned EncodedValue,		fixLoadStoreExclusive(const MCInst &MI, unsigned EncodedValue,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

unsigned fixOneOperandFPComparison(const MCInst &MI, unsigned EncodedValue,		unsigned fixOneOperandFPComparison(const MCInst &MI, unsigned EncodedValue,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

		uint32_t EncodeMatrixTileListRegisterClass(const MCInst &MI, unsigned OpIdx,
		SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI) const;
uint32_t EncodeMatrixIndexGPR32(const MCInst &MI, unsigned OpIdx,		uint32_t EncodeMatrixIndexGPR32(const MCInst &MI, unsigned OpIdx,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

private:		private:
FeatureBitset computeAvailableFeatures(const FeatureBitset &FB) const;		FeatureBitset computeAvailableFeatures(const FeatureBitset &FB) const;
void		void
verifyInstructionPredicates(const MCInst &MI,		verifyInstructionPredicates(const MCInst &MI,
[... 318 lines not shown ...]
AArch64MCCodeEmitter::getVecShiftL8OpValue(const MCInst &MI, unsigned OpIdx,		AArch64MCCodeEmitter::getVecShiftL8OpValue(const MCInst &MI, unsigned OpIdx,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
const MCOperand &MO = MI.getOperand(OpIdx);		const MCOperand &MO = MI.getOperand(OpIdx);
assert(MO.isImm() && "Expected an immediate value for the scale amount!");		assert(MO.isImm() && "Expected an immediate value for the scale amount!");
return MO.getImm() - 8;		return MO.getImm() - 8;
}		}

		uint32_t AArch64MCCodeEmitter::EncodeMatrixTileListRegisterClass(
		const MCInst &MI, unsigned OpIdx, SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI) const {
		unsigned RegMask = MI.getOperand(OpIdx).getImm();
		assert(RegMask <= 0xFF && "Invalid register mask!");
		return RegMask;
		}

uint32_t AArch64MCCodeEmitter::EncodeMatrixIndexGPR32(		uint32_t AArch64MCCodeEmitter::EncodeMatrixIndexGPR32(
const MCInst &MI, unsigned OpIdx, SmallVectorImpl<MCFixup> &Fixups,		const MCInst &MI, unsigned OpIdx, SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
auto RegOpnd = MI.getOperand(OpIdx).getReg();		auto RegOpnd = MI.getOperand(OpIdx).getReg();
return RegOpnd - AArch64::W12;		return RegOpnd - AArch64::W12;
}		}

uint32_t		uint32_t
[... 139 lines not shown ...]

llvm/lib/Target/AArch64/SMEInstrFormats.td

[... 647 lines not shown ...]	def : InstAlias<"mov\t$Zd, $Pg/m, $ZAn[$Rv]",
MatrixIndexGPR32Op12_15:$Rv), 1>;		MatrixIndexGPR32Op12_15:$Rv), 1>;
}		}

multiclass sme_tile_to_vector<string mnemonic> {		multiclass sme_tile_to_vector<string mnemonic> {
defm _H : sme_tile_to_vector_v<mnemonic, /*is_col=*/0b0>;		defm _H : sme_tile_to_vector_v<mnemonic, /*is_col=*/0b0>;
defm _V : sme_tile_to_vector_v<mnemonic, /*is_col=*/0b1>;		defm _V : sme_tile_to_vector_v<mnemonic, /*is_col=*/0b1>;
}		}

		//===----------------------------------------------------------------------===//
		// SME Zero
		//===----------------------------------------------------------------------===//

		class sme_zero_inst<string mnemonic>
		: I<(outs MatrixTileList64:$imm), (ins),
		bryanpkc: Wouldn't it be better for the instruction to accept an output `RegisterOperand` that covers all possible tile types, which would allow register allocation to assign the appropriate tile registers? Doing so would also allow us to distinguish between `zero { za0.d, za4.d }` and `zero { za0.s }` when we need to parse assembly code back into machine IR with correct register semantics.
		c-rhodes (Author):
		> Wouldn't it be better for the instruction to accept an output `RegisterOperand` that covers…
		To be honest, I've not given that a great deal of thought yet. Our focus is on MC layer support at the moment; we may revisit this in the future.
		bryanpkc: We also went with an immediate operand at first, but eventually replaced it with a register operand, mainly to allow register allocation to work.
		c-rhodes (Author):
		> We also went with an immediate operand at first, but eventually replaced it with a register…
		Your approach sounds more complete and I'd be interested in taking a look. Are you able to upstream it?
		sdesmalen: @bryanpkc you make a good point and it would be interesting to see those patches! For this patch I think that unless the changes you're suggesting are trivial, it would make sense to have any changes that are not required for the assembler as follow-up patches. I'm a bit cautious about this otherwise holding up SME asm support in LLVM 13, since those changes aren't necessarily required for the assembler.
		mnemonic, "\t$imm", "", []>, Sched<[]> {
		bits<8> imm;
		let Inst{31-8} = 0b110000000000100000000000;
		let Inst{7-0} = imm;
		}
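
The `sme_zero_inst` class fixes bits 31-8 of the instruction word and places the 8-bit tile mask in bits 7-0. As a quick sanity check of that layout, here is a small standalone sketch that assembles the 32-bit word from a mask. The function name `encodeZero` is hypothetical; in the patch this is done by TableGen-generated code together with `EncodeMatrixTileListRegisterClass`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): builds the 32-bit encoding
// of the SME zero instruction from an 8-bit tile mask.
uint32_t encodeZero(unsigned RegMask) {
  assert(RegMask <= 0xFF && "Invalid register mask!");
  // Inst{31-8}, copied from the sme_zero_inst definition above.
  constexpr uint32_t FixedBits = 0b110000000000100000000000;
  // Inst{7-0} holds the tile mask.
  return (FixedBits << 8) | RegMask;
}
```

For example, an all-ones mask (the `{za}` form) gives the word `0xC00800FF` under this layout.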

		multiclass sme_zero<string mnemonic> {
		def NAME : sme_zero_inst<mnemonic>;

		def : InstAlias<"zero\t$imm",
		(!cast<Instruction>(NAME) MatrixTileList8:$imm), 0>;
		def : InstAlias<"zero\t$imm",
		(!cast<Instruction>(NAME) MatrixTileList16:$imm), 0>;
		def : InstAlias<"zero\t$imm",
		(!cast<Instruction>(NAME) MatrixTileList32:$imm), 0>;

		def : InstAlias<"zero\t\\{za\\}", (!cast<Instruction>(NAME) 0b11111111), 1>;
		def : InstAlias<"zero\t\\{za0.h\\}", (!cast<Instruction>(NAME) 0b01010101), 1>;
		def : InstAlias<"zero\t\\{za1.h\\}", (!cast<Instruction>(NAME) 0b10101010), 1>;
		def : InstAlias<"zero\t\\{za0.s\\}", (!cast<Instruction>(NAME) 0b00010001), 1>;
		def : InstAlias<"zero\t\\{za1.s\\}", (!cast<Instruction>(NAME) 0b00100010), 1>;
		def : InstAlias<"zero\t\\{za2.s\\}", (!cast<Instruction>(NAME) 0b01000100), 1>;
		def : InstAlias<"zero\t\\{za3.s\\}", (!cast<Instruction>(NAME) 0b10001000), 1>;
		def : InstAlias<"zero\t\\{za0.s,za1.s\\}", (!cast<Instruction>(NAME) 0b00110011), 1>;
		def : InstAlias<"zero\t\\{za0.s,za3.s\\}", (!cast<Instruction>(NAME) 0b10011001), 1>;
		def : InstAlias<"zero\t\\{za1.s,za2.s\\}", (!cast<Instruction>(NAME) 0b01100110), 1>;
		def : InstAlias<"zero\t\\{za2.s,za3.s\\}", (!cast<Instruction>(NAME) 0b11001100), 1>;
		david-arm: Are we missing ones for {za1.s,za3.s} and {z0.s,za2.s}?
		bryanpkc: Why are these implemented as `InstAlias`? Can they not be parsed by `AArch64AsmParser::tryParseMatrixTileList`?
		c-rhodes (Author):
		> Why are these implemented as `InstAlias`? Can they not be parsed by `AArch64AsmParser::tryParseMatrixTileList`?
		See my comment below regarding aliases. To expand on the parsing a bit, `tryParseMatrixTileList` will parse tile lists with 8/16/32 or 64-bit element types; the non 64-bit types are treated as aliases and get converted to .D tiles, then encoded as an 8-bit mask.
		c-rhodes (Author):
		> Are we missing ones for {za1.s,za3.s} and {z0.s,za2.s}?
		These aliases are for the preferred disassembly; the constraint is that the instruction uses the shortest list of tile names that represent the encoded immediate mask. The parsed tile list gets converted (not necessary if input is .D tiles) to 64-bit tiles and then encoded as an 8-bit mask, a bit for each tile (za0.d-za7.d). For `{za1.s,za3.s}`, the mapping is:
		    za1.s -> {za1.d, za5.d}
		    za3.s -> {za3.d, za7.d}
		          -> {za1.d, za3.d, za5.d, za7.d} (mask: 10101010)
		The shortest possible tile list for this mask is `{za1.h}` and that's defined above. The same principle applies for `{z0.s,za2.s}`. For reference, the following table describes all possible aliases:
		    mask        in                                          preferred
		 0  0b11111111  zero {za}                                   zero {za}
		 1  0b11111111  zero {za0.b}                                zero {za}
		 2  0b01010101  zero {za0.h}                                zero {za0.h}
		 3  0b10101010  zero {za1.h}                                zero {za1.h}
		 4  0b11111111  zero {za0.h,za1.h}                          zero {za}
		 5  0b00010001  zero {za0.s}                                zero {za0.s}
		 6  0b00100010  zero {za1.s}                                zero {za1.s}
		 7  0b01000100  zero {za2.s}                                zero {za2.s}
		 8  0b10001000  zero {za3.s}                                zero {za3.s}
		 9  0b00110011  zero {za0.s,za1.s}                          zero {za0.s,za1.s}
		10  0b01010101  zero {za0.s,za2.s}                          zero {za0.h}
		11  0b10011001  zero {za0.s,za3.s}                          zero {za0.s,za3.s}
		12  0b01100110  zero {za1.s,za2.s}                          zero {za1.s,za2.s}
		13  0b10101010  zero {za1.s,za3.s}                          zero {za1.h}
		14  0b11001100  zero {za2.s,za3.s}                          zero {za2.s,za3.s}
		15  0b01110111  zero {za0.s,za1.s,za2.s}                    zero {za0.s,za1.s,za2.s}
		16  0b10111011  zero {za0.s,za1.s,za3.s}                    zero {za0.s,za1.s,za3.s}
		17  0b11011101  zero {za0.s,za2.s,za3.s}                    zero {za0.s,za2.s,za3.s}
		18  0b11101110  zero {za1.s,za2.s,za3.s}                    zero {za1.s,za2.s,za3.s}
		19  0b11111111  zero {za0.s,za1.s,za2.s,za3.s}              zero {za}
		20  0b11111111  zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6....  zero {za}
		21  0b01010101  zero {za0.d,za2.d,za4.d,za6.d}              zero {za0.h}
		22  0b10101010  zero {za1.d,za3.d,za5.d,za7.d}              zero {za1.h}
		23  0b00010001  zero {za0.d,za4.d}                          zero {za0.s}
		24  0b00100010  zero {za1.d,za5.d}                          zero {za1.s}
		25  0b01000100  zero {za2.d,za6.d}                          zero {za2.s}
		26  0b10001000  zero {za3.d,za7.d}                          zero {za3.s}
		27  0b00110011  zero {za0.d,za1.d,za4.d,za5.d}              zero {za0.s,za1.s}
		28  0b10011001  zero {za0.d,za3.d,za4.d,za7.d}              zero {za0.s,za3.s}
		29  0b01100110  zero {za1.d,za2.d,za5.d,za6.d}              zero {za1.s,za2.s}
		30  0b11001100  zero {za2.d,za3.d,za6.d,za7.d}              zero {za2.s,za3.s}
		31  0b01110111  zero {za0.d,za1.d,za2.d,za4.d,za5.d,za6.d}  zero {za0.s,za1.s,za2.s}
		32  0b10111011  zero {za0.d,za1.d,za3.d,za4.d,za5.d,za7.d}  zero {za0.s,za1.s,za3.s}
		33  0b11011101  zero {za0.d,za2.d,za3.d,za4.d,za6.d,za7.d}  zero {za0.s,za2.s,za3.s}
		34  0b11101110  zero {za1.d,za2.d,za3.d,za5.d,za6.d,za7.d}  zero {za1.s,za2.s,za3.s}
		def : InstAlias<"zero\t\\{za0.s,za1.s,za2.s\\}", (!cast<Instruction>(NAME) 0b01110111), 1>;
		def : InstAlias<"zero\t\\{za0.s,za1.s,za3.s\\}", (!cast<Instruction>(NAME) 0b10111011), 1>;
		def : InstAlias<"zero\t\\{za0.s,za2.s,za3.s\\}", (!cast<Instruction>(NAME) 0b11011101), 1>;
		def : InstAlias<"zero\t\\{za1.s,za2.s,za3.s\\}", (!cast<Instruction>(NAME) 0b11101110), 1>;
		}
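
The aliases above pair each preferred tile list with an 8-bit immediate in which bit i stands for zaI.d. The arithmetic behind those immediates can be sketched in Python as follows. This is an illustration only: the function names `tile_mask`, `list_mask` and `preferred_tiles` are hypothetical and do not appear in the patch, which expresses the preferred forms through the InstAlias definitions instead. The sketch assumes that a list uses a single element size throughout, so a mask that is not a union of .h or .s tiles falls back to .d tiles.

```python
import re

# Number of tiles that exist for each element-size suffix:
# za0.b; za0.h-za1.h; za0.s-za3.s; za0.d-za7.d.
TILES_PER_SUFFIX = {"b": 1, "h": 2, "s": 4, "d": 8}


def tile_mask(tile):
    """Return the 8-bit mask (bit i = zaI.d) covered by one tile name."""
    if tile == "za":
        return 0xFF
    match = re.fullmatch(r"za(\d+)\.([bhsd])", tile)
    if match is None:
        raise ValueError(f"not a tile name handled by this sketch: {tile!r}")
    index, count = int(match.group(1)), TILES_PER_SUFFIX[match.group(2)]
    if index >= count:
        raise ValueError(f"tile index out of range: {tile!r}")
    # zaN.<T> overlaps the 64-bit tiles N, N + count, N + 2 * count, ... below 8.
    return sum(1 << d for d in range(index, 8, count))


def list_mask(tiles):
    """OR together the masks of every tile in the list."""
    mask = 0
    for tile in tiles:
        mask |= tile_mask(tile)
    return mask


def preferred_tiles(mask):
    """Return the shortest single-suffix tile list that covers exactly `mask`."""
    if mask == 0xFF:
        return ["za"]
    # Try the coarsest element size first: one .h tile replaces two .s tiles,
    # and one .s tile replaces two .d tiles, so coarser means shorter.
    for suffix in ("h", "s", "d"):
        names = [f"za{i}.{suffix}" for i in range(TILES_PER_SUFFIX[suffix])]
        picked = [n for n in names if tile_mask(n) & ~mask == 0]
        if list_mask(picked) == mask:
            return picked
    return []  # unreachable: the .d pass always matches


print(bin(list_mask(["za1.s", "za3.s"])))  # mask for the list asked about in review
print(preferred_tiles(list_mask(["za1.s", "za3.s"])))
```

For `{za1.s,za3.s}` the sketch yields the mask 0b10101010 and the preferred list `za1.h`, which agrees with row 13 of the table in the review comment.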

llvm/test/MC/AArch64/SME/zero-diagnostics.s

This file was added.

				// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme 2>&1 < %s | FileCheck %s

				// --------------------------------------------------------------------------//
				// Registers must be sequential

				zero {za0.d, za0.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: registers must be sequential
				// CHECK-NEXT: zero {za0.d, za0.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za1.d, za0.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: registers must be sequential
				// CHECK-NEXT: zero {za1.d, za0.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za0.d, za1.d, za2.d, za1.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: registers must be sequential
				// CHECK-NEXT: zero {za0.d, za1.d, za2.d, za1.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za0.d, za1.d, za2.d, za2.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: registers must be sequential
				// CHECK-NEXT: zero {za0.d, za1.d, za2.d, za2.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za7.d, za6.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: registers must be sequential
				// CHECK-NEXT: zero {za7.d, za6.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				// --------------------------------------------------------------------------//
				// Mismatched register size suffix

				zero {za0.b, za5.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
				// CHECK-NEXT: zero {za0.b, za5.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				// --------------------------------------------------------------------------//
				// Invalid element width

				zero {za, za}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: '}' expected
				// CHECK-NEXT: zero {za, za}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				// --------------------------------------------------------------------------//
				// Invalid matrix tile

				zero {za0.b, za1.b}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
				// CHECK-NEXT: zero {za0.b, za1.b}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za2.h}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
				// CHECK-NEXT: zero {za2.h}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za0.s, za1.s, za2.s, za3.s, za4.s}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
				// CHECK-NEXT: zero {za0.s, za1.s, za2.s, za3.s, za4.s}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				// --------------------------------------------------------------------------//
				// Invalid matrix list operand

				zero {za0.d, za1.d, za2.d, za3.d, za4.d, za5.d, za6.d, za7.d, za8.d}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
				// CHECK-NEXT: zero {za0.d, za1.d, za2.d, za3.d, za4.d, za5.d, za6.d, za7.d, za8.d}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:

				zero {za0.q, za1.q, za2.q, za3.q, za4.q, za5.q, za6.q, za7.q, za8.q, za9.q, za10.q, za11.q, za12.q, za13.q, za14.q, za15.q}
				// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of matrix registers
				// CHECK-NEXT: zero {za0.q, za1.q, za2.q, za3.q, za4.q, za5.q, za6.q, za7.q, za8.q, za9.q, za10.q, za11.q, za12.q, za13.q, za14.q, za15.q}
				// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
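
Three of the diagnostics exercised above can be modelled with a few lines of Python. This is a rough sketch under stated assumptions, not the logic in AArch64AsmParser.cpp: the function name `check_tile_list` is hypothetical, the order in which the checks run is assumed, and the `{za, za}` and `.q` cases are deliberately left out. It does show that "sequential" here means strictly ascending tile numbers rather than consecutive ones, since `{za1.d,za3.d,za5.d,za7.d}` is accepted.

```python
import re

# Number of tiles that exist for each element-size suffix.
TILES_PER_SUFFIX = {"b": 1, "h": 2, "s": 4, "d": 8}


def check_tile_list(tiles):
    """Return the modelled error message, or None if the list is accepted."""
    parsed = []
    for tile in tiles:
        match = re.fullmatch(r"za(\d+)\.([bhsd])", tile)
        if match is None:
            raise ValueError(f"outside the scope of this sketch: {tile!r}")
        parsed.append((int(match.group(1)), match.group(2)))

    # Assumed order of checks: suffix agreement, ordering, then tile range.
    if any(suffix != parsed[0][1] for _, suffix in parsed):
        return "mismatched register size suffix"
    if any(b <= a for (a, _), (b, _) in zip(parsed, parsed[1:])):
        return "registers must be sequential"
    if any(index >= TILES_PER_SUFFIX[suffix] for index, suffix in parsed):
        return "invalid operand for instruction"
    return None


for case in (["za0.d", "za0.d"], ["za0.b", "za5.d"], ["za2.h"],
             ["za1.d", "za3.d", "za5.d", "za7.d"]):
    print(case, "->", check_tile_list(case))
```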

llvm/test/MC/AArch64/SME/zero.s

This file was added.

				// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme < %s \
				// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
				// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
				// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
				// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme < %s \
				// RUN: | llvm-objdump -d --mattr=+sme - | FileCheck %s --check-prefix=CHECK-INST
				// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme < %s \
				// RUN: | llvm-objdump -d - | FileCheck %s --check-prefix=CHECK-UNKNOWN
				// Disassemble encoding and check the re-encoding (-show-encoding) matches.
				// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme < %s \
				// RUN: | sed '/\.text$/d' | sed 's/.*encoding:\s//g' \
				// RUN: | llvm-mc -triple=aarch64 -mattr=+sme -disassemble -show-encoding \
				// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST

				zero {}
				// CHECK-INST: zero {}
				// CHECK-ENCODING: [0x00,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 00 00 08 c0 <unknown>

				zero {za0.d, za2.d, za4.d, za6.d}
				// CHECK-INST: zero {za0.h}
				// CHECK-ENCODING: [0x55,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 55 00 08 c0 <unknown>

				zero {za0.d, za1.d, za2.d, za4.d, za5.d, za7.d}
				// CHECK-INST: zero {za0.d, za1.d, za2.d, za4.d, za5.d, za7.d}
				// CHECK-ENCODING: [0xb7,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: b7 00 08 c0 <unknown>

				zero {za0.d, za1.d, za2.d, za3.d, za4.d, za5.d, za6.d, za7.d}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				// --------------------------------------------------------------------------//
				// Aliases

				zero {za}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				zero {za0.b}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				zero {za0.h}
				// CHECK-INST: zero {za0.h}
				// CHECK-ENCODING: [0x55,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 55 00 08 c0 <unknown>

				zero {za1.h}
				// CHECK-INST: zero {za1.h}
				// CHECK-ENCODING: [0xaa,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: aa 00 08 c0 <unknown>

				zero {za0.h,za1.h}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				zero {za0.s}
				// CHECK-INST: zero {za0.s}
				// CHECK-ENCODING: [0x11,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 11 00 08 c0 <unknown>

				zero {za1.s}
				// CHECK-INST: zero {za1.s}
				// CHECK-ENCODING: [0x22,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 22 00 08 c0 <unknown>

				zero {za2.s}
				// CHECK-INST: zero {za2.s}
				// CHECK-ENCODING: [0x44,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 44 00 08 c0 <unknown>

				zero {za3.s}
				// CHECK-INST: zero {za3.s}
				// CHECK-ENCODING: [0x88,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 88 00 08 c0 <unknown>

				zero {za0.s,za1.s}
				// CHECK-INST: zero {za0.s,za1.s}
				// CHECK-ENCODING: [0x33,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 33 00 08 c0 <unknown>

				zero {za0.s,za2.s}
				// CHECK-INST: zero {za0.h}
				// CHECK-ENCODING: [0x55,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 55 00 08 c0 <unknown>

				zero {za0.s,za3.s}
				// CHECK-INST: zero {za0.s,za3.s}
				// CHECK-ENCODING: [0x99,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 99 00 08 c0 <unknown>

				zero {za1.s,za2.s}
				// CHECK-INST: zero {za1.s,za2.s}
				// CHECK-ENCODING: [0x66,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 66 00 08 c0 <unknown>

				zero {za1.s,za3.s}
				// CHECK-INST: zero {za1.h}
				// CHECK-ENCODING: [0xaa,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: aa 00 08 c0 <unknown>

				zero {za2.s,za3.s}
				// CHECK-INST: zero {za2.s,za3.s}
				// CHECK-ENCODING: [0xcc,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: cc 00 08 c0 <unknown>

				zero {za0.s,za1.s,za2.s}
				// CHECK-INST: zero {za0.s,za1.s,za2.s}
				// CHECK-ENCODING: [0x77,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 77 00 08 c0 <unknown>

				zero {za0.s,za1.s,za3.s}
				// CHECK-INST: zero {za0.s,za1.s,za3.s}
				// CHECK-ENCODING: [0xbb,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: bb 00 08 c0 <unknown>

				zero {za0.s,za2.s,za3.s}
				// CHECK-INST: zero {za0.s,za2.s,za3.s}
				// CHECK-ENCODING: [0xdd,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: dd 00 08 c0 <unknown>

				zero {za1.s,za2.s,za3.s}
				// CHECK-INST: zero {za1.s,za2.s,za3.s}
				// CHECK-ENCODING: [0xee,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ee 00 08 c0 <unknown>

				zero {za0.s,za1.s,za2.s,za3.s}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d}
				// CHECK-INST: zero {za}
				// CHECK-ENCODING: [0xff,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ff 00 08 c0 <unknown>

				zero {za0.d,za2.d,za4.d,za6.d}
				// CHECK-INST: zero {za0.h}
				// CHECK-ENCODING: [0x55,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 55 00 08 c0 <unknown>

				zero {za1.d,za3.d,za5.d,za7.d}
				// CHECK-INST: zero {za1.h}
				// CHECK-ENCODING: [0xaa,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: aa 00 08 c0 <unknown>

				zero {za0.d,za4.d}
				// CHECK-INST: zero {za0.s}
				// CHECK-ENCODING: [0x11,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 11 00 08 c0 <unknown>

				zero {za1.d,za5.d}
				// CHECK-INST: zero {za1.s}
				// CHECK-ENCODING: [0x22,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 22 00 08 c0 <unknown>

				zero {za2.d,za6.d}
				// CHECK-INST: zero {za2.s}
				// CHECK-ENCODING: [0x44,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 44 00 08 c0 <unknown>

				zero {za3.d,za7.d}
				// CHECK-INST: zero {za3.s}
				// CHECK-ENCODING: [0x88,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 88 00 08 c0 <unknown>

				zero {za0.d,za1.d,za4.d,za5.d}
				// CHECK-INST: zero {za0.s,za1.s}
				// CHECK-ENCODING: [0x33,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 33 00 08 c0 <unknown>

				zero {za0.d,za3.d,za4.d,za7.d}
				// CHECK-INST: zero {za0.s,za3.s}
				// CHECK-ENCODING: [0x99,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 99 00 08 c0 <unknown>

				zero {za1.d,za2.d,za5.d,za6.d}
				// CHECK-INST: zero {za1.s,za2.s}
				// CHECK-ENCODING: [0x66,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 66 00 08 c0 <unknown>

				zero {za2.d,za3.d,za6.d,za7.d}
				// CHECK-INST: zero {za2.s,za3.s}
				// CHECK-ENCODING: [0xcc,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: cc 00 08 c0 <unknown>

				zero {za0.d,za1.d,za2.d,za4.d,za5.d,za6.d}
				// CHECK-INST: zero {za0.s,za1.s,za2.s}
				// CHECK-ENCODING: [0x77,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: 77 00 08 c0 <unknown>

				zero {za0.d,za1.d,za3.d,za4.d,za5.d,za7.d}
				// CHECK-INST: zero {za0.s,za1.s,za3.s}
				// CHECK-ENCODING: [0xbb,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: bb 00 08 c0 <unknown>

				zero {za0.d,za2.d,za3.d,za4.d,za6.d,za7.d}
				// CHECK-INST: zero {za0.s,za2.s,za3.s}
				// CHECK-ENCODING: [0xdd,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: dd 00 08 c0 <unknown>

				zero {za1.d,za2.d,za3.d,za5.d,za6.d,za7.d}
				// CHECK-INST: zero {za1.s,za2.s,za3.s}
				// CHECK-ENCODING: [0xee,0x00,0x08,0xc0]
				// CHECK-ERROR: instruction requires: sme
				// CHECK-UNKNOWN: ee 00 08 c0 <unknown>
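
Every CHECK-ENCODING line above has the form [mask,0x00,0x08,0xc0], which is consistent with a 32-bit little-endian word whose low byte is the tile mask. A small Python sketch reproduces those byte lists; the constant `ZERO_BASE` and the function names are hypothetical and are read off the `zero {}` test case rather than taken from SMEInstrFormats.td, which remains the authoritative definition of the encoding.

```python
# Base opcode inferred from `zero {}`, whose expected encoding is
# [0x00,0x00,0x08,0xc0]; treating the low 8 bits as the mask is an
# assumption drawn from the test expectations, not from the .td file.
ZERO_BASE = 0xC0080000


def zero_encoding(mask):
    """Return the four encoding bytes, in memory order, for a tile mask."""
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in 8 bits")
    return list((ZERO_BASE | mask).to_bytes(4, "little"))


def format_encoding(mask):
    """Format the bytes the way the CHECK-ENCODING lines write them."""
    return "[" + ",".join(f"0x{byte:02x}" for byte in zero_encoding(mask)) + "]"


# Masks taken from the test cases: {za0.h}, {za1.h}, and the
# six-tile .d list za0.d, za1.d, za2.d, za4.d, za5.d, za7.d.
for mask in (0x55, 0xAA, 0xB7):
    print(format_encoding(mask))
```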