This is an archive of the discontinued LLVM Phabricator instance.

[MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm"
Closed, Public

Authored by labrinea on Sep 21 2015, 5:13 AM.

Details

Reviewers

rengolin
grosbach
t.p.northover

Commits

rG1bab191f25ac: [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm"…
rL249313: [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for

Summary

llvm-mc incorrectly encodes and decodes "msr pan, #imm" instructions. Only 1-bit immediate values should be valid. Currently 4-bit immediates are accepted.

$ echo "0x9f 0x42 0x00 0xd5" | llvm-mc -disassemble -triple=aarch64 -mattr=+v8.1a
        .text
        msr     PAN, #2
$ echo "msr pan, #2" | llvm-mc -triple=aarch64 -mattr=+v8.1a
        .text
        msr     PAN, #2
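
To see where the #2 comes from, the example word can be split into the fields the disassembler reads. This is a minimal sketch, assuming the field positions used by DecodeSystemPStateInstruction in this patch (op1 at bits 16-18, CRm at bits 8-11, op2 at bits 5-7). The struct and function names are hypothetical and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not LLVM API.
struct MsrImmFields {
  uint32_t op1; // bits [18:16]
  uint32_t crm; // bits [11:8], printed as the immediate
  uint32_t op2; // bits [7:5]
};

// Assemble four little-endian bytes into one 32-bit instruction word.
constexpr uint32_t wordFromBytes(uint8_t b0, uint8_t b1, uint8_t b2,
                                 uint8_t b3) {
  return uint32_t(b0) | (uint32_t(b1) << 8) | (uint32_t(b2) << 16) |
         (uint32_t(b3) << 24);
}

// Extract `width` bits starting at bit `start`.
constexpr uint32_t field(uint32_t insn, unsigned start, unsigned width) {
  return (insn >> start) & ((1u << width) - 1u);
}

// Split an "msr <pstate>, #imm" word into its three variable fields.
constexpr MsrImmFields decodeMsrImm(uint32_t insn) {
  return {field(insn, 16, 3), field(insn, 8, 4), field(insn, 5, 3)};
}
```

For the bytes 0x9f 0x42 0x00 0xd5, decodeMsrImm yields op1 = 0, CRm = 2, and op2 = 4, so the disassembler prints the full 4-bit CRm value as the immediate.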

Diff Detail

Event Timeline

labrinea updated this revision to Diff 35231. Sep 21 2015, 5:13 AM

labrinea retitled this revision to [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm".

labrinea updated this object.

labrinea added reviewers: grosbach, t.p.northover.

Herald added subscribers: rengolin, aemerson. Sep 21 2015, 5:13 AM

Hi Alexandros,

Is this really an error? The MSR immediate is a 4-bit field. I understand why privileged access only needs a boolean, but if the CPU ignores the other three bits, we could maybe safely ignore it? Just a guess....

cheers,
--renato

As you can see from the above commands, the MC layer does not give any errors or warnings when encoding an assembly string. Furthermore it incorrectly decodes a byte stream.

In D13011#249979, @labrinea wrote:

As you can see from the above commands, the MC layer does not give any errors or warnings when encoding an assembly string. Furthermore it incorrectly decodes a byte stream.

No, I mean, in the CPU. Would that trigger a trap if the immediate is not 000x?

The MSR instruction is explicitly documented as taking a 4-bit immediate operand.

You're arguing that the assembler should contextually decide what is or isn't legal for the second operand based on the value of the first? That's generally not something we do for these sorts of instructions unless there is documentation requiring it.

In D13011#249988, @grosbach wrote:

You're arguing that the assembler should contextually decide what is or isn't legal for the second operand based on the value of the first? That's generally not something we do for these sorts of instructions unless there is documentation requiring it.

I second that. Until we have an update on the ARM ARM stating that PAN's immediate has to be 1 bit encoded and/or we have hardware to show that it traps, we shouldn't be changing this.

This revision now requires changes to proceed. Sep 22 2015, 5:12 PM

Until we have an update on the ARM ARM stating that PAN's immediate has to be 1 bit encoded

The ARM ARMv8.1 (which hasn't been published yet, but LLVM's implementation of ARMv8.1 is based on it) lists all encodings of "MSR PAN" with immediates > 1 as UNPREDICTABLE, meaning that the processor may do any of these:

Trap
Treat the instruction as a NOP
Ignore the high bits of the immediate
Write an arbitrary value to the destination register

Shall we reconsider keeping this patch after Oliver's feedback?

In D13011#254532, @olista01 wrote:

The ARM ARMv8.1 (which hasn't been published yet, but LLVM's implementation of ARMv8.1 is based on it) lists all encodings of "MSR PAN" with immediates > 1 as UNPREDICTABLE

When is it going to be public?

I'm not doubting you, but I don't want to set the precedent to implement things that aren't public. Though, as you say, all ARMv8.1 is based on that document, so I'm not sure what to choose.

Personally, I'm ok with this going in, as long as we're explicit that this is v8.1 only (even if PAN is not available before, not everyone knows that). Jim, are you ok with this, too?

Also, please run clang-format on your change.

cheers,
--renato

lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3929	Include the fact that this is armv8.1 only in the comment.
3946	bad format, please pass clang-format on this.

I don't see the text which describes the other values as UNPREDICTABLE. Can you provide a doc number and section reference? The v8.1 extension doc describes the other bits of the CRm field as being zero, but I don't see a mention of what happens if they're not. I believe you, I'm just wondering where I should be looking that I'm obviously not.

That aside, is there perhaps a cleaner way to implement this? Perhaps via a generic instruction definition that allows all encoding values and then assembly aliases that have more restricted operand sets? If we can avoid it, I'd really like to stay away from more C++ fiddling in the parser. We already have way too much of it. :(

Can you provide a doc number and section reference?

The v8.1 spec, section 8.1.3, lists the top 3 bits of the CRm field as "(0)(0)(0)". The v8A ARMARM, section C2.2.1 defines that the behaviour is CONSTRAINED UNPREDICTABLE if any of these bits are 1, and lists the allowed behaviours.
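
The "(0)(0)(0)" notation marks should-be-zero bits. A minimal sketch of testing them, assuming CRm occupies bits 8-11 as in the disassembler code of this patch; the function name is hypothetical, and the caller is assumed to have already established that the word is an "msr pan, #imm" instruction.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if any of the top three CRm bits
// (CRm[3:1]) is set. It does not check that the word targets PAN.
constexpr bool panImmHasReservedBits(uint32_t insn) {
  return (((insn >> 8) & 0xFu) & 0xEu) != 0;
}
```

With the words from this review, 0xd500409f (#0) and 0xd500419f (#1) pass, while 0xd500429f (#2) has a should-be-zero bit set.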

Reimplemented via a generic instruction definition that allows all encoding values and then assembly aliases that have more restricted operand sets.

Hi Alexandros,

With the nitpick, looks good to me. Thanks!

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Nitpick: I'd extract this on the line above, like: auto State = Immed < 2 ? AArch64::MSRpstateImm1 : AArch64::MSRpstateImm4; return CurDAG->getMachineNode(State, DL, ...

This revision is now accepted and ready to land. Oct 3 2015, 7:32 AM

labrinea added inline comments. Oct 4 2015, 12:35 PM

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Sure, thanks!

rengolin added inline comments. Oct 4 2015, 12:38 PM

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Wait, wouldn't this make it use the wrong imm encoding for other calls if their values happened to be < 2?

labrinea added inline comments. Oct 5 2015, 3:07 AM

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Good point: the check should be based on the value of Reg, and MVT::i16 should also be changed to MVT::i1 for the AArch64::MSRpstateImm1 opcode. I am updating the patch.

labrinea added inline comments. Oct 5 2015, 3:34 AM

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Sorry, I got confused: I thought MVT::i16 meant values from 0 to 16 rather than 16 bits. I don't think the MVT enumeration really matters, since MVT::i16 was already wider than the 4 bits required for the immediate. I think we should just change the patch like this: auto State = (Reg == AArch64PState::PAN) ? AArch64::MSRpstateImm1 : AArch64::MSRpstateImm4;

rengolin added inline comments. Oct 5 2015, 3:38 AM

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2268 ↗	(On Diff #36340)	Yes, that's what I was thinking, too. Maybe also adding an assert? if (Reg == AArch64PState::PAN) assert(Immed < 2 && "Bad imm");

AArch64DAGToDAGISel::SelectWriteRegister updated.
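
The selection rule agreed in the inline comments above can be sketched in isolation. This is a minimal sketch, not the committed code: the two enums are stand-ins for AArch64PState::PAN and the AArch64::MSRpstateImm1 / AArch64::MSRpstateImm4 opcodes named in the review, and their enumerator values are placeholders.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the LLVM names used in the review; values are placeholders.
enum class PStateField { PAN, Other };
enum class Opcode { MSRpstateImm1, MSRpstateImm4 };

// Choose the opcode from the destination field, not from the immediate's
// value, so a small immediate for another field still gets the 4-bit form.
inline Opcode selectMsrOpcode(PStateField Reg, uint64_t Immed) {
  if (Reg == PStateField::PAN) {
    assert(Immed < 2 && "Bad imm");
    return Opcode::MSRpstateImm1;
  }
  (void)Immed; // only inspected for PAN in this sketch
  return Opcode::MSRpstateImm4;
}
```

Keying on Reg avoids the problem raised earlier in the thread: with the first suggestion (Immed < 2), a write of #1 to a field other than PAN would have been given the 1-bit opcode.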

With the nitpick, LGTM. Thanks!

lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
2272 ↗	(On Diff #36499)	nitpick: no new line here. :)

Closed by commit rL249313: [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for (authored by alelab01). Oct 5 2015, 6:44 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp	28 lines
lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp	3 lines
test/MC/AArch64/armv8.1a-pan.s	10 lines
test/MC/Disassembler/AArch64/armv8.1a-pan.txt	2 lines

Diff 35231

lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp

[377 lines hidden]
unsigned getVectorIndex() const {
return VectorIndex.Val;		return VectorIndex.Val;
}		}

StringRef getSysReg() const {		StringRef getSysReg() const {
assert(Kind == k_SysReg && "Invalid access!");		assert(Kind == k_SysReg && "Invalid access!");
return StringRef(SysReg.Data, SysReg.Length);		return StringRef(SysReg.Data, SysReg.Length);
}		}

		uint32_t getSysPStateField() const {
		assert(Kind == k_SysReg && "Invalid access!");
		return SysReg.PStateField;
		}

unsigned getSysCR() const {		unsigned getSysCR() const {
assert(Kind == k_SysCR && "Invalid access!");		assert(Kind == k_SysCR && "Invalid access!");
return SysCRImm.Val;		return SysCRImm.Val;
}		}

unsigned getPrefetch() const {		unsigned getPrefetch() const {
assert(Kind == k_Prefetch && "Invalid access!");		assert(Kind == k_Prefetch && "Invalid access!");
return Prefetch.Val;		return Prefetch.Val;
[3,522 lines hidden]
if (RegOp.isReg() && ImmOp.isFPImm() && ImmOp.getFPImm() == (unsigned)-1) {
RegOp.getReg())		RegOp.getReg())
? AArch64::WZR		? AArch64::WZR
: AArch64::XZR;		: AArch64::XZR;
Operands[2] = AArch64Operand::CreateReg(zreg, false, Op.getStartLoc(),		Operands[2] = AArch64Operand::CreateReg(zreg, false, Op.getStartLoc(),
Op.getEndLoc(), getContext());		Op.getEndLoc(), getContext());
}		}
}		}

		// Hack to handle MSR Pstate expecting 1-bit Imm
		[Inline comment, rengolin] Include the fact that this is armv8.1 only in the comment.
		if (NumOperands == 3 && Tok == "msr") {

		AArch64Operand &Op1 = static_cast<AArch64Operand &>(*Operands[1]);
		AArch64Operand &Op2 = static_cast<AArch64Operand &>(*Operands[2]);

		if (Op1.isSystemPStateField() &&
		Op1.getSysPStateField() == AArch64PState::PAN) {

		// should be 1-bit Imm, or GPR64
		if (Op2.isImm()) {

		const MCConstantExpr *MCE = dyn_cast<MCConstantExpr>(Op2.getImm());

		if (!MCE || MCE->getValue() < 0 || MCE->getValue() > 1)
		return Error(Op2.getStartLoc(),
		"immediate must be an integer in range [0, 1].");
		}
		[Inline comment, rengolin] bad format, please pass clang-format on this.
		else if (!Op2.isGPR32as64())
		return showMatchError(Op2.getStartLoc(), Match_InvalidOperand);
		}
		}

MCInst Inst;		MCInst Inst;
// First try to match against the secondary set of tables containing the		// First try to match against the secondary set of tables containing the
// short-form NEON instructions (e.g. "fadd.2s v0, v1, v2").		// short-form NEON instructions (e.g. "fadd.2s v0, v1, v2").
unsigned MatchResult =		unsigned MatchResult =
MatchInstructionImpl(Operands, Inst, ErrorInfo, MatchingInlineAsm, 1);		MatchInstructionImpl(Operands, Inst, ErrorInfo, MatchingInlineAsm, 1);

// If that fails, try against the alternate table containing long-form NEON:		// If that fails, try against the alternate table containing long-form NEON:
// "fadd v0.2s, v1.2s, v2.2s"		// "fadd v0.2s, v1.2s, v2.2s"
[559 lines hidden]

lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp

[1,510 lines hidden]
static DecodeStatus DecodeSystemPStateInstruction(llvm::MCInst &Inst,
uint32_t insn, uint64_t Addr,		uint32_t insn, uint64_t Addr,
const void *Decoder) {		const void *Decoder) {
uint64_t op1 = fieldFromInstruction(insn, 16, 3);		uint64_t op1 = fieldFromInstruction(insn, 16, 3);
uint64_t op2 = fieldFromInstruction(insn, 5, 3);		uint64_t op2 = fieldFromInstruction(insn, 5, 3);
uint64_t crm = fieldFromInstruction(insn, 8, 4);		uint64_t crm = fieldFromInstruction(insn, 8, 4);

uint64_t pstate_field = (op1 << 3) | op2;		uint64_t pstate_field = (op1 << 3) | op2;

		if (pstate_field == AArch64PState::PAN && crm > 1)
		return Fail;

Inst.addOperand(MCOperand::createImm(pstate_field));		Inst.addOperand(MCOperand::createImm(pstate_field));
Inst.addOperand(MCOperand::createImm(crm));		Inst.addOperand(MCOperand::createImm(crm));

bool ValidNamed;		bool ValidNamed;
const AArch64Disassembler *Dis =		const AArch64Disassembler *Dis =
static_cast<const AArch64Disassembler *>(Decoder);		static_cast<const AArch64Disassembler *>(Decoder);
(void)AArch64PState::PStateMapper().toString(pstate_field,		(void)AArch64PState::PStateMapper().toString(pstate_field,
Dis->getSubtargetInfo().getFeatureBits(), ValidNamed);		Dis->getSubtargetInfo().getFeatureBits(), ValidNamed);
[59 lines hidden]

test/MC/AArch64/armv8.1a-pan.s

	// RUN: not llvm-mc -triple aarch64-none-linux-gnu -mattr=+v8.1a -show-encoding < %s 2> %t | FileCheck %s			// RUN: not llvm-mc -triple aarch64-none-linux-gnu -mattr=+v8.1a -show-encoding < %s 2> %t | FileCheck %s
	// RUN: FileCheck --check-prefix=CHECK-ERROR %s < %t			// RUN: FileCheck --check-prefix=CHECK-ERROR %s < %t

	.text			.text

	msr pan, #0			msr pan, #0
	// CHECK: msr PAN, #0 // encoding: [0x9f,0x40,0x00,0xd5]			// CHECK: msr PAN, #0 // encoding: [0x9f,0x40,0x00,0xd5]
	msr pan, #1			msr pan, #1
	// CHECK: msr PAN, #1 // encoding: [0x9f,0x41,0x00,0xd5]			// CHECK: msr PAN, #1 // encoding: [0x9f,0x41,0x00,0xd5]
	msr pan, x5			msr pan, x5
	// CHECK: msr PAN, x5 // encoding: [0x65,0x42,0x18,0xd5]			// CHECK: msr PAN, x5 // encoding: [0x65,0x42,0x18,0xd5]
	mrs x13, pan			mrs x13, pan
	// CHECK: mrs x13, PAN // encoding: [0x6d,0x42,0x38,0xd5]			// CHECK: mrs x13, PAN // encoding: [0x6d,0x42,0x38,0xd5]

	msr pan, #-1			msr pan, #-1
	msr pan, #20			msr pan, #2
	msr pan, w0			msr pan, w0
	mrs w0, pan			mrs w0, pan
	// CHECK-ERROR: error: immediate must be an integer in range [0, 15].			// CHECK-ERROR: error: immediate must be an integer in range [0, 1].
	// CHECK-ERROR: msr pan, #-1			// CHECK-ERROR: msr pan, #-1
	// CHECK-ERROR: ^			// CHECK-ERROR: ^
	// CHECK-ERROR: error: immediate must be an integer in range [0, 15].			// CHECK-ERROR: error: immediate must be an integer in range [0, 1].
	// CHECK-ERROR: msr pan, #20			// CHECK-ERROR: msr pan, #2
	// CHECK-ERROR: ^			// CHECK-ERROR: ^
	// CHECK-ERROR: error: immediate must be an integer in range [0, 15].			// CHECK-ERROR: error: invalid operand for instruction
	// CHECK-ERROR: msr pan, w0			// CHECK-ERROR: msr pan, w0
	// CHECK-ERROR: ^			// CHECK-ERROR: ^
	// CHECK-ERROR: error: invalid operand for instruction			// CHECK-ERROR: error: invalid operand for instruction
	// CHECK-ERROR: mrs w0, pan			// CHECK-ERROR: mrs w0, pan
	// CHECK-ERROR: ^			// CHECK-ERROR: ^
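
The expected encodings in the CHECK lines can be reproduced by hand. This is a minimal sketch, assuming the base pattern 0xd500401f for the immediate form of MSR, with op1, CRm, and op2 placed at bits 16, 8, and 5 (the positions the disassembler in this patch reads). The function names and the kPanField constant are illustrative, inferred from the test vectors, and are not LLVM API.

```cpp
#include <array>
#include <cstdint>

// Pstate field value implied by the test bytes: (op1 << 3) | op2 with
// op1 = 0 and op2 = 4. Named here for illustration only.
constexpr uint32_t kPanField = 0x04;

// Build the instruction word for "msr <pstate>, #imm".
constexpr uint32_t encodeMsrImm(uint32_t pstateField, uint32_t imm) {
  return 0xd500401fu | ((pstateField >> 3) << 16) | ((imm & 0xFu) << 8) |
         ((pstateField & 0x7u) << 5);
}

// Split a word into the little-endian byte order that llvm-mc prints.
inline std::array<uint8_t, 4> toBytes(uint32_t w) {
  return {{uint8_t(w), uint8_t(w >> 8), uint8_t(w >> 16), uint8_t(w >> 24)}};
}
```

encodeMsrImm(kPanField, 0) and encodeMsrImm(kPanField, 1) give the byte sequences 0x9f,0x40,0x00,0xd5 and 0x9f,0x41,0x00,0xd5 from the CHECK lines. Only the low nibble of the second byte changes with the immediate.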

test/MC/Disassembler/AArch64/armv8.1a-pan.txt

	# RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=+v8.1a --disassemble < %s | FileCheck %s			# RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=+v8.1a --disassemble < %s | FileCheck %s

	0x9f,0x40,0x00,0xd5			0x9f,0x40,0x00,0xd5
	0x9f,0x41,0x00,0xd5			0x9f,0x41,0x00,0xd5
				0x9f,0x42,0x00,0xd5
	0x65,0x42,0x18,0xd5			0x65,0x42,0x18,0xd5
	0x6d,0x42,0x38,0xd5			0x6d,0x42,0x38,0xd5
	# CHECK: msr PAN, #0			# CHECK: msr PAN, #0
	# CHECK: msr PAN, #1			# CHECK: msr PAN, #1
				# CHECK-NOT: msr PAN, #2
	# CHECK: msr PAN, x5			# CHECK: msr PAN, x5
	# CHECK: mrs x13, PAN			# CHECK: mrs x13, PAN
