This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Clean up parsing fence arguments
Abandoned · Public

Authored by jrtc27 on Mar 6 2021, 11:47 AM.


Details

Reviewers

craig.topper
luismarques
asb

Summary

Currently this is done rather hackily by letting it be parsed as an
MCSymbolRefExpr whose name we then validate. The logic is also split
across isFenceArg and addFenceArgOperands as a result. Instead, use a
custom parser so we look at the identifier token directly and construct
the immediate at the same time. As well as being cleaner, this also
allows us to give better error messages.
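
The validation described in the summary can be sketched outside LLVM as a single pass that checks the letters and builds the immediate together. This is a minimal, self-contained sketch and not the code from the diff: `parseFenceLetters` and `FenceParseResult` are hypothetical names, the error strings are copied from the diff, and the bit values assume the usual RISC-V fence ordering (I, O, R, W from the high bit to the low bit).

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

namespace fence_sketch {

// Assumed bit values, following the RISC-V fence field order I, O, R, W.
enum FenceField : unsigned { I = 8, O = 4, R = 2, W = 1 };

// Hypothetical result type: an immediate on success, a message on failure.
struct FenceParseResult {
  std::optional<unsigned> Imm;
  std::string Error;
};

// Letters must be unique, taken from 'iorw', and in ascending order. As in
// the diff, this holds when each character is one of 'iorw' and is greater
// than the previous character.
inline FenceParseResult parseFenceLetters(std::string_view Identifier) {
  char Prev = '\0';
  unsigned Imm = 0;
  for (char C : Identifier) {
    switch (C) {
    case 'i': Imm |= FenceField::I; break;
    case 'o': Imm |= FenceField::O; break;
    case 'r': Imm |= FenceField::R; break;
    case 'w': Imm |= FenceField::W; break;
    default:
      return {std::nullopt, "letters must be selected from 'iorw'"};
    }
    if (C <= Prev) {
      // Distinguish a repeated letter from an out-of-order one.
      return {std::nullopt, C == Prev ? "letters must not be duplicated"
                                      : "letters must be in the order 'iorw'"};
    }
    Prev = C;
  }
  return {Imm, ""};
}

} // namespace fence_sketch
```

Under these assumptions, `parseFenceLetters("iorw")` yields the immediate 15, while `"iore"`, `"wr"` and `"rr"` each produce a different message, which is the finer-grained diagnosis the summary refers to.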

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	70 ms	x64 debian > Polly.Isl/CodeGen::param_div_div_div_2.ll
	40 ms	x64 debian > Polly.Isl/CodeGen::scop_expander_insert_point.ll

Event Timeline

jrtc27 created this revision. Mar 6 2021, 11:47 AM

Herald added subscribers: vkmr, frasercrmck, evandro and 23 others. Mar 6 2021, 11:47 AM

jrtc27 requested review of this revision. Mar 6 2021, 11:47 AM

Herald added a project: Restricted Project. Mar 6 2021, 11:47 AM

Herald added subscribers: llvm-commits, MaskRay.

Harbormaster completed remote builds in B92501: Diff 328795. Mar 6 2021, 8:13 PM

craig.topper added inline comments. Mar 7 2021, 10:18 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1688	I see this SMLoc::getFromPointer(S.getPointer() - 1) repeated a lot in the assembly parser. Is this doing something I don't understand and setting a valid end location or are we just frequently setting an invalid end location?

jrtc27 added inline comments. Mar 7 2021, 10:28 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1688	Hm, indeed, this seems to be a consistent misunderstanding in the parser here. Normally you'd lex and _then_ get E based on the _new_ location, but this creates a -1-sized range. The -1 itself is also odd, though seems to be rather pervasive across the backends; SMRange is meant to be half-open, so the -1 is already accounted for.

craig.topper added inline comments. Mar 7 2021, 10:32 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1688	You don't even need to Lex the next token. There's a getEndLoc() on the Token class.

jrtc27 added inline comments. Mar 7 2021, 11:34 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1688	In this case, yeah, though some cases in this file are more complex and so would likely need to use that kind of pattern instead.
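
The range arithmetic discussed in this thread can be illustrated with a small model. This is only a sketch under stated assumptions: `Range`, `rangeFromStartMinusOne`, `rangeFromTokenEnd` and `textOf` are hypothetical stand-ins that use buffer offsets, not LLVM's `SMLoc`/`SMRange`, and the half-open convention is taken from jrtc27's comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace range_sketch {

// Simplified stand-in for a source range, using buffer offsets instead of
// pointers. It only models the arithmetic, not LLVM's types.
struct Range {
  std::ptrdiff_t Start;
  std::ptrdiff_t End; // Half-open: one past the last character.
  std::ptrdiff_t size() const { return End - Start; }
};

// The pattern questioned in review: End is derived from Start before the
// token has been consumed, so the range ends before it begins.
inline Range rangeFromStartMinusOne(std::ptrdiff_t TokStart) {
  return {TokStart, TokStart - 1};
}

// Half-open alternative: End is the offset just past the token.
inline Range rangeFromTokenEnd(std::ptrdiff_t TokStart, std::string_view Tok) {
  return {TokStart, TokStart + static_cast<std::ptrdiff_t>(Tok.size())};
}

// Returns the text covered by a well-formed half-open range in Buffer.
inline std::string_view textOf(std::string_view Buffer, Range R) {
  assert(R.size() >= 0 && "range must not be inverted");
  return Buffer.substr(static_cast<std::size_t>(R.Start),
                       static_cast<std::size_t>(R.size()));
}

} // namespace range_sketch
```

For the line `fence iorw, rw`, the first operand starts at offset 6. In this model `rangeFromStartMinusOne(6)` has size -1, whereas `rangeFromTokenEnd(6, "iorw")` has size 4 and covers exactly `iorw`.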

LGTM.

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
1706–1717	Keep the original in-line formatting?

llvm/test/MC/RISCV/rv32i-invalid.s
5–7	letters -> operand letters?
9	ditto.

This revision is now accepted and ready to land. Mar 15 2021, 3:01 AM

Great cleanup, thanks for this.

Obsoleted by d558a70abf2624bf0b26fa8cd506277214fea197

Herald added a project: Restricted Project. May 15 2023, 2:51 PM

Herald added subscribers: jobnoorman, luke, pcwang-thead and 3 others.

evandro removed a subscriber: evandro. May 17 2023, 3:57 PM

Revision Contents

Path	Size
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp	104 lines
llvm/lib/Target/RISCV/RISCVInstrInfo.td	4 lines
llvm/test/MC/RISCV/rv32i-invalid.s	8 lines

Diff 328795

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp

[...]
#include "RISCVGenAsmMatcher.inc"		#include "RISCVGenAsmMatcher.inc"

OperandMatchResultTy parseCSRSystemRegister(OperandVector &Operands);		OperandMatchResultTy parseCSRSystemRegister(OperandVector &Operands);
OperandMatchResultTy parseImmediate(OperandVector &Operands);		OperandMatchResultTy parseImmediate(OperandVector &Operands);
OperandMatchResultTy parseRegister(OperandVector &Operands,		OperandMatchResultTy parseRegister(OperandVector &Operands,
bool AllowParens = false);		bool AllowParens = false);
OperandMatchResultTy parseMemOpBaseReg(OperandVector &Operands);		OperandMatchResultTy parseMemOpBaseReg(OperandVector &Operands);
OperandMatchResultTy parseAtomicMemOp(OperandVector &Operands);		OperandMatchResultTy parseAtomicMemOp(OperandVector &Operands);
		OperandMatchResultTy parseFenceArg(OperandVector &Operands);
OperandMatchResultTy parseOperandWithModifier(OperandVector &Operands);		OperandMatchResultTy parseOperandWithModifier(OperandVector &Operands);
OperandMatchResultTy parseBareSymbol(OperandVector &Operands);		OperandMatchResultTy parseBareSymbol(OperandVector &Operands);
OperandMatchResultTy parseCallSymbol(OperandVector &Operands);		OperandMatchResultTy parseCallSymbol(OperandVector &Operands);
OperandMatchResultTy parsePseudoJumpSymbol(OperandVector &Operands);		OperandMatchResultTy parsePseudoJumpSymbol(OperandVector &Operands);
OperandMatchResultTy parseJALOffset(OperandVector &Operands);		OperandMatchResultTy parseJALOffset(OperandVector &Operands);
OperandMatchResultTy parseVTypeI(OperandVector &Operands);		OperandMatchResultTy parseVTypeI(OperandVector &Operands);
OperandMatchResultTy parseMaskReg(OperandVector &Operands);		OperandMatchResultTy parseMaskReg(OperandVector &Operands);

[...]
bool isTPRelAddSymbol() const {		bool isTPRelAddSymbol() const {
return RISCVAsmParser::classifySymbolRef(getImm(), VK) &&		return RISCVAsmParser::classifySymbolRef(getImm(), VK) &&
VK == RISCVMCExpr::VK_RISCV_TPREL_ADD;		VK == RISCVMCExpr::VK_RISCV_TPREL_ADD;
}		}

bool isCSRSystemRegister() const { return isSystemRegister(); }		bool isCSRSystemRegister() const { return isSystemRegister(); }

bool isVTypeI() const { return isVType(); }		bool isVTypeI() const { return isVType(); }

/// Return true if the operand is a valid for the fence instruction e.g.
/// ('iorw').
bool isFenceArg() const {		bool isFenceArg() const {
		int64_t Imm;
		RISCVMCExpr::VariantKind VK = RISCVMCExpr::VK_RISCV_None;
if (!isImm())		if (!isImm())
return false;		return false;
const MCExpr *Val = getImm();		bool IsConstantImm = evaluateConstantImm(getImm(), Imm, VK);
auto *SVal = dyn_cast<MCSymbolRefExpr>(Val);		if (!IsConstantImm)
if (!SVal || SVal->getKind() != MCSymbolRefExpr::VK_None)
return false;

StringRef Str = SVal->getSymbol().getName();
// Letters must be unique, taken from 'iorw', and in ascending order. This
// holds as long as each individual character is one of 'iorw' and is
// greater than the previous character.
char Prev = '\0';
for (char c : Str) {
if (c != 'i' && c != 'o' && c != 'r' && c != 'w')
return false;
if (c <= Prev)
return false;		return false;
Prev = c;		int64_t Mask = RISCVFenceField::I | RISCVFenceField::O |
}		RISCVFenceField::R | RISCVFenceField::W;
return true;		return (Imm & ~Mask) == 0;
}		}

/// Return true if the operand is a valid floating point rounding mode.		/// Return true if the operand is a valid floating point rounding mode.
bool isFRMArg() const {		bool isFRMArg() const {
if (!isImm())		if (!isImm())
return false;		return false;
const MCExpr *Val = getImm();		const MCExpr *Val = getImm();
auto *SVal = dyn_cast<MCSymbolRefExpr>(Val);		auto *SVal = dyn_cast<MCSymbolRefExpr>(Val);
[...]
void addRegOperands(MCInst &Inst, unsigned N) const {		void addRegOperands(MCInst &Inst, unsigned N) const {
Inst.addOperand(MCOperand::createReg(getReg()));		Inst.addOperand(MCOperand::createReg(getReg()));
}		}

void addImmOperands(MCInst &Inst, unsigned N) const {		void addImmOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");		assert(N == 1 && "Invalid number of operands!");
addExpr(Inst, getImm());		addExpr(Inst, getImm());
}		}

void addFenceArgOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
// isFenceArg has validated the operand, meaning this cast is safe
auto SE = cast<MCSymbolRefExpr>(getImm());

unsigned Imm = 0;
for (char c : SE->getSymbol().getName()) {
switch (c) {
default:
llvm_unreachable("FenceArg must contain only [iorw]");
case 'i': Imm |= RISCVFenceField::I; break;
case 'o': Imm |= RISCVFenceField::O; break;
case 'r': Imm |= RISCVFenceField::R; break;
case 'w': Imm |= RISCVFenceField::W; break;
}
}
Inst.addOperand(MCOperand::createImm(Imm));
}

void addCSRSystemRegisterOperands(MCInst &Inst, unsigned N) const {		void addCSRSystemRegisterOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");		assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createImm(SysReg.Encoding));		Inst.addOperand(MCOperand::createImm(SysReg.Encoding));
}		}

void addVTypeIOperands(MCInst &Inst, unsigned N) const {		void addVTypeIOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");		assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createImm(getVType()));		Inst.addOperand(MCOperand::createImm(getVType()));
[...]
case Match_InvalidSImm21Lsb0JAL:		case Match_InvalidSImm21Lsb0JAL:
return generateImmOutOfRangeError(		return generateImmOutOfRangeError(
Operands, ErrorInfo, -(1 << 20), (1 << 20) - 2,		Operands, ErrorInfo, -(1 << 20), (1 << 20) - 2,
"immediate must be a multiple of 2 bytes in the range");		"immediate must be a multiple of 2 bytes in the range");
case Match_InvalidCSRSystemRegister: {		case Match_InvalidCSRSystemRegister: {
return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 12) - 1,		return generateImmOutOfRangeError(Operands, ErrorInfo, 0, (1 << 12) - 1,
"operand must be a valid system register "		"operand must be a valid system register "
"name or an integer in the range");		"name or an integer in the range");
}		}
case Match_InvalidFenceArg: {
SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
return Error(
ErrorLoc,
"operand must be formed of letters selected in-order from 'iorw'");
}
case Match_InvalidFRMArg: {		case Match_InvalidFRMArg: {
SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();		SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
return Error(		return Error(
ErrorLoc,		ErrorLoc,
"operand must be a valid floating point rounding mode mnemonic");		"operand must be a valid floating point rounding mode mnemonic");
}		}
case Match_InvalidBareSymbol: {		case Match_InvalidBareSymbol: {
SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();		SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
[...]
if (OptionalImmOp && !OptionalImmOp->isImmZero()) {		if (OptionalImmOp && !OptionalImmOp->isImmZero()) {
Error(OptionalImmOp->getStartLoc(), "optional integer offset must be 0",		Error(OptionalImmOp->getStartLoc(), "optional integer offset must be 0",
SMRange(OptionalImmOp->getStartLoc(), OptionalImmOp->getEndLoc()));		SMRange(OptionalImmOp->getStartLoc(), OptionalImmOp->getEndLoc()));
return MatchOperand_ParseFail;		return MatchOperand_ParseFail;
}		}

return MatchOperand_Success;		return MatchOperand_Success;
}		}

		OperandMatchResultTy RISCVAsmParser::parseFenceArg(OperandVector &Operands) {
		SMLoc S = getLoc();
		SMLoc E = SMLoc::getFromPointer(S.getPointer() - 1);
		const MCExpr *Res;

		if (getLexer().getKind() != AsmToken::Identifier) {
		Error(S, "operand must be formed of letters selected in-order from 'iorw'");
		return MatchOperand_ParseFail;
		}

		StringRef Identifier = getParser().getTok().getIdentifier();
		getParser().Lex();

		// Letters must be unique, taken from 'iorw', and in ascending order. This
		// holds as long as each individual character is one of 'iorw' and is
		// greater than the previous character.
		char Prev = '\0';
		unsigned Imm = 0;
		for (char c : Identifier) {
		Lint: Pre-merge checks: clang-tidy: warning: invalid case style for variable 'c' [readability-identifier-naming]
		switch (c) {
		case 'i':
		Imm |= RISCVFenceField::I;
		break;
		case 'o':
		Imm |= RISCVFenceField::O;
		break;
		case 'r':
		Imm |= RISCVFenceField::R;
		break;
		case 'w':
		Imm |= RISCVFenceField::W;
		break;
		default:
		Error(S, "letters must be selected from 'iorw'");
		return MatchOperand_ParseFail;
		}

		if (c <= Prev) {
		if (c == Prev)
		Error(S, "letters must not be duplicated");
		else
		Error(S, "letters must be in the order 'iorw'");
		return MatchOperand_ParseFail;
		}
		Prev = c;
		}

		Res = MCConstantExpr::create(Imm, getContext());
		Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));
		return MatchOperand_Success;
		}

/// Looks at a token type and creates the relevant operand from this		/// Looks at a token type and creates the relevant operand from this
/// information, adding to Operands. If operand was parsed, returns false, else		/// information, adding to Operands. If operand was parsed, returns false, else
/// true.		/// true.
bool RISCVAsmParser::parseOperand(OperandVector &Operands, StringRef Mnemonic) {		bool RISCVAsmParser::parseOperand(OperandVector &Operands, StringRef Mnemonic) {
// Check if the current operand has a custom associated parser, if so, try to		// Check if the current operand has a custom associated parser, if so, try to
// custom parse the operand, or fallback to the general approach.		// custom parse the operand, or fallback to the general approach.
OperandMatchResultTy Result =		OperandMatchResultTy Result =
MatchOperandParserImpl(Operands, Mnemonic, /*ParseForAllFeatures=*/true);		MatchOperandParserImpl(Operands, Mnemonic, /*ParseForAllFeatures=*/true);
[...]

llvm/lib/Target/RISCV/RISCVInstrInfo.td

	[...]
	}			}

	class UImmAsmOperand<int width, string suffix = "">			class UImmAsmOperand<int width, string suffix = "">
	: ImmAsmOperand<"U", width, suffix> {			: ImmAsmOperand<"U", width, suffix> {
	}			}

	def FenceArg : AsmOperandClass {			def FenceArg : AsmOperandClass {
	let Name = "FenceArg";			let Name = "FenceArg";
	let RenderMethod = "addFenceArgOperands";			let RenderMethod = "addImmOperands";
	let DiagnosticType = "InvalidFenceArg";			let ParserMethod = "parseFenceArg";
	}			}

	def fencearg : Operand<XLenVT> {			def fencearg : Operand<XLenVT> {
	let ParserMatchClass = FenceArg;			let ParserMatchClass = FenceArg;
	let PrintMethod = "printFenceArg";			let PrintMethod = "printFenceArg";
	let DecoderMethod = "decodeUImmOperand<4>";			let DecoderMethod = "decodeUImmOperand<4>";
	let OperandType = "OPERAND_UIMM4";			let OperandType = "OPERAND_UIMM4";
	let OperandNamespace = "RISCVOp";			let OperandNamespace = "RISCVOp";
	[...]

llvm/test/MC/RISCV/rv32i-invalid.s

	# RUN: not llvm-mc -triple riscv32 < %s 2>&1 | FileCheck %s			# RUN: not llvm-mc -triple riscv32 < %s 2>&1 | FileCheck %s

	# Out of range immediates			# Out of range immediates
	## fencearg			## fencearg
	fence iorw, iore # CHECK: :[[@LINE]]:13: error: operand must be formed of letters selected in-order from 'iorw'			fence iorw, iore # CHECK: :[[@LINE]]:13: error: letters must be selected from 'iorw'
	fence wr, wr # CHECK: :[[@LINE]]:7: error: operand must be formed of letters selected in-order from 'iorw'			fence wr, wr # CHECK: :[[@LINE]]:7: error: letters must be in the order 'iorw'
	fence rw, rr # CHECK: :[[@LINE]]:11: error: operand must be formed of letters selected in-order from 'iorw'			fence rw, rr # CHECK: :[[@LINE]]:11: error: letters must not be duplicated
	fence 1, rw # CHECK: :[[@LINE]]:7: error: operand must be formed of letters selected in-order from 'iorw'			fence 1, rw # CHECK: :[[@LINE]]:7: error: operand must be formed of letters selected in-order from 'iorw'
	fence unknown, unknown # CHECK: :[[@LINE]]:7: error: operand must be formed of letters selected in-order from 'iorw'			fence unknown, unknown # CHECK: :[[@LINE]]:7: error: letters must be selected from 'iorw'

	## uimm5			## uimm5
	slli a0, a0, 32 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]			slli a0, a0, 32 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]
	srli a0, a0, -1 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]			srli a0, a0, -1 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]
	srai a0, a0, -19 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]			srai a0, a0, -19 # CHECK: :[[@LINE]]:14: error: immediate must be an integer in the range [0, 31]
	csrrwi a1, 0x1, -1 # CHECK: :[[@LINE]]:17: error: immediate must be an integer in the range [0, 31]			csrrwi a1, 0x1, -1 # CHECK: :[[@LINE]]:17: error: immediate must be an integer in the range [0, 31]
	csrrsi t1, 999, 32 # CHECK: :[[@LINE]]:17: error: immediate must be an integer in the range [0, 31]			csrrsi t1, 999, 32 # CHECK: :[[@LINE]]:17: error: immediate must be an integer in the range [0, 31]
	csrrci x0, 43, -90 # CHECK: :[[@LINE]]:16: error: immediate must be an integer in the range [0, 31]			csrrci x0, 43, -90 # CHECK: :[[@LINE]]:16: error: immediate must be an integer in the range [0, 31]
	[...]