This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Target/RISCV/AsmParser/
-
Target/
-
RISCV/
-
AsmParser/
-
RISCVAsmParser.cpp
-
test/MC/RISCV/
-
MC/
-
RISCV/
-
rvi-pseudos.s
-
rvv/
-
vsetvl.s

Differential D153204

RISCVAsmParser: support comments in more places
Needs ReviewPublic

Authored by MaskRay on Jun 17 2023, 10:22 AM.

Download Raw Diff

Details

Reviewers

asb
craig.topper
jrtc27
abel-bernabeu
DavidSpickett
rogfer01
barannikov88
• s-barannikov

Summary

Currently we fail to parse an instruction when comments occur in certain places
e.g. after a register or after a bare symbol:

x.s:1:8: error: unexpected token
addi a0/**/, a1, 1
       ^
x.s:2:15: error: unknown token in expression
lla a0, sym - /**/1
              ^
x.s:3:18: error: operand must be e[8|16|32|64|128|256|512|1024],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]
vsetivli a2, 15, e32/**/, m1, ta, ma
                 ^

Call MCAsmParser::Lex instead of
MCAsmLexer::Lex to fix the issue by consuming the comments.

Reported by https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MaskRay created this revision.Jun 17 2023, 10:22 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2023, 10:22 AM

Herald added subscribers: luke, VincentWu, vkmr and 25 others. · View Herald Transcript

MaskRay requested review of this revision.Jun 17 2023, 10:22 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2023, 10:22 AM

Herald added subscribers: llvm-commits, wangpc, eopXD. · View Herald Transcript

MaskRay edited the summary of this revision. (Show Details)Jun 17 2023, 10:26 AM

support vsetvli

MaskRay mentioned this in D153008: [RISCV] Allow slash-star comments in instruction operands.Jun 17 2023, 10:39 AM

Harbormaster completed remote builds in B239622: Diff 532415.Jun 17 2023, 11:13 AM

Use "LI" rather than "VSETIVLI". It really does not matter what instruction you use for testing the handling of comments. Think of those backporting the patch to very old versions of LLVM.

Make an independent file for this test. When the test fails because of the handling of comments the test name would point very straight to the issue.

Check that the comments are placed into the assembler rather than ignored.

We spent a bit of time on the review of the original patch proposal for figuring out those things already. Cherry-pick my test at least :-D

https://reviews.llvm.org/D153008

In D153204#4430646, @abel-bernabeu wrote:

Use "LI" rather than "VSETIVLI". It really does not matter what instruction you use for testing the handling of comments. Think of those backporting the patch to very old versions of LLVM.

One of the changes is in the vsetivli specific parsing code.

In D153204#4430679, @craig.topper wrote:

In D153204#4430646, @abel-bernabeu wrote:

Use "LI" rather than "VSETIVLI". It really does not matter what instruction you use for testing the handling of comments. Think of those backporting the patch to very old versions of LLVM.

One of the changes is in the vsetivli specific parsing code.

The change proposed by Sergei Barannikov was to just search and replace getLexer().Lex() for getParser.Lex() or simply Lex(), which is exactly what I did on the original merge request. This solution was preferred over surgically fixing just the occurrences that were deemed needed for most common cases.

This patch, replaces just 5 out of 23 occurrences which are needed for the vsetivli instruction to get interleaved comments working. I do not see how this merge request is a better alternative to https://reviews.llvm.org/D153008 or why having the functionality only working for vsetivli is in any way better than just applying a clean search and replace for fixing every instruction.

There are reviewers missing that gave useful feedback for the older merge request.

Sergei Barannikov will not be mentioned on the git history, when we are just applying, testing and committing his generic solution for fixing the problem of comments interleaving on virtually any backend.

Ultimately the code owners can fix in whatever way they feel is better, and as a newcomer I would stick to whatever is agreed here, but my personal vote is to reject this merge request and unblock the original one.

abel-bernabeu added reviewers: DavidSpickett, rogfer01, barannikov88, • s-barannikov.Jun 18 2023, 12:52 PM

This set of changes has an advantage in that it has better test coverage.
However, I think that taking over someone's work without their consent is discouraging. We should be leaning towards polishing the original review.

Thank you for sharing your perspective. I understand your concern about taking over someone's work without their consent.
Though, it could be argued that the reporter posted a patch first, some might consider that others were just late in doing so, so technically a rework, not just "taking over" (unfortunately not independent due to chronological order).
In this particular case, the original patch had unrelated changes and test coverage issue after some review comments, which made it difficult to address the problem effectively.
As a community, I believe our primary goal is to ensure the quality and functionality of the codebase.
If people are familiar with me, they would know that I have invested significant effort in assisting others to improve their test writing. In many cases, it would save me a substantial amount of time, but I did not "take over".

However, I believe there is a cut-off point, and it would be more beneficial to demonstrate how to fix the issue for a few instances to effectively address this problem.
The patch I submitted was aimed at solving the specific issue at hand. I started from scratch, but I am happy to add Abel Bernabeu as a co-author.

For those who are not familiar with my background, I have been contributing to MC regularly since 2018.

This patch will address 5 out of the remaining 13 use cases. I believe it effectively tackles the most noticeable issues, but it would be beneficial to address the remaining 8 as well. However, I'm not in favor of a simple search/replace approach without code coverage (so I'd like to be removed as a co-author in D153008).

In D153204#4431251, @abel-bernabeu wrote:

In D153204#4430679, @craig.topper wrote:

In D153204#4430646, @abel-bernabeu wrote:

Use "LI" rather than "VSETIVLI". It really does not matter what instruction you use for testing the handling of comments. Think of those backporting the patch to very old versions of LLVM.

One of the changes is in the vsetivli specific parsing code.

The change proposed by Sergei Barannikov was to just search and replace getLexer().Lex() for getParser.Lex() or simply Lex(), which is exactly what I did on the original merge request. This solution was preferred over surgically fixing just the occurrences that were deemed needed for most common cases.

I think this was a suggestion to help identify the problem, but not that the proposed patch should do this.

This patch, replaces just 5 out of 23 occurrences which are needed for the vsetivli instruction to get interleaved comments working. I do not see how this merge request is a better alternative to https://reviews.llvm.org/D153008 or why having the functionality only working for vsetivli is in any way better than just applying a clean search and replace for fixing every instruction.

I landed a couple commits to fix patterns like if (getLexer().is(...)) { getLexer().Lex(); to use if (parseOptionalToken(AsmToken::Minus)). So the number of occurrences has decreased.

There are reviewers missing that gave useful feedback for the older merge request.

I took the list that usually reviews the changes. I left a comment on the other Differential, so those who want to review can come here with that notification.

Sergei Barannikov will not be mentioned on the git history, when we are just applying, testing and committing his generic solution for fixing the problem of comments interleaving on virtually any backend.

Ultimately the code owners can fix in whatever way they feel is better, and as a newcomer I would stick to whatever is agreed here, but my personal vote is to reject this merge request and unblock the original one.

I am happy to add a thanks-to line, but this is probably not a situation that we use Co-authored-by:.

In D153204#4430646, @abel-bernabeu wrote:

Use "LI" rather than "VSETIVLI". It really does not matter what instruction you use for testing the handling of comments. Think of those backporting the patch to very old versions of LLVM.

Answered. The special parsing of VSETIVLI needs a test, not LI.

Make an independent file for this test. When the test fails because of the handling of comments the test name would point very straight to the issue.

Not sure about this. See https://maskray.me/blog/2021-08-08-toolchain-testing "I don't know an existing test can be enhanced"

Check that the comments are placed into the assembler rather than ignored.

Not sure about this. See my blog post about "The test checks at the wrong layer".

How to preserve comments can be tested elsewhere, e.g. test/MC/AsmParser/preserve-comments*, test/MC/AsmParser/inline-comments.ll.
These tests may need improvement, but that is a separate issue.

We spent a bit of time on the review of the original patch proposal for figuring out those things already. Cherry-pick my test at least :-D

https://reviews.llvm.org/D153008

If the important thing is to not miss "It has been reported by one of Esperanto's customers". I can mention this as well.

In D153204#4436782, @MaskRay wrote:

We spent a bit of time on the review of the original patch proposal for figuring out those things already. Cherry-pick my test at least :-D

https://reviews.llvm.org/D153008

If the important thing is to not miss "It has been reported by one of Esperanto's customers". I can mention this as well.

Note that I honestly thought "Reported by https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353" was sufficient.
Also, regarding the verboseness of the patch. I think this patch strikes a good balance, the other one has too many paragraphs.

@MaskRay
I don't doubt your experience, nor your will to help others, and I certainly didn't mean to offend. My point was that creating and insisting on an alternative review does not encourage newcomers to contribute.
I think it would be better if we helped Abel to get his review right instead. It would also save everyone's time arguing about which review is better and relieve the reviewers from having to judge.

@abel-bernabeu

The change proposed by Sergei Barannikov was to just search and replace getLexer().Lex() for getParser.Lex() or simply Lex(), which is exactly what I did on the original merge request. This solution was preferred over surgically fixing just the occurrences that were deemed needed for most common cases.

I think this was a suggestion to help identify the problem, but not that the proposed patch should do this.

It was a suggestion to help identify the problem, I don't expect any credits for this.

Also, regarding the verboseness of the patch. I think this patch strikes a good balance, the other one has too many paragraphs.

I also agree here. The description could be much less verbose.

@MaskRay

I recognize your contribution to the LLVM project and thank you for that. Any employer will be lucky to hire you.

But at the same time I also need to ensure that, when I spend time sponsored by my employer on fixing something, the work email address appears on the commit message in a way that can be parsed by automated tools. A link to a Discourse thread does not track the credit back to my employer the way the author git field or the "Co-Authored-By: " tag do.

In D153204#4437044, @barannikov88 wrote:

@MaskRay
I don't doubt your experience, nor your will to help others, and I certainly didn't mean to offend. My point was that creating and insisting on an alternative review does not encourage newcomers to contribute.
I think it would be better if we helped Abel to get his review right instead. It would also save everyone's time arguing about which review is better and relieve the reviewers from having to judge.

Sorry, but I'll disagree here. As I mentioned,

However, I believe there is a cut-off point, and it would be more beneficial to demonstrate how to fix the issue for a few instances to effectively address this problem.

The patch I submitted was aimed at solving the specific issue at hand.

We could spend more energy on suggesting the exact fix, but then that's just like the reviewers write the entire patch. (Actually I sometimes did this for others, but I don't always.)

Also, there is still an opportunity for @abel-bernabeu to get a credit for a commit

This patch will address 5 out of the remaining 13 use cases. I believe it effectively tackles the most noticeable issues, but it would be beneficial to address the remaining 8 as well.

Revision Contents

Path

Size

llvm/

lib/

Target/

RISCV/

AsmParser/

RISCVAsmParser.cpp

12 lines

test/

MC/

RISCV/

rvi-pseudos.s

10 lines

rvv/

vsetvl.s

2 lines

Diff 532415

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp

Show First 20 Lines • Show All 1,611 Lines • ▼ Show 20 Lines	if (!RegNo) {
if (HadParens)		if (HadParens)
getLexer().UnLex(LParen);		getLexer().UnLex(LParen);
return MatchOperand_NoMatch;		return MatchOperand_NoMatch;
}		}
if (HadParens)		if (HadParens)
Operands.push_back(RISCVOperand::createToken("(", FirstS));		Operands.push_back(RISCVOperand::createToken("(", FirstS));
SMLoc S = getLoc();		SMLoc S = getLoc();
SMLoc E = SMLoc::getFromPointer(S.getPointer() + Name.size());		SMLoc E = SMLoc::getFromPointer(S.getPointer() + Name.size());
getLexer().Lex();		Lex();
Operands.push_back(RISCVOperand::createReg(RegNo, S, E));		Operands.push_back(RISCVOperand::createReg(RegNo, S, E));
}		}

if (HadParens) {		if (HadParens) {
getParser().Lex(); // Eat ')'		getParser().Lex(); // Eat ')'
Operands.push_back(RISCVOperand::createToken(")", getLoc()));		Operands.push_back(RISCVOperand::createToken(")", getLoc()));
}		}

▲ Show 20 Lines • Show All 344 Lines • ▼ Show 20 Lines	OperandMatchResultTy RISCVAsmParser::parseBareSymbol(OperandVector &Operands) {

MCBinaryExpr::Opcode Opcode;		MCBinaryExpr::Opcode Opcode;
switch (getLexer().getKind()) {		switch (getLexer().getKind()) {
default:		default:
Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));		Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));
return MatchOperand_Success;		return MatchOperand_Success;
case AsmToken::Plus:		case AsmToken::Plus:
Opcode = MCBinaryExpr::Add;		Opcode = MCBinaryExpr::Add;
getLexer().Lex();		Lex();
break;		break;
case AsmToken::Minus:		case AsmToken::Minus:
Opcode = MCBinaryExpr::Sub;		Opcode = MCBinaryExpr::Sub;
getLexer().Lex();		Lex();
break;		break;
}		}

const MCExpr *Expr;		const MCExpr *Expr;
if (getParser().parseExpression(Expr, E))		if (getParser().parseExpression(Expr, E))
return MatchOperand_ParseFail;		return MatchOperand_ParseFail;
Res = MCBinaryExpr::create(Opcode, Res, Expr, getContext());		Res = MCBinaryExpr::create(Opcode, Res, Expr, getContext());
Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));		Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));
▲ Show 20 Lines • Show All 131 Lines • ▼ Show 20 Lines	OperandMatchResultTy RISCVAsmParser::parseVTypeI(OperandVector &Operands) {
if (getLexer().isNot(AsmToken::Identifier))		if (getLexer().isNot(AsmToken::Identifier))
return MatchOperand_NoMatch;		return MatchOperand_NoMatch;

StringRef Identifier = getTok().getIdentifier();		StringRef Identifier = getTok().getIdentifier();

if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,		if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,
MaskAgnostic))		MaskAgnostic))
return MatchOperand_NoMatch;		return MatchOperand_NoMatch;
		Lex();
getLexer().Lex();

while (parseOptionalToken(AsmToken::Comma)) {		while (parseOptionalToken(AsmToken::Comma)) {
if (getLexer().isNot(AsmToken::Identifier))		if (getLexer().isNot(AsmToken::Identifier))
break;		break;

Identifier = getTok().getIdentifier();		Identifier = getTok().getIdentifier();

if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,		if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,
MaskAgnostic))		MaskAgnostic))
break;		break;
		Lex();
getLexer().Lex();
}		}

if (getLexer().is(AsmToken::EndOfStatement) && State == VTypeState_Done) {		if (getLexer().is(AsmToken::EndOfStatement) && State == VTypeState_Done) {
RISCVII::VLMUL VLMUL = RISCVVType::encodeLMUL(Lmul, Fractional);		RISCVII::VLMUL VLMUL = RISCVVType::encodeLMUL(Lmul, Fractional);

unsigned VTypeI =		unsigned VTypeI =
RISCVVType::encodeVTYPE(VLMUL, Sew, TailAgnostic, MaskAgnostic);		RISCVVType::encodeVTYPE(VLMUL, Sew, TailAgnostic, MaskAgnostic);
Operands.push_back(RISCVOperand::createVType(VTypeI, S));		Operands.push_back(RISCVOperand::createVType(VTypeI, S));
▲ Show 20 Lines • Show All 1,412 Lines • Show Last 20 Lines

llvm/test/MC/RISCV/rvi-pseudos.s

	Show First 20 Lines • Show All 223 Lines • ▼ Show 20 Lines
	# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi36)			# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi36)
	lla a5, (0xFF + a_symbol) - 4			lla a5, (0xFF + a_symbol) - 4

	## Check that we don't double-parse a top-level minus.			## Check that we don't double-parse a top-level minus.
	# CHECK: .Lpcrel_hi37:			# CHECK: .Lpcrel_hi37:
	# CHECK: auipc a5, %pcrel_hi(a_symbol-4)			# CHECK: auipc a5, %pcrel_hi(a_symbol-4)
	# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi37)			# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi37)
	lla a5, a_symbol - 4			lla a5, a_symbol - 4

				# CHECK: .Lpcrel_hi38:
				# CHECK: auipc a5, %pcrel_hi(a_symbol-5)
				# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi38)
				//lla //a5//, //a_symbol// - //5/**/

				# CHECK: .Lpcrel_hi39:
				# CHECK: auipc a5, %pcrel_hi(a_symbol+5)
				# CHECK: addi a5, a5, %pcrel_lo(.Lpcrel_hi39)
				//lla //a5//, //a_symbol// + //5/**/

llvm/test/MC/RISCV/rvv/vsetvl.s

	Show First 20 Lines • Show All 148 Lines • ▼ Show 20 Lines
	# CHECK-UNKNOWN: 57 76 00 cd <unknown>			# CHECK-UNKNOWN: 57 76 00 cd <unknown>

	vsetivli a2, 15, e32, m1, ta, ma			vsetivli a2, 15, e32, m1, ta, ma
	# CHECK-INST: vsetivli a2, 15, e32, m1, ta, ma			# CHECK-INST: vsetivli a2, 15, e32, m1, ta, ma
	# CHECK-ENCODING: [0x57,0xf6,0x07,0xcd]			# CHECK-ENCODING: [0x57,0xf6,0x07,0xcd]
	# CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for Application Processors), 'Zve32x' or 'Zve64x' (Vector Extensions for Embedded Processors){{$}}			# CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for Application Processors), 'Zve32x' or 'Zve64x' (Vector Extensions for Embedded Processors){{$}}
	# CHECK-UNKNOWN: 57 f6 07 cd <unknown>			# CHECK-UNKNOWN: 57 f6 07 cd <unknown>

	vsetivli a2, 31, e32, m1, ta, ma			//vsetivli //a2//, //31//, //e32//, //m1//, //ta//, //ma/**/
	# CHECK-INST: vsetivli a2, 31, e32, m1, ta, ma			# CHECK-INST: vsetivli a2, 31, e32, m1, ta, ma
	# CHECK-ENCODING: [0x57,0xf6,0x0f,0xcd]			# CHECK-ENCODING: [0x57,0xf6,0x0f,0xcd]
	# CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for Application Processors), 'Zve32x' or 'Zve64x' (Vector Extensions for Embedded Processors){{$}}			# CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for Application Processors), 'Zve32x' or 'Zve64x' (Vector Extensions for Embedded Processors){{$}}
	# CHECK-UNKNOWN: 57 f6 0f cd <unknown>			# CHECK-UNKNOWN: 57 f6 0f cd <unknown>