This is an archive of the discontinued LLVM Phabricator instance.

Differential D153008

[RISCV] Allow slash-star comments in instruction operands
Needs Review · Public

Authored by abel-bernabeu on Jun 15 2023, 3:45 AM.

Details

Reviewers

asb
rogfer01
DavidSpickett
jrtc27
craig.topper
• s-barannikov
barannikov88
abel-bernabeu

Summary

The following commenting style was unsupported, leading to compilation
errors:

```
unsigned long int dst;
__asm__ __volatile__(
  "li /* this was fine */ %[dst], /* this was NOT fine */ 0x1234\n"
  "add zero, %[dst], %[dst]\n"
  : [ dst ] "=r"(dst)
  :
  :);
```

A code review of the top level parser (AsmParser class) showed that
it was the backend's responsibility to handle the comments, but
RISC-V's backend did not handle the comments in any way.

Beyond the obvious solution of explicitly handling the comments within
the RISC-V backend, another, easier-to-maintain solution was suggested by
Sergei Barannikov in a Discourse discussion thread:

https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/8

In summary, all backends, including the RISC-V's, should switch
from getLexer().Lex() to getParser().Lex() in their ParseInstruction
implementation. Here we just do the RISC-V work.

We would also like to mention David Spickett from Arm's community for pointing
out where to start looking within the LLVM code base.

Co-Authored-By: Sergei Barannikov <barannikov88@gmail.com>
Co-Authored-By: Fangrui Song <i@maskray.me>

Differential Revision: https://reviews.llvm.org/D153008
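
For readers unfamiliar with the two Lex() entry points, here is a minimal, self-contained sketch of the behaviour the summary relies on: a raw lexer hands back comment tokens, while a parser-level wrapper steps over them. ToyLexer, ToyParser, Token and exampleTokens are hypothetical names written for this illustration; they are not LLVM's MCAsmLexer or MCAsmParser, and the real classes do more than this.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not LLVM's classes.
enum class TokKind { Identifier, Comma, Integer, Comment, EndOfStatement };

struct Token {
  TokKind Kind;
  std::string Text;
};

// Raw lexer: returns every token, comments included.
class ToyLexer {
  std::vector<Token> Toks;
  std::size_t Pos = 0;

public:
  explicit ToyLexer(std::vector<Token> T) : Toks(std::move(T)) {
    // Always terminate the stream so that skipping comments cannot run away.
    Toks.push_back({TokKind::EndOfStatement, ""});
  }
  const Token &getTok() const { return Toks[Pos]; }
  const Token &Lex() {
    if (Pos + 1 < Toks.size())
      ++Pos;
    return Toks[Pos];
  }
};

// Parser-level wrapper: advances past comments so callers never see them.
class ToyParser {
  ToyLexer &Lexer;

public:
  explicit ToyParser(ToyLexer &L) : Lexer(L) {}
  const Token &Lex() {
    const Token *Tok = &Lexer.Lex();
    while (Tok->Kind == TokKind::Comment)
      Tok = &Lexer.Lex();
    return *Tok;
  }
};

// Operand tokens for: a0, /* note */ 0x1234
inline std::vector<Token> exampleTokens() {
  return {{TokKind::Identifier, "a0"},
          {TokKind::Comma, ","},
          {TokKind::Comment, "/* note */"},
          {TokKind::Integer, "0x1234"}};
}
```

With the raw lexer, a backend that expects an integer after the comma lands on the comment token instead and has to deal with it itself. With the wrapper, the same two calls yield the comma and then the integer, which is the effect the switch to getParser().Lex() is after.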

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

abel-bernabeu created this revision. Jun 15 2023, 3:45 AM

Herald added a project: Restricted Project. Jun 15 2023, 3:45 AM

Herald added subscribers: jobnoorman, luke, VincentWu and 29 others.

abel-bernabeu requested review of this revision. Jun 15 2023, 3:45 AM

Herald added projects: Restricted Project, Restricted Project. Jun 15 2023, 3:45 AM

Herald added subscribers: llvm-commits, cfe-commits, • pcwang-thead and 2 others.

abel-bernabeu edited the summary of this revision. Jun 15 2023, 3:48 AM

abel-bernabeu added a reviewer: asb. Jun 15 2023, 3:53 AM

abel-bernabeu edited the summary of this revision.

abel-bernabeu edited the summary of this revision. Jun 15 2023, 3:57 AM

abel-bernabeu added a reviewer: rogfer01.

Is the comment here relevant? https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/8 Does this patch do that already?

Also, is it a problem that the ignored comments are not seen in the output? Perhaps you are just not checking for those bits. Comments at the end of a line are propagated to the assembly output; I know that much.

For the customer who reported the problem, the comments are in the input source (doing their job explaining what the operands are).

Now, a comment not being seen when compiling with "-S" is less of a problem than the compilation failing. I did not see an obvious way to pass those comments to the AsmParser instance.

My understanding is that the ParseInstruction interface needs to be extended for the comments to be collected and passed to the top level parser (similarly to what is done with the operands).

Harbormaster completed remote builds in B239074: Diff 531684. Jun 15 2023, 5:06 AM

Hi @abel-bernabeu, I did a quick experiment replacing all the calls to getLexer().Lex() with getParser().Lex() and now your second testcase is accepted and looks like this in the output

f2:
	addi	sp, sp, -32
	sd	ra, 24(sp)
	sd	s0, 16(sp)
	addi	s0, sp, 32
	#APP
	lui	a0, 1	# this is fine 	# this should also be fine 
	addiw	a0, a0, 564
	add	zero, a0, a0

	#NO_APP

I understand this is the best we can do within assembly syntax.

I'm not suggesting we should change all of them. There are ~20 occurrences of getLexer().Lex() so I think it should not be too onerous to review each one.

What do you think?
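
To make the shape of that output concrete, here is a small sketch of the idea of deferring interleaved comments to the end of the statement, as seen in the `lui` line above. It is only an illustration under stated assumptions: Tok and renderStatement are hypothetical names, the comment text is assumed to be already stripped of its `/* */` delimiters, and none of this is how the LLVM streamer is actually implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical token type for illustration; not LLVM's AsmToken.
struct Tok {
  bool IsComment;
  std::string Text;
};

// Joins the non-comment tokens of one statement, then appends every comment
// after them in the order seen, each introduced by '#'.
std::string renderStatement(const std::vector<Tok> &Toks) {
  std::string Code;
  std::string Deferred;
  for (const Tok &T : Toks) {
    if (T.IsComment) {
      Deferred += "\t# " + T.Text;
      continue;
    }
    // Separate tokens with a space, except before a comma.
    if (!Code.empty() && T.Text != ",")
      Code += ' ';
    Code += T.Text;
  }
  return Code + Deferred;
}
```

Feeding it the tokens of `li /* this is fine */ a0, /* this should also be fine */ 0x1234` gives the instruction text first and both comments trailing it, which mirrors the order in the output quoted above.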

[RISCV] Allow slash-star comments in instruction operands

It has been reported by one of Esperanto's customers that slash-star
comments ("/*") within inline assembly were only allowed before the
first instruction operand or at the end of the lines. Those comments
were, however, not allowed when interleaved within the operands.

An example follows:

unsigned long int dst;
__asm__ __volatile__(
  "li /* this was fine */ %[dst], /* this was NOT fine */ 0x1234\n"
  "add zero, %[dst], %[dst]\n"
  : [ dst ] "=r"(dst)
  :
  :);

A code review of the top level parser (AsmParser class) showed that
when comments were placed before the instruction operand or at end of
a line, then they were gracefully handled irrespective of the backend.
When the comments were interleaved within the instruction operands it
was the backend's responsibility to handle the comments.

RISC-V's backend did not handle the comments in any way.

Beyond the obvious solution of explicitly handling the comments within
the RISC-V backend, another, easier to maintain solution was suggested
by Sergei Barannikov in a Discourse discussion thread:

https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/8

In summary, all backends, including the RISC-V's, should switch
from getLexer().Lex() to getParser().Lex() in their ParseInstruction
implementation.

The getLexer().Lex() approach relies on the user to explicitly handle
the comments, whereas the suggested getParser().Lex() alternative already
handles the comments in the same way as done for non-target-specific
assembly directives.

Here we just do the RISC-V work. Other backends should also do their own
review.

In addition to Sergei Barannikov, I would also like to thank David Spickett
from Arm's community for pointing out where to start looking within the
LLVM code base, and also the patch reviewers.

abel-bernabeu edited the summary of this revision. Jun 15 2023, 7:00 AM

abel-bernabeu edited the summary of this revision. Jun 15 2023, 7:01 AM

I was a bit reluctant to start a discussion on migrating getLexer().Lex() to getParser().Lex(), but I guess it makes sense to do it now rather than deferring.

It is cleaner now, I agree.

[RISCV] Allow slash-star comments in instruction operands

It has been reported by one of Esperanto's customers that slash-star
comments ("/*") within inline assembly were only allowed before the
first instruction operand or at the end of the lines. Those comments
were, however, not allowed when interleaved within the operands.

An example follows:

```
unsigned long int dst;
__asm__ __volatile__(
  "li /* this was fine */ %[dst], /* this was NOT fine */ 0x1234\n"
  "add zero, %[dst], %[dst]\n"
  : [ dst ] "=r"(dst)
  :
  :);
```

A code review of the top level parser (AsmParser class) showed that
when comments were placed before the instruction operand or at end of
a line, then they were gracefully handled irrespective of the backend.
When the comments were interleaved within the instruction operands it
was the backend's responsibility to handle the comments.

RISC-V's backend did not handle the comments in any way.

Beyond the obvious solution of explicitly handling the comments within
the RISC-V backend, another, easier to maintain solution was suggested
by Sergei Barannikov in a Discourse discussion thread:

https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/8

In summary, all backends, including the RISC-V's, should switch
from getLexer().Lex() to getParser().Lex() in their ParseInstruction
implementation.

The getLexer().Lex() approach relies on the user to explicitly handle
the comments, whereas the suggested getParser().Lex() alternative already
handles the comments in the same way as done for non-target-specific
assembly directives.

Here we just do the RISC-V work. Other backends should also do their own
review.

In addition to Sergei Barannikov, I would also like to thank David Spickett
from Arm's community for pointing out where to start looking within the
LLVM code base, and also the patch reviewers.

abel-bernabeu edited the summary of this revision. Jun 15 2023, 7:09 AM

[RISCV] Allow slash-star comments in instruction operands

An example follows:

unsigned long int dst;
__asm__ __volatile__(
  "li /* this was fine */ %[dst], /* this was NOT fine */ 0x1234\n"
  "add zero, %[dst], %[dst]\n"
  : [ dst ] "=r"(dst)
  :
  :);

RISC-V's backend did not handle the comments in any way.

Beyond the obvious solution of explicitly handling the comments within
the RISC-V backend, another, easier-to-maintain solution was suggested by
Sergei Barannikov in a Discourse discussion thread:

https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/8

In summary, all backends, including the RISC-V's, should switch
from getLexer().Lex() to getParser().Lex() in their ParseInstruction
implementation.

Here we just do the RISC-V work. Other backends should also do their own
review.

In addition to Sergei Barannikov, I would also like to thank David Spickett
from Arm's community for pointing out where to start looking within the
LLVM code base, and also the patch reviewers.

abel-bernabeu edited the summary of this revision. Jun 15 2023, 7:14 AM

DavidSpickett added inline comments. Jun 15 2023, 7:18 AM

clang/test/CodeGen/RISCV/riscv-inline-asm-gcc-commenting.c
Line 23 (on Diff #531746): Check for the comment content here too? `# this is fine # this should also be fine`

Does the test have to be a C source? Why not just plain asm (fed into llvm-mc)?

PS
Please don't repeat the commit message on each arc diff, it is noisy. Just write what has been changed since the last revision, e.g. "Address review comments" or "Apply clang-format".

Herald added a subscriber: wangpc. Jun 15 2023, 7:43 AM

Clang tests should not compile to asm. You want an IR test.

This revision now requires changes to proceed. Jun 15 2023, 7:51 AM

Harbormaster completed remote builds in B239125: Diff 531746. Jun 15 2023, 9:15 AM

In D153008#4424821, @jrtc27 wrote:

Clang tests should not compile to asm. You want an IR test.

Jessica, are there any exceptions for tests under CodeGen/RISCV that are intended to exercise the assembly parser?

I have just written a test that reproduces the way I manually test the feature.

In D153008#4425238, @abel-bernabeu wrote:

In D153008#4424821, @jrtc27 wrote:

Clang tests should not compile to asm. You want an IR test.

Jessica, are there any exceptions for tests under CodeGen/RISCV that are intended to exercise the assembly parser?

I have just written a test that reproduces the way I manually test the feature.

No. You can test the assembly parser just as easily from IR.

I'll also note that your assembly isn't particularly minimal, which it should be, unless the add zero, %[dst], %[dst] lines are doing something I'm not aware of?

Thanks everyone, taking note of all the comments for improving the test:

Simplify the test (can be one instruction, no problem).
Check at IR level
Check for the comments being placed in the output
Do "arc diff" with one-line descriptions

Will update before tomorrow at mid-day (CET).

You may add a test file beside llvm/test/CodeGen/RISCV/inline-asm*.
Perhaps llvm/test/MC/RISCV/ is a better choice.

I hope that this patch focuses on getLexer().Lex() instances that actually cause a problem.

Many getLexer().Lex() instances followed by is(...) form a pattern that can be rewritten in a better way. I'll check these instances and fix them.
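
As a rough illustration of the kind of rewrite meant here, the sketch below folds "check the current token, then eat it" into one helper, in the spirit of the parseOptionalToken calls already used in RISCVAsmParser.cpp. ToyTokenStream, Kind and countListItems are hypothetical names invented for this example, and the helper is a simplified stand-in rather than LLVM's implementation.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical token kinds for illustration; not LLVM's AsmToken kinds.
enum class Kind { Identifier, Comma, Minus, End };

class ToyTokenStream {
  std::vector<Kind> Toks;
  std::size_t Pos = 0;

public:
  explicit ToyTokenStream(std::vector<Kind> T) : Toks(std::move(T)) {
    Toks.push_back(Kind::End); // sentinel, so peek() is always valid
  }
  Kind peek() const { return Toks[Pos]; }
  void lex() {
    if (Pos + 1 < Toks.size())
      ++Pos;
  }

  // Check and consume in one place: returns true and advances only when the
  // current token has the requested kind.
  bool parseOptionalToken(Kind K) {
    if (peek() != K)
      return false;
    lex();
    return true;
  }
};

// Counts the identifiers in a comma-separated list such as "ra, s0, s1".
std::size_t countListItems(ToyTokenStream &S) {
  std::size_t N = 0;
  do {
    if (!S.parseOptionalToken(Kind::Identifier))
      break;
    ++N;
  } while (S.parseOptionalToken(Kind::Comma));
  return N;
}
```

Because the token is eaten in the same function that tests for it, there is a single place where the advance happens, which is also the single place that has to get comment handling right.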

Simplified the test. I also check that the comments are carried over to the assembly output.

This new update still applies many unneeded getParser().Lex(); and adds a test at a wrong layer (clang/test/CodeGen):

In D153008#4425244, @jrtc27 wrote:

In D153008#4425238, @abel-bernabeu wrote:

In D153008#4424821, @jrtc27 wrote:

Clang tests should not compile to asm. You want an IR test.

Jessica, are there any exceptions for tests under CodeGen/RISCV that are intended to exercise the assembly parser?

I have just written a test that reproduces the way I manually test the feature.

No. You can test the assembly parser just as easily from IR.

I'll also note that your assembly isn't particularly minimal, which it should be, unless the add zero, %[dst], %[dst] lines are doing something I'm not aware of?

Jessica, the LLVM IR form contains the inline assembly still unparsed. Look, this is what I get:

abel@Docker:~/work/llvm-project/build$ ./bin/clang -fPIC --target=riscv64-unknown-elf   -o - -S  -emit-llvm  ../clang/test/CodeGen/RISCV/riscv-inline-asm-gcc-commenting.c 
; ModuleID = '../clang/test/CodeGen/RISCV/riscv-inline-asm-gcc-commenting.c'
source_filename = "../clang/test/CodeGen/RISCV/riscv-inline-asm-gcc-commenting.c"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-unknown-elf"

; Function Attrs: noinline nounwind optnone
define i64 @f() #0 {
  %1 = alloca i64, align 8
  %2 = call i64 asm "li /* this is fine */ $0 , /* this is also fine */ 0 /* and last but not least */\0A", "=r"() #1, !srcloc !6
  store i64 %2, ptr %1, align 8
  %3 = load i64, ptr %1, align 8
  ret i64 %3
}

attributes #0 = { noinline nounwind optnone "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic-rv64" "target-features"="+64bit,+a,+c,+m,+relax,-d,-e,-experimental-smaia,-experimental-ssaia,-experimental-zca,-experimental-zcb,-experimental-zcd,-experimental-zcf,-experimental-zcmp,-experimental-zcmt,-experimental-zfa,-experimental-zfbfmin,-experimental-zicond,-experimental-zihintntl,-experimental-ztso,-experimental-zvbb,-experimental-zvbc,-experimental-zvfbfmin,-experimental-zvfbfwma,-experimental-zvfh,-experimental-zvkg,-experimental-zvkn,-experimental-zvknc,-experimental-zvkned,-experimental-zvkng,-experimental-zvknha,-experimental-zvknhb,-experimental-zvks,-experimental-zvksc,-experimental-zvksed,-experimental-zvksg,-experimental-zvksh,-experimental-zvkt,-f,-h,-save-restore,-svinval,-svnapot,-svpbmt,-v,-xsfvcp,-xtheadba,-xtheadbb,-xtheadbs,-xtheadcmo,-xtheadcondmov,-xtheadfmemidx,-xtheadmac,-xtheadmemidx,-xtheadmempair,-xtheadsync,-xtheadvdot,-xventanacondops,-zawrs,-zba,-zbb,-zbc,-zbkb,-zbkc,-zbkx,-zbs,-zdinx,-zfh,-zfhmin,-zfinx,-zhinx,-zhinxmin,-zicbom,-zicbop,-zicboz,-zicntr,-zicsr,-zifencei,-zihintpause,-zihpm,-zk,-zkn,-zknd,-zkne,-zknh,-zkr,-zks,-zksed,-zksh,-zkt,-zmmul,-zve32f,-zve32x,-zve64d,-zve64f,-zve64x,-zvl1024b,-zvl128b,-zvl16384b,-zvl2048b,-zvl256b,-zvl32768b,-zvl32b,-zvl4096b,-zvl512b,-zvl64b,-zvl65536b,-zvl8192b" }
attributes #1 = { nounwind memory(none) }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 1, !"target-abi", !"lp64"}
!2 = !{i32 8, !"PIC Level", i32 2}
!3 = !{i32 7, !"frame-pointer", i32 2}
!4 = !{i32 8, !"SmallDataLimit", i32 0}
!5 = !{!"clang version 17.0.0 (git@github.com:llvm/llvm-project.git 61bab164d4c3b15ba13ddd53de7bdeb6b8c9de30)"}
!6 = !{i64 139}

Only the substitution of "%[dst]" for "$0" is observable on the LLVM IR.

Am I misinterpreting your suggestion? Thanks in advance for your clarification.

In D153008#4428568, @MaskRay wrote:

This new update still applies many unneeded getParser().Lex(); and adds a test at a wrong layer (clang/test/CodeGen):

Your comment in here suggests a full review of the parser to make the code consistent with the principle that a token should be eaten in the same function that checks for a specific pattern.

https://discourse.llvm.org/t/interleaving-several-c-style-comments-in-the-same-inline-assembly-line/71353/10

I like the suggestion, just a couple of questions:

is it something that you would like to try yourself?
do you want me to create a refactoring ticket for myself? I am not sure when I will have enough availability for a major refactor, but I can try to allocate time at work for the task.

Let me know your preference.

In D153008#4428568, @MaskRay wrote:

This new update still applies many unneeded getParser().Lex(); and adds a test at a wrong layer (clang/test/CodeGen):

Would you rather have:

LLVM IR as input
assembly as output
the test placed under llvm/test/CodeGen/RISCV/

That would make the test more verbose, but it would reduce the testing scope. Please confirm your preference.

Harbormaster completed remote builds in B239446: Diff 532185. Jun 16 2023, 10:23 AM

The test has been moved to llvm/test/CodeGen/RISCV

Also has been reworked for using LLVM IR as input, so clang is not needed in the loop.

In D153008#4424821, @jrtc27 wrote:

Clang tests should not compile to asm. You want an IR test.

My bad. You suggested IR as input (rather than C), and that makes total sense. The latest update captures that suggestion.

Harbormaster completed remote builds in B239505: Diff 532268. Jun 16 2023, 2:47 PM

In D153008#4428694, @abel-bernabeu wrote:

In D153008#4428568, @MaskRay wrote:

This new update still applies many unneeded getParser().Lex(); and adds a test at a wrong layer (clang/test/CodeGen):

Would you rather have:

LLVM IR as input

assembly as output

the test placed under llvm/test/CodeGen/RISCV/

?

That would make the test more verbose, but it would reduce the testing scope. Please confirm your preference.

I created D153204 as an alternative. Thanks for reporting the issue!

abel-bernabeu mentioned this in D153204: RISCVAsmParser: support comments in more places. Jun 17 2023, 5:09 PM

In D153008#4430491, @MaskRay wrote:

In D153008#4428694, @abel-bernabeu wrote:

In D153008#4428568, @MaskRay wrote:

This new update still applies many unneeded getParser().Lex(); and adds a test at a wrong layer (clang/test/CodeGen):

Would you rather have:

LLVM IR as input

assembly as output

the test placed under llvm/test/CodeGen/RISCV/

?

That would make the test more verbose, but it would reduce the testing scope. Please confirm your preference.

I created D153204 as an alternative. Thanks for reporting the issue!

Thanks for reviewing. Added my comments on your patch proposal.

abel-bernabeu added a reviewer: craig.topper. Jun 18 2023, 12:28 PM

abel-bernabeu added reviewers: • s-barannikov, barannikov88. Jun 18 2023, 12:52 PM

abel-bernabeu edited the summary of this revision. Jun 18 2023, 3:06 PM

Updated commit message:

fixed a typo
added a co-author

abel-bernabeu edited the summary of this revision. Jun 18 2023, 3:16 PM

Fixed a typo on commit message

abel-bernabeu edited the summary of this revision. Jun 18 2023, 3:19 PM

Harbormaster completed remote builds in B239682: Diff 532489. Jun 18 2023, 4:43 PM

I'd prefer Lex() over equivalent getParser().Lex() because it is shorter.
Ideally, every change should be tested.
The tests should be minimal. That is, they should be assembly files passed to llvm-mc rather than ll files passed to llc.

I'm not very familiar with RISC-V, so I'll leave further review to code owners.

Rebased on top of the latest changes in RISCVAsmParser.cpp

Moved the testing to llvm-mc test cases under llvm/test/MC/RISCV/

Covered with tests every single case where the parser consumes a
token and a potential comment needs to be handled. I have manually
verified that the testing coverage is complete.

Reworked the code using getLexer().peekTokens for ignoring comments.

If a comment is not handled properly, it is now a bug, because the feature
is complete and the testing has full coverage.
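
The look-ahead mentioned in this update (peeking past comments to find the token after the next one) can be sketched as follows. This is a simplified stand-in and not the patch's code: Tok and the vector-based token stream are hypothetical, and the helper in the patch works on the buffer filled by getLexer().peekTokens instead of an index into a vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type for illustration; not LLVM's AsmToken.
struct Tok {
  bool IsComment;
  std::string Text;
};

// Returns the second non-comment token after position Cur, or nullptr when
// the stream ends first. Comments are only skipped while peeking; nothing is
// consumed here.
const Tok *peekNextNext(const std::vector<Tok> &Toks, std::size_t Cur) {
  std::size_t NonComments = 0;
  for (std::size_t I = Cur + 1; I < Toks.size(); ++I) {
    if (Toks[I].IsComment)
      continue;
    if (++NonComments == 2)
      return &Toks[I];
  }
  return nullptr;
}
```

For a stream like `( /* c1 */ a0 /* c2 */ )` with the current token on `(`, the result is the `)` token, which is the condition a parenthesised-register check needs even when comments sit between the tokens.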

abel-bernabeu edited the summary of this revision. Jun 19 2023, 9:04 AM

Harbormaster completed remote builds in B239830: Diff 532686. Jun 19 2023, 10:02 AM

abel-bernabeu added inline comments. Jun 19 2023, 12:27 PM

llvm/test/MC/RISCV/comments.ll
Line 26: This should be CHECK-NEXT rather than CHECK. Will update again tonight.

Added --match-full-lines to FileCheck for making the check stricter.

Removed a gratuitous blank line

Am happy with the current patch version

On the zdinx test, changed a FileCheck comment that was written starting with "//" rather than "#".

Changed to "#" just for consistency with the rest of the tests.

abel-bernabeu accepted this revision. Jun 19 2023, 2:54 PM

@jrtc27:

The tests:

are under llvm/test/MC/RISCV
use llvm-mc as assembler (rather than C!)
are as simple as they can be
cover every line touched by this patch.

Would you unblock this now? Thanks for your feedback so far.

@MaskRay:

Let me know if you miss any testing or places where comments are still not possible. Am pretty sure I am covering every case.

You have been added as coauthor here (with the standard "Co-Authored-By: " tag on the commit message) and you can close https://reviews.llvm.org/D153204 in return :)

Harbormaster completed remote builds in B239889: Diff 532758. Jun 19 2023, 3:30 PM

I just got this flashback...

For code coverage of one of the Lex() calls in ParseInstruction we needed one case
where an instruction has no operands.

We had, not one, but three. Am keeping just the "nop" case, which is the simplest
and most known instruction and moving it to the top of the test suite, so the tests
are sorted by increasing complexity (from the simplest to reach test points to
the most sophisticated and least frequently used parser features).

Harbormaster completed remote builds in B239949: Diff 532837. Jun 20 2023, 4:20 AM

craig.topper added inline comments. Jun 21 2023, 12:23 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
Line 1586: I don't think this style of comment is common in llvm.
Line 1598: Can this be a regular array instead of a SmallVector?
Line 1601: Capitalize variable names

barannikov88 added inline comments. Jun 21 2023, 1:14 AM

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
Line 1610: This is much more common.

Addressed some review comments:

Doxygen-style comments are not common, so using triple-slash instead.
Using a regular C array in a place where the usage of SmallVector is not justified.
Proper capitalization of a variable
Usage of "&&" rather than "and", to blend better with the existing code style.

abel-bernabeu signed these changes with MFA. Jun 21 2023, 2:42 AM

abel-bernabeu accepted this revision.

abel-bernabeu marked 4 inline comments as done.

Reduced the commit message verbosity by one half, without losing anything.

Even less verbosity...

abel-bernabeu edited the summary of this revision. Jun 21 2023, 2:51 AM

abel-bernabeu edited the summary of this revision.

Harbormaster completed remote builds in B240198: Diff 533194. Jun 21 2023, 3:25 AM

Revision Contents

Path	Size
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp	77 lines
llvm/test/MC/RISCV/comments-zdinx.ll	4 lines
llvm/test/MC/RISCV/comments.ll	67 lines

Diff 532754

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp

[... 101 lines not shown ...]

class RISCVAsmParser : public MCTargetAsmParser {

bool generateImmOutOfRangeError(SMLoc ErrorLoc, int64_t Lower, int64_t Upper,

const Twine &Msg);

bool MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,

OperandVector &Operands, MCStreamer &Out,

uint64_t &ErrorInfo,

bool MatchingInlineAsm) override;

+ AsmToken peekNextNext();

bool parseRegister(MCRegister &RegNo, SMLoc &StartLoc,

SMLoc &EndLoc) override;

OperandMatchResultTy tryParseRegister(MCRegister &RegNo, SMLoc &StartLoc,

SMLoc &EndLoc) override;

bool ParseInstruction(ParseInstructionInfo &Info, StringRef Name,

SMLoc NameLoc, OperandVector &Operands) override;

[... 1,454 lines not shown ...]

OperandMatchResultTy RISCVAsmParser::tryParseRegister(MCRegister &RegNo,

StartLoc = Tok.getLoc();

EndLoc = Tok.getEndLoc();

StringRef Name = getLexer().getTok().getIdentifier();

RegNo = matchRegisterNameHelper(isRVE(), Name);

if (!RegNo)

return MatchOperand_NoMatch;

- getParser().Lex(); // Eat identifier token.
+ Lex(); // Eat identifier token.

return MatchOperand_Success;

}

+ /**
+  \brief Peeks next after next token
+
+  This function looks ahead for the next after the next token, discarding
+  comments in the process. Proper handling of the discarded comments will
+  happen on the next Lex() call.
+ */
+ AsmToken RISCVAsmParser::peekNextNext() {
+   AsmToken NextNextToken;
+   size_t NonComments = 0;
+   size_t ReadCount;
+   do {
+     SmallVector<AsmToken> Buf(10);
+     ReadCount = getLexer().peekTokens(Buf);
+     for (size_t index = 0; index < ReadCount; ++index) {
+       AsmToken token = Buf[index];
+       if (token.getKind() != AsmToken::Comment) {
+         NonComments++;
+         if (NonComments == 2) {
+           NextNextToken = token;
+           break;
+         }
+       }
+     }
+   } while (NonComments < 2 and ReadCount > 0);
+   return NextNextToken;
+ }

Inline comments on the function above (all marked Done):

craig.topper, on "/**": I don't think this style of comment is common in llvm.
craig.topper, on "SmallVector<AsmToken> Buf(10);": Can this be a regular array instead of a SmallVector?
craig.topper, on "AsmToken token = Buf[index];": Capitalize variable names
barannikov88, on the loop condition: This is much more common.
- } while (NonComments < 2 and ReadCount > 0);
+ } while (NonComments < 2 && ReadCount > 0);

OperandMatchResultTy RISCVAsmParser::parseRegister(OperandVector &Operands,

bool AllowParens) {

SMLoc FirstS = getLoc();

bool HadParens = false;

AsmToken LParen;

// If this is an LParen and a parenthesised register name is allowed, parse it

// atomically.

if (AllowParens && getLexer().is(AsmToken::LParen)) {

- AsmToken Buf[2];
- size_t ReadCount = getLexer().peekTokens(Buf);
- if (ReadCount == 2 && Buf[1].getKind() == AsmToken::RParen) {
+ AsmToken NextNextToken = peekNextNext();
+ if (NextNextToken.getKind() == AsmToken::RParen) {

HadParens = true;

LParen = getParser().getTok();

- getParser().Lex(); // Eat '('
+ Lex(); // Eat '('

}

switch (getLexer().getKind()) {

default:

if (HadParens)

getLexer().UnLex(LParen);

return MatchOperand_NoMatch;

case AsmToken::Identifier:

StringRef Name = getLexer().getTok().getIdentifier();

MCRegister RegNo = matchRegisterNameHelper(isRVE(), Name);

if (!RegNo) {

if (HadParens)

getLexer().UnLex(LParen);

return MatchOperand_NoMatch;

}

if (HadParens)

Operands.push_back(RISCVOperand::createToken("(", FirstS));

SMLoc S = getLoc();

SMLoc E = SMLoc::getFromPointer(S.getPointer() + Name.size());

- getLexer().Lex();
+ Lex();

Operands.push_back(RISCVOperand::createReg(RegNo, S, E));

}

if (HadParens) {

- getParser().Lex(); // Eat ')'
+ Lex(); // Eat ')'

Operands.push_back(RISCVOperand::createToken(")", getLoc()));

}

return MatchOperand_Success;

}

OperandMatchResultTy

RISCVAsmParser::parseInsnDirectiveOpcode(OperandVector &Operands) {

[... 286 lines not shown ...]

RISCVAsmParser::parseOperandWithModifier(OperandVector &Operands) {

}

StringRef Identifier = getParser().getTok().getIdentifier();

RISCVMCExpr::VariantKind VK = RISCVMCExpr::getVariantKindForName(Identifier);

if (VK == RISCVMCExpr::VK_RISCV_Invalid) {

Error(getLoc(), "unrecognized operand modifier");

return MatchOperand_ParseFail;

}

- getParser().Lex(); // Eat the identifier
+ Lex(); // Eat the identifier

if (parseToken(AsmToken::LParen, "expected '('"))

return MatchOperand_ParseFail;

const MCExpr *SubExpr;

if (getParser().parseParenExpression(SubExpr, E)) {

return MatchOperand_ParseFail;

}

[... 36 lines not shown ...]

OperandMatchResultTy RISCVAsmParser::parseBareSymbol(OperandVector &Operands) {

MCBinaryExpr::Opcode Opcode;

switch (getLexer().getKind()) {

default:

Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));

return MatchOperand_Success;

case AsmToken::Plus:

Opcode = MCBinaryExpr::Add;

- getLexer().Lex();
+ Lex();

break;

case AsmToken::Minus:

Opcode = MCBinaryExpr::Sub;

- getLexer().Lex();
+ Lex();

break;

}

const MCExpr *Expr;

if (getParser().parseExpression(Expr, E))

return MatchOperand_ParseFail;

Res = MCBinaryExpr::create(Opcode, Res, Expr, getContext());

Operands.push_back(RISCVOperand::createImm(Res, S, E, isRV64()));

[... 132 lines not shown ...]

if (getLexer().isNot(AsmToken::Identifier))

return MatchOperand_NoMatch;

StringRef Identifier = getTok().getIdentifier();

if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,

MaskAgnostic))

return MatchOperand_NoMatch;

- getLexer().Lex();
+ Lex();

while (parseOptionalToken(AsmToken::Comma)) {

if (getLexer().isNot(AsmToken::Identifier))

break;

Identifier = getTok().getIdentifier();

if (parseVTypeToken(Identifier, State, Sew, Lmul, Fractional, TailAgnostic,

MaskAgnostic))

break;

- getLexer().Lex();
+ Lex();

}

if (getLexer().is(AsmToken::EndOfStatement) && State == VTypeState_Done) {

RISCVII::VLMUL VLMUL = RISCVVType::encodeLMUL(Lmul, Fractional);

unsigned VTypeI =

RISCVVType::encodeVTYPE(VLMUL, Sew, TailAgnostic, MaskAgnostic);

Operands.push_back(RISCVOperand::createVType(VTypeI, S));

[... 23 lines not shown ...]

OperandMatchResultTy RISCVAsmParser::parseMaskReg(OperandVector &Operands) {

MCRegister RegNo = matchRegisterNameHelper(isRVE(), Name);

if (!RegNo)

return MatchOperand_NoMatch;

if (RegNo != RISCV::V0)

return MatchOperand_NoMatch;

SMLoc S = getLoc();

SMLoc E = SMLoc::getFromPointer(S.getPointer() + Name.size());

- getLexer().Lex();
+ Lex();

Operands.push_back(RISCVOperand::createReg(RegNo, S, E));

return MatchOperand_Success;

}

OperandMatchResultTy RISCVAsmParser::parseGPRAsFPR(OperandVector &Operands) {

if (getLexer().isNot(AsmToken::Identifier))

return MatchOperand_NoMatch;

StringRef Name = getLexer().getTok().getIdentifier();

MCRegister RegNo = matchRegisterNameHelper(isRVE(), Name);

if (!RegNo)

return MatchOperand_NoMatch;

SMLoc S = getLoc();

SMLoc E = SMLoc::getFromPointer(S.getPointer() + Name.size());

- getLexer().Lex();

+ Lex();

Operands.push_back(RISCVOperand::createReg(

RegNo, S, E, !getSTI().hasFeature(RISCV::FeatureStdExtF)));

return MatchOperand_Success;

}

OperandMatchResultTy RISCVAsmParser::parseFRMArg(OperandVector &Operands) {

if (getLexer().isNot(AsmToken::Identifier)) {

TokError("operand must be a valid floating point rounding mode mnemonic");

▲ Show 20 Lines • Show All 172 Lines • ▼ Show 20 Lines

OperandMatchResultTy RISCVAsmParser::parseReglist(OperandVector &Operands) {

StringRef RegName = getLexer().getTok().getIdentifier();

MCRegister RegStart = matchRegisterNameHelper(IsEABI, RegName);

MCRegister RegEnd;

if (RegStart != RISCV::X1) {

Error(getLoc(), "register list must start from 'ra' or 'x1'");

return MatchOperand_ParseFail;

}

- getLexer().Lex();

+ Lex();

// parse case like ,s0

if (parseOptionalToken(AsmToken::Comma)) {

if (getLexer().isNot(AsmToken::Identifier)) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

StringRef RegName = getLexer().getTok().getIdentifier();

RegStart = matchRegisterNameHelper(IsEABI, RegName);

if (!RegStart) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

if (RegStart != RISCV::X8) {

Error(getLoc(), "continuous register list must start from 's0' or 'x8'");

return MatchOperand_ParseFail;

}

- getLexer().Lex(); // eat reg

+ Lex(); // eat reg

}

// parse case like -s1

if (parseOptionalToken(AsmToken::Minus)) {

StringRef EndName = getLexer().getTok().getIdentifier();

// FIXME: the register mapping and checks of EABI is wrong

RegEnd = matchRegisterNameHelper(IsEABI, EndName);

if (!RegEnd) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

if (IsEABI && RegEnd != RISCV::X9) {

Error(getLoc(), "contiguous register list of EABI can only be 's0-s1' or "

"'x8-x9' pair");

return MatchOperand_ParseFail;

}

- getLexer().Lex();

+ Lex();

}

if (!IsEABI) {

// parse extra part like ', x18[-x20]' for XRegList

if (parseOptionalToken(AsmToken::Comma)) {

if (RegEnd != RISCV::X9) {

Error(

getLoc(),

"first contiguous registers pair of register list must be 'x8-x9'");

return MatchOperand_ParseFail;

}

// parse ', x18' for extra part

if (getLexer().isNot(AsmToken::Identifier)) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

StringRef EndName = getLexer().getTok().getIdentifier();

if (MatchRegisterName(EndName) != RISCV::X18) {

Error(getLoc(), "second contiguous registers pair of register list "

"must start from 'x18'");

return MatchOperand_ParseFail;

}

- getLexer().Lex();

+ Lex();

// parse '-x20' for extra part

if (parseOptionalToken(AsmToken::Minus)) {

if (getLexer().isNot(AsmToken::Identifier)) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

EndName = getLexer().getTok().getIdentifier();

if (MatchRegisterName(EndName) == RISCV::NoRegister) {

Error(getLoc(), "invalid register");

return MatchOperand_ParseFail;

}

- getLexer().Lex();

+ Lex();

}

RegEnd = MatchRegisterName(EndName);

}

if (RegEnd == RISCV::X26) {

Error(getLoc(), "invalid register list, {ra, s0-s10} or {x1, x8-x9, "

"x18-x26} is not supported");

Show All 23 Lines

OperandMatchResultTy RISCVAsmParser::parseZcmpSpimm(OperandVector &Operands) {

int64_t StackAdjustment = getLexer().getTok().getIntVal();

unsigned Spimm = 0;

unsigned RlistVal = static_cast<RISCVOperand *>(Operands[1].get())->Rlist.Val;

bool IsEABI = isRVE();

if (!RISCVZC::getSpimm(RlistVal, Spimm, StackAdjustment, isRV64(), IsEABI))

return MatchOperand_NoMatch;

Operands.push_back(RISCVOperand::createSpimm(Spimm << 4, S));

- getLexer().Lex();

+ Lex();

return MatchOperand_Success;

}

/// Looks at a token type and creates the relevant operand from this

/// information, adding to Operands. If operand was parsed, returns false, else

/// true.

bool RISCVAsmParser::parseOperand(OperandVector &Operands, StringRef Mnemonic) {

// Check if the current operand has a custom associated parser, if so, try to

Show All 40 Lines

if (getSTI().hasFeature(RISCV::FeatureRelax)) {

}

// First operand is token for instruction

Operands.push_back(RISCVOperand::createToken(Name, NameLoc));

// If there are no more operands, then finish

if (getLexer().is(AsmToken::EndOfStatement)) {

- getParser().Lex(); // Consume the EndOfStatement.

+ Lex(); // Consume the EndOfStatement.

return false;

}

// Parse first operand

if (parseOperand(Operands, Name))

return true;

// Parse until end of statement, consuming commas between operands

▲ Show 20 Lines • Show All 148 Lines • ▼ Show 20 Lines

do {

Type = RISCVOptionArchArgType::Full;

if (Parser.getTok().isNot(AsmToken::Identifier))

return Error(Parser.getTok().getLoc(),

"unexpected token, expected identifier");

StringRef Arch = Parser.getTok().getString();

SMLoc Loc = Parser.getTok().getLoc();

- Parser.Lex();

+ Lex();

if (Type == RISCVOptionArchArgType::Full) {

std::string Result;

if (resetToArch(Arch, Loc, Result, true))

return true;

Args.emplace_back(Type, Result);

break;

▲ Show 20 Lines • Show All 129 Lines • ▼ Show 20 Lines

if (Parser.getTok().is(AsmToken::Identifier)) {

StringRef Name = Parser.getTok().getIdentifier();

std::optional<unsigned> Ret =

ELFAttrs::attrTypeFromString(Name, RISCVAttrs::getRISCVAttributeTags());

if (!Ret) {

Error(TagLoc, "attribute name not recognised: " + Name);

return false;

}

Tag = *Ret;

- Parser.Lex();

+ Lex();

} else {

const MCExpr *AttrExpr;

TagLoc = Parser.getTok().getLoc();

if (Parser.parseExpression(AttrExpr))

return true;

const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(AttrExpr);

Show All 25 Lines

if (IsIntegerValue) {

if (!CE)

return Error(ValueExprLoc, "expected numeric constant");

IntegerValue = CE->getValue();

} else {

if (Parser.getTok().isNot(AsmToken::String))

return Error(Parser.getTok().getLoc(), "expected string constant");

StringValue = Parser.getTok().getStringContents();

- Parser.Lex();

+ Lex();

}

if (Parser.parseEOL())

return true;

if (IsIntegerValue)

getTargetStreamer().emitAttribute(Tag, IntegerValue);

else if (Tag != RISCVAttrs::ARCH)

▲ Show 20 Lines • Show All 645 Lines • Show Last 20 Lines

llvm/test/MC/RISCV/comments-zdinx.ll

This file was added.

				# RUN: llvm-mc -triple=riscv64 --preserve-comments --mattr=+zdinx %s | FileCheck %s --match-full-lines

				/*c0*/ fmadd.d /*c1*/ x10 /*c2*/ , /*c3*/ x12 /*c4*/ , /*c5*/ x14 /*c6*/ , /*c7*/ x16 /*c8*/ , /*c9*/ dyn /*c10*/
				// CHECK: fmadd.d a0, a2, a4, a6 #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7 #c8 #c9 #c10

llvm/test/MC/RISCV/comments.ll

This file was added.

				# RUN: llvm-mc -triple=riscv64 --preserve-comments \
				# RUN: --mattr=+v,+experimental-zfa,+experimental-zcmp,+experimental-ztso %s \
				# RUN: | FileCheck %s --match-full-lines

				/*c0*/ li /*c1*/ a0 /*c2*/ , /*c3*/ 0 /*c4*/
				# CHECK: li a0, 0 #c0 #c1 #c2 #c3 #c4

				/*c0*/ lw /*c1*/ a0 /*c2*/ , /*c3*/ 0 /*c4*/ ( /*c5*/ a1 /*c6*/ ) /*c7*/
				# CHECK: lw a0, 0(a1) #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7

				/*c0*/ lw /*c1*/ a0 /*c2*/ , /*c3*/ ( /*c4*/ a1 /*c5*/ ) /*c6*/
				# CHECK: lw a0, 0(a1) #c0 #c1 #c2 #c3 #c4 #c5 #c6

				/*c0*/ fli.s /*c1*/ ft0 /*c2*/ , /*c3*/ nan /*c4*/
				# CHECK: fli.s ft0, nan #c0 #c1 #c2 #c3 #c4

				/*c0*/ fli.s /*c1*/ ft0 /*c2*/ , /*c3*/ 1.0 /*c4*/
				# CHECK: fli.s ft0, 1.0 #c0 #c1 #c2 #c3 #c4

				/*c0*/ auipc /*c1*/ a0 /*c2*/ , /*c3*/ %pcrel_hi /*c4*/ ( /*c5*/ a1 /*c6*/ ) /*c7*/
				# CHECK: auipc a0, %pcrel_hi(a1) #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7

				/*c0*/ la /*c1*/ s0 /*c2*/ , /*c3*/ symbol /*c4*/ + /*c5*/ 10 /*c6*/
				# CHECK: .Lpcrel_hi0: #c0 #c1 #c2 #c3 #c4 #c5 #c6
				# CHECK-NEXT: auipc s0, %pcrel_hi(symbol+10)
				# CHECK-NEXT: addi s0, s0, %pcrel_lo(.Lpcrel_hi0)
				abel-bernabeu (Author): This should be CHECK-NEXT rather than CHECK. Will update again tonight.

				/*c0*/ la /*c1*/ s0 /*c2*/ , /*c3*/ symbol /*c4*/ - /*c5*/ 10 /*c6*/
				# CHECK: .Lpcrel_hi1: #c0 #c1 #c2 #c3 #c4 #c5 #c6
				# CHECK-NEXT: auipc s0, %pcrel_hi(symbol-10)
				# CHECK-NEXT: addi s0, s0, %pcrel_lo(.Lpcrel_hi1)

				/*c0*/ vsetivli /*c1*/ a2 /*c2*/ , /*c3*/ 31 /*c4*/ , /*c5*/ e32 /*c6*/ , /*c7*/ m1 /*c8*/ , /*c9*/ ta /*c10*/ , /*c11*/ ma /*c12*/
				# CHECK: vsetivli a2, 31, e32, m1, ta, ma #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7 #c8 #c9 #c10 #c11 #c12

				/*c0*/ vadd.vv /*c1*/ v1 /*c2*/ , /*c3*/ v2 /*c4*/ , /*c5*/ v3 /*c6*/ , /*c7*/ v0.t /*c8*/
				# CHECK: vadd.vv v1, v2, v3, v0.t #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7 #c8

				/*c0*/ cm.push /*c1*/ { /*c2*/ ra /*c3*/ , /*c4*/ s0 /*c5*/ - /*c6*/ s1 /*c7*/ } /*c8*/ , /*c9*/ - /*c10*/ 32 /*c11*/
				# CHECK: cm.push {ra, s0-s1}, -32 #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7 #c8 #c9 #c10 #c11

				/*c0*/ cm.popret /*c0*/ { /*c1*/ x1 /*c2*/ , /*c3*/ x8 /*c4*/ - /*c5*/ x9 /*c6*/ , /*c7*/ x18 /*c8*/ - /*c9*/ x20 /*c10*/ } /*c11*/ , /*c12*/ 64 /*c13*/
				# CHECK: cm.popret {ra, s0-s4}, 64 #c0 #c0 #c1 #c2 #c3 #c4 #c5 #c6 #c7 #c8 #c9 #c10 #c11 #c12 #c13

				/*c0*/ fence /*c1*/ 0 /*c2*/ , /*c3*/ 0 /*c4*/
				# CHECK: fence 0, 0 #c0 #c1 #c2 #c3 #c4

				/*c0*/ fence /*c1*/ iorw /*c2*/ , /*c3*/ iorw /*c4*/
				# CHECK: fence #c0 #c1 #c2 #c3 #c4

				/*c0*/ fence.tso /*c1*/
				# CHECK: fence.tso #c0 #c1

				/*c0*/ fence.i /*c1*/
				# CHECK: fence.i #c0 #c1

				/*c0*/ nop /*c1*/
				# CHECK: nop #c0 #c1

				/*c0*/ .option /*c1*/ arch /*c2*/ , /*c3*/ rv64gc /*c4*/
				# CHECK: .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0

				/*c0*/ .attribute /*c1*/ priv_spec /*c2*/ , /*c3*/ 2 /*c4*/
				# CHECK: .attribute 8, 2

				/*c0*/ .attribute /*c1*/ arch /*c2*/ , /*c3*/ "rv32i_zvfbfmin0p6" /*c4*/
				# CHECK: .attribute 5, "rv32i2p1_f2p2_zicsr2p0_zve32f1p0_zve32x1p0_zvfbfmin0p6_zvl32b1p0"
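
The idea behind the change can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `ToyLexer`, `ToyParser`, `rawLex`, and `parseStatement` are hypothetical names invented for illustration, and the token kinds are a simplified stand-in for `AsmToken`. It assumes, as the summary describes, that advancing through the parser lets comment tokens be skipped, while advancing the lexer directly can leave a comment as the current token, where operand parsing then rejects it.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Simplified token kinds (hypothetical; loosely modelled on an assembler lexer).
enum class TokKind { Identifier, Integer, Comma, Comment, EndOfStatement };

struct Token {
  TokKind Kind;
  std::string Text;
};

// Raw lexer: returns every token, slash-star comments included.
class ToyLexer {
public:
  explicit ToyLexer(std::string Text) : Src(std::move(Text)) {}

  Token lex() {
    while (Pos < Src.size() && Src[Pos] == ' ')
      ++Pos;
    if (Pos >= Src.size())
      return {TokKind::EndOfStatement, ""};
    if (Src.compare(Pos, 2, "/*") == 0) {
      std::size_t End = Src.find("*/", Pos + 2);
      std::size_t Stop = End == std::string::npos ? Src.size() : End + 2;
      Token T{TokKind::Comment, Src.substr(Pos, Stop - Pos)};
      Pos = Stop;
      return T;
    }
    if (Src[Pos] == ',') {
      ++Pos;
      return {TokKind::Comma, ","};
    }
    std::size_t Start = Pos;
    bool AllDigits = true;
    while (Pos < Src.size() && Src[Pos] != ' ' && Src[Pos] != ',' &&
           Src.compare(Pos, 2, "/*") != 0) {
      AllDigits = AllDigits &&
                  std::isdigit(static_cast<unsigned char>(Src[Pos])) != 0;
      ++Pos;
    }
    assert(Pos > Start && "lexer must make progress");
    return {AllDigits ? TokKind::Integer : TokKind::Identifier,
            Src.substr(Start, Pos - Start)};
  }

private:
  std::string Src;
  std::size_t Pos = 0;
};

// Parser wrapper: lex() skips (and records) comments, rawLex() does not.
class ToyParser {
public:
  explicit ToyParser(std::string Text)
      : Lexer(std::move(Text)), Cur(Lexer.lex()) {
    skipComments();
  }

  const Token &getTok() const { return Cur; }

  // Parser-level advance: comment tokens never become the current token.
  void lex() {
    Cur = Lexer.lex();
    skipComments();
  }

  // Lexer-level advance: a comment may be left as the current token.
  void rawLex() { Cur = Lexer.lex(); }

  const std::vector<std::string> &comments() const { return Comments; }

private:
  void skipComments() {
    while (Cur.Kind == TokKind::Comment) {
      Comments.push_back(Cur.Text);
      Cur = Lexer.lex();
    }
  }

  ToyLexer Lexer;
  Token Cur;
  std::vector<std::string> Comments;
};

// Parses "mnemonic operand {, operand}". Returns true on success.
bool parseStatement(const std::string &Src, bool UseParserLex,
                    std::vector<std::string> &Operands) {
  ToyParser P(Src);
  auto Advance = [&] {
    if (UseParserLex)
      P.lex();
    else
      P.rawLex();
  };
  if (P.getTok().Kind != TokKind::Identifier)
    return false;
  Advance(); // eat the mnemonic
  while (true) {
    TokKind K = P.getTok().Kind;
    if (K != TokKind::Identifier && K != TokKind::Integer)
      return false; // a stray Comment token ends up here
    Operands.push_back(P.getTok().Text);
    Advance();
    if (P.getTok().Kind == TokKind::EndOfStatement)
      return true;
    if (P.getTok().Kind != TokKind::Comma)
      return false;
    Advance(); // eat the comma
  }
}
```

In this model, `parseStatement("li a0, /*c*/ 0", true, Ops)` succeeds and collects the operands `a0` and `0`, while the same input with `UseParserLex` set to `false` fails once the comment becomes the current token. The toy does not reproduce which comment positions the real RISC-V parser previously accepted.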
