This is an archive of the discontinued LLVM Phabricator instance.

[Clang][LoongArch] Add GPR alias handling without `$` prefix
ClosedPublic

Authored by SixWeining on Oct 21 2022, 2:49 AM.

Details

Summary

Currenlty there is a mismatch between LoongArch gcc and clang about
handling register name in inlineasm, i.e. gcc allows both $-prefixed
and non-prefiexed names for GPRs while clang only allows $-prefixed
one. This patch fixes this mismatch by adding non-prefixed GPR names
in clang.

Take $r4 for example. With this patch, clang accepts $r4, r4,
$a0 and a0 like what gcc does.

Diff Detail

Event Timeline

tangyouling created this revision.Oct 21 2022, 2:49 AM
tangyouling requested review of this revision.Oct 21 2022, 2:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2022, 2:49 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
xen0n added a comment.Oct 21 2022, 2:54 AM

Commit message is unclear. You added non-prefixed aliases for all registers, yet only $a0 is mentioned. I'd like to see what improvement you could come up with before amending it myself, to hopefully make you better at writing those "small compositions".

clang/lib/Basic/Targets/LoongArch.cpp
54

Missing "s9".

$ cat prefix_regalias.c 
int main()
{
	register unsigned long a0 asm("a0");
	register unsigned long a1 asm("$a1");
	register unsigned long a2 asm("r6");
	register unsigned long a3 asm("$r7");
	register float f0 asm("fa0");
	register float f1 asm("$fa1");
	register float f2 asm("f2");
	register float f3 asm("$f3");

	return 0;
}

clang before applying patch:
$ clang prefix_regalias.c 
prefix_regalias.c:3:32: error: unknown register name 'a0' in asm
        register unsigned long a0 asm("a0");
                                      ^
prefix_regalias.c:5:32: error: unknown register name 'r6' in asm
        register unsigned long a2 asm("r6");
                                      ^
prefix_regalias.c:7:24: error: unknown register name 'fa0' in asm
        register float f0 asm("fa0");
                              ^
prefix_regalias.c:9:24: error: unknown register name 'f2' in asm
        register float f2 asm("f2");
                              ^
4 errors generated.


clang after applying patch:
$ clang prefix_regalias.c 

gcc:
$ gcc prefix_regalias.c 
prefix_regalias.c: In function ‘main’:
prefix_regalias.c:7:24: error: invalid register name for ‘f0’
    7 |         register float f0 asm("fa0");
      |                        ^~
prefix_regalias.c:8:24: error: invalid register name for ‘f1’
    8 |         register float f1 asm("$fa1");
      |          

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

xen0n added a comment.Oct 21 2022, 3:26 AM

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

SixWeining added a comment.EditedOct 21 2022, 8:53 PM

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

Well. To be honest, I'd like to keep only one form but not both. Seems other archs only support one form? If we support both, will it make people confused? Is mixed form allowed (fmadd.d fa0, $fa0, f0, $fcc0) ?

Modify commit information and add missing s9.

tangyouling edited the summary of this revision. (Show Details)Oct 23 2022, 6:57 PM

Commit message is unclear. You added non-prefixed aliases for all registers, yet only $a0 is mentioned. I'd like to see what improvement you could come up with before amending it myself, to hopefully make you better at writing those "small compositions".

Updated the description of all registers, thanks for pointing it out.

clang/lib/Basic/Targets/LoongArch.cpp
54

OK, modified.

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

Well. To be honest, I'd like to keep only one form but not both. Seems other archs only support one form? If we support both, will it make people confused? Is mixed form allowed (fmadd.d fa0, $fa0, f0, $fcc0) ?

Having just one form will be nice.

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

Well. To be honest, I'd like to keep only one form but not both. Seems other archs only support one form? If we support both, will it make people confused? Is mixed form allowed (fmadd.d fa0, $fa0, f0, $fcc0) ?

Having just one form will be nice.

@tangyouling Considering that $ is well supported by both gcc and llvm, we should adandon this patch?

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

Well. To be honest, I'd like to keep only one form but not both. Seems other archs only support one form? If we support both, will it make people confused? Is mixed form allowed (fmadd.d fa0, $fa0, f0, $fcc0) ?

Having just one form will be nice.

@tangyouling Considering that $ is well supported by both gcc and llvm, we should adandon this patch?

I'm afraid no, the prefix-less form is AFAIK likely inspired by RISC-V assembly (which doesn't have any prefix for registers whatsoever), but the $-prefixed form is clearly a MIPS style, and frankly speaking we (the community) had no choice at the time of binutils upstreaming because of the insistence of Chenghua et al., and loads of code using the prefixed form already appearing. The important point is that there's already code making use of prefix-less register names in inline asm, so we must follow suit for keeping source compatibility across gcc and clang.

While having just one form to support is indeed nice, that ship has sailed long ago, and I think the next best thing is to allow both styles but disallow mixing of styles in one insn, i.e. no things like add $a0, a1, $a2, if it's impossible to only allow prefix-less register names in Clang inline asm but not at LLVM level.

How about the asm code in .s? Do we need to support addi.d a0, a1, a2?

For the assembler part of this support, I think we need to coordinate with the GNU toolchain maintainers of LoongArch port (Chenghua, Zhensong and Lulu I think). Maybe raising an issue on the LoongArch documentation repo would help.

For the GCC part, consistency is of course welcomed, and I think the Loongson maintainers or @xry111 could just submit the respective support.

Well. To be honest, I'd like to keep only one form but not both. Seems other archs only support one form? If we support both, will it make people confused? Is mixed form allowed (fmadd.d fa0, $fa0, f0, $fcc0) ?

Having just one form will be nice.

@tangyouling Considering that $ is well supported by both gcc and llvm, we should adandon this patch?

I would prefer that the treatment of aliases should be completely consistent in gcc and llvm (otherwise it will cause some trouble for developers).
Because gcc does support non-prefix $ for GR, and would also prefer FR to be consistent with GR (gcc might need to add it).

Is it acceptable to add the non-prefix $? if not, an alternative fix for the build failure is to add the prefix $ in sanitizer_syscall_linux_loongarch64.inc

Is it acceptable to add the non-prefix $? if not, an alternative fix for the build failure is to add the prefix $ in sanitizer_syscall_linux_loongarch64.inc

People have different opinions. This may need long time to discuss and decide (maybe in a broader place). So currently it's better to modify sanitizer_syscall_linux_loongarch64.inc I think.

Is it acceptable to add the non-prefix $? if not, an alternative fix for the build failure is to add the prefix $ in sanitizer_syscall_linux_loongarch64.inc

I wrote sanitizer_syscall_linux_loongarch64.inc and tested it with GCC. GCC supports register variable definition like register u64 a0 asm("a0"); but does not support addi.d a0, a0, a1 (because in GCC such a line in inline assembly is passed to GNU as directly).

Is it possible to "emulate" the behavior in Clang?

Is it acceptable to add the non-prefix $? if not, an alternative fix for the build failure is to add the prefix $ in sanitizer_syscall_linux_loongarch64.inc

I wrote sanitizer_syscall_linux_loongarch64.inc and tested it with GCC. GCC supports register variable definition like register u64 a0 asm("a0"); but does not support addi.d a0, a0, a1 (because in GCC such a line in inline assembly is passed to GNU as directly).

Is it possible to "emulate" the behavior in Clang?

I don't have any objection to add $ into sanitizer_syscall_linux_loongarch64.inc though, GCC supports both register u64 a0 asm("a0"); and register u64 a0 asm("$a0");

Is it acceptable to add the non-prefix $? if not, an alternative fix for the build failure is to add the prefix $ in sanitizer_syscall_linux_loongarch64.inc

People have different opinions. This may need long time to discuss and decide (maybe in a broader place). So currently it's better to modify sanitizer_syscall_linux_loongarch64.inc I think.

Ok, I will send a new Revision to fix sanitizer_syscall_linux_loongarch64.inc separately, but this Revision does not close and continue the discussion.

xen0n added a comment.May 2 2023, 11:07 PM

Hi, any update on this? Given the $-less style has been long established at this time, and present in released GCC versions, it seems we indeed have to follow suit (and later make GCC accept ABI names for FPRs too). It will also make the ClangBuiltLinux work easier.

Hi, any update on this? Given the $-less style has been long established at this time, and present in released GCC versions, it seems we indeed have to follow suit (and later make GCC accept ABI names for FPRs too). It will also make the ClangBuiltLinux work easier.

I tend to adopt @xry111's suggestion that "emulating" GCC's behavior in Clang (not in LLVM):

GCC supports register variable definition like register u64 a0 asm("a0"); but does not support addi.d a0, a0, a1 (because in GCC such a line in inline assembly is passed to GNU as directly).
Is it possible to "emulate" the behavior in Clang?

And for FPRs, I suggest to keep current implementation until GCC support non-prefixed names.

This means:

  • Non-prefixed GPRs names ( and ABI names ) are allowed in register variable definition and clobber list.
  • Only $ prefixed register names is allowed in assembler template.
  • Do not support non-prefixed names for other registers ( including FPRs ) anywhere.

The patch will continue to be updated by @SixWeining .

SixWeining commandeered this revision.May 11 2023, 7:59 PM
SixWeining edited reviewers, added: tangyouling; removed: SixWeining.

Only allow non-prefixed names for GPRs.

SixWeining retitled this revision from [Clang][LoongArch] Add register alias handling without `$` prefix to [Clang][LoongArch] Add GPR alias handling without `$` prefix.May 11 2023, 8:04 PM
SixWeining edited the summary of this revision. (Show Details)
xen0n accepted this revision.May 11 2023, 11:22 PM

LGTM now, thanks for updating this patch!

This revision is now accepted and ready to land.May 11 2023, 11:22 PM