HomePhabricator

Revert "[Compiler-rt][AArch64] Workaround for .cfi_startproc assembler parser…

Authored by tambre on Dec 16 2020, 12:14 AM.

Description

Revert "[Compiler-rt][AArch64] Workaround for .cfi_startproc assembler parser bug."

039cb03dd0dcff4daa17a062f7414ac22bf5f2eb (D93378) fixed the assembly separator, so the workaround is no longer necessary.

This reverts commit 3000c19df64f89ff319590f3a6e4d6b93d20983d.

Differential Revision: https://reviews.llvm.org/D93379

Event Timeline

@tambre
Hi Raul,

I'm troubleshooting LLVM build on Windows on Arm, which is currently failing with
...
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
...
when building LSE builtins. It seems that clang-cl.exe, which, presumably, uses masm parser falls over when .cfi_startproc is not at the beginning of the line. Reverting this patch and re-instating the workaround solves the problem.

Looking at https://reviews.llvm.org/rG039cb03dd0dcff4daa17a062f7414ac22bf5f2eb I don't understand how that fixes the problem of masm parser. Could you explain how it is supposed to work, please?

Thanks!

tambre added a comment.Feb 5 2021, 5:17 AM

@tambre
Hi Raul,

I'm troubleshooting LLVM build on Windows on Arm, which is currently failing with
...
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
...
when building LSE builtins. It seems that clang-cl.exe, which, presumably, uses masm parser falls over when .cfi_startproc is not at the beginning of the line. Reverting this patch and re-instating the workaround solves the problem.

Looking at https://reviews.llvm.org/rG039cb03dd0dcff4daa17a062f7414ac22bf5f2eb I don't understand how that fixes the problem of masm parser. Could you explain how it is supposed to work, please?

Thanks!

The issue turned out to not be parser bug, but rather the incorrect statement separator being used by compiler-rt on certain platforms.
This workaround was incorrect. While it made the error go away, some of the assembly would stay as comments due to the wrong separator and the code would not work or compile under certain conditions.

Could you provide the full compile command that throws an error during your build?
I should have time to investigate during the weekend.

...

The issue turned out to not be parser bug, but rather the incorrect statement separator being used by compiler-rt on certain platforms.
This workaround was incorrect. While it made the error go away, some of the assembly would stay as comments due to the wrong separator and the code would not work or compile under certain conditions.

Could you provide the full compile command that throws an error during your build?
I should have time to investigate during the weekend.

Hi Raul,

I'm seeing this only on aarch64-windows-msvc target, and not on any aarch64-linux-gnu targets. The command line is below, but I'm not sure you'll reproduce the problem outside of Windows environment.

C:\Users\tcwg\source\llvm-package\bin\clang-cl.exe /nologo -DHAS_ASM_LSE -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -DSTDC_CONSTANT_MACROS -DSTDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DL_cas -DMODEL=3 -DSIZE=2 -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\compiler-rt\lib\builtins -Iprojects\compiler-rt\lib\builtins -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\compiler-rt\lib\builtins -Iinclude -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\llvm\include -march=armv8-a -DCOMPILER_RT_HAS_FLOAT16 /showIncludes /Foprojects\compiler-rt\lib\builtins\CMakeFiles\clang_rt.builtins-aarch64.dir\outline_atomic_helpers.dir\outline_atomic_cas2_3.S.obj /Fdprojects\compiler-rt\lib\builtins\CMakeFiles\clang_rt.builtins-aarch64.dir\clang_rt.builtins-aarch64.pdb -c -- projects\compiler-rt\lib\builtins\outline_atomic_helpers.dir\outline_atomic_cas2_3.S

tambre added a subscriber: MaskRay.Feb 6 2021, 1:02 AM

Hi Raul,

I'm seeing this only on aarch64-windows-msvc target, and not on any aarch64-linux-gnu targets. The command line is below, but I'm not sure you'll reproduce the problem outside of Windows environment.

C:\Users\tcwg\source\llvm-package\bin\clang-cl.exe /nologo -DHAS_ASM_LSE -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -DSTDC_CONSTANT_MACROS -DSTDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DL_cas -DMODEL=3 -DSIZE=2 -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\compiler-rt\lib\builtins -Iprojects\compiler-rt\lib\builtins -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\compiler-rt\lib\builtins -Iinclude -IC:\Users\tcwg\source\maxim\llvm_package_92a5106e\llvm-project\llvm\include -march=armv8-a -DCOMPILER_RT_HAS_FLOAT16 /showIncludes /Foprojects\compiler-rt\lib\builtins\CMakeFiles\clang_rt.builtins-aarch64.dir\outline_atomic_helpers.dir\outline_atomic_cas2_3.S.obj /Fdprojects\compiler-rt\lib\builtins\CMakeFiles\clang_rt.builtins-aarch64.dir\clang_rt.builtins-aarch64.pdb -c -- projects\compiler-rt\lib\builtins\outline_atomic_helpers.dir\outline_atomic_cas2_3.S

Thanks. This can be reproduced with:
build/bin/clang-cl.exe -DL_cas -DMODEL=3 -DSIZE=2 --target=aarch64-windows-msvc -Icompiler-rt\lib\builtins\ compiler-rt\lib\builtins\aarch64\lse.S

Turns out Microsoft COFF style assembler's comment syntax like Apple is ;, but it's the only one lacking a statement separator syntax.
C preprocessor doesn't support expanding newlines from macros either.

There were probably important statements ending up as comments before this, but I guess no one had stumbled upon them yet.

Avoiding the issue would mean forgoing the macros completely or adding another preprocessing step for Windows only for all assembly files.
Either would suck and doesn't seem acceptable.

aarch64-windows-gnu might be close enough to not cause any problems?
It support the usual comment and statement separator syntax.

Alterantively, maybe a compiler extension to support statement separators?
The official grammar doesn't use %%, we could use that similar to Apple.

@MaskRay You fixed the original issue with Apple seperator syntax in D93378. Thoughts, ideas?

tambre added a subscriber: ilinpv.Feb 6 2021, 9:58 AM

when building LSE builtins. It seems that clang-cl.exe, which, presumably, uses masm parser

A few clarifications here. When clang-cl (for x86) parses assembly, it parses GAS-style .S assembly, including support for GAS macros and all that. MASM (ml.exe/ml64.exe) style assembly is something entirely different, and that's only handled by llvm-ml (which is a work in progress). So there's no real reference for what the MSVC-dialect (triggered by x86*-windows-msvc triples) of GAS assembly would be, as no such thing really exists, so in practice I guess it's just mostly equivalent to the corresponding x86*-windows-gnu ones.

Same thing for aarch64; there's no MSVC dialect of GAS assembly as reference for what llvm/clang should do with aarch64-windows-msvc targets. And for arm/aarch64 in general, there's no MASM format at all; MSVC tools ship with armasm.exe/armasm64.exe, which use the ARM (company) armasm format (which, just like MASM for x86, uses entirely different directives than GAS).

So with that in mind, my preference would be to just make the aarch64-windows-msvc targets behave just like the -gnu ones in this aspect. That was my initial intent when I fixed the comment char for aarch64 windows targets in D36366 (to make semicolon work as separator), but in review it was requested to only make that change for -gnu triples, see https://reviews.llvm.org/D36366#833185. Maybe we should try to revisit that discussion?

In the meantime, readding this workaround would seem sensible to me, in order not to rely on separators in assembly.

Thanks for the background.

In the meantime, readding this workaround would seem sensible to me, in order not to rely on separators in assembly.

The workaround wasn't sufficient and only hid the compile error, but would still not work in certain configurations as reported in https://reviews.llvm.org/D93278#2457928.
compiler-rt relies heavily on macros with statement separators. I don't think it's feasible to manually expand and copy-paste likely hundreds of expansions to not rely on statement separators.

So with that in mind, my preference would be to just make the aarch64-windows-msvc targets behave just like the -gnu ones in this aspect. That was my initial intent when I fixed the comment char for aarch64 windows targets in D36366 (to make semicolon work as separator), but in review it was requested to only make that change for -gnu triples, see https://reviews.llvm.org/D36366#833185. Maybe we should try to revisit that discussion?

Yes. We should backport this to 12.0 before release if possible.
I opened bug 49075 to track this as a release blocker. Would you be willing to take it?

dmajor added a subscriber: dmajor.Feb 7 2021, 8:28 AM

...

So with that in mind, my preference would be to just make the aarch64-windows-msvc targets behave just like the -gnu ones in this aspect. That was my initial intent when I fixed the comment char for aarch64 windows targets in D36366 (to make semicolon work as separator), but in review it was requested to only make that change for -gnu triples, see https://reviews.llvm.org/D36366#833185. Maybe we should try to revisit that discussion?

Yes. We should backport this to 12.0 before release if possible.
I opened bug 49075 to track this as a release blocker. Would you be willing to take it?

I'm building LLVM 12 release binaries for Windows on Arm, and this seems to be the only problem left.
@tambre , as a fallback, is it possible to disable building LSE atomics via a cmake option? I tried COMPILER_RT_HAS_ASM_LSE=OFF, but it didn't work.

ilinpv added a comment.Feb 8 2021, 4:42 AM

@tambre , as a fallback, is it possible to disable building LSE atomics via a cmake option? I tried COMPILER_RT_HAS_ASM_LSE=OFF, but it didn't work.

There is no such option, as clang outline atomics (enabled by default for Linux so far with intend to extend it for other targets) are rely on compiler-rt support as well.
I would recommend to backport this workaround if statement separator for MSVC takes long.

when building LSE builtins. It seems that clang-cl.exe, which, presumably, uses masm parser

A few clarifications here. When clang-cl (for x86) parses assembly, it parses GAS-style .S assembly, including support for GAS macros and all that. MASM (ml.exe/ml64.exe) style assembly is something entirely different, and that's only handled by llvm-ml (which is a work in progress). So there's no real reference for what the MSVC-dialect (triggered by x86*-windows-msvc triples) of GAS assembly would be, as no such thing really exists, so in practice I guess it's just mostly equivalent to the corresponding x86*-windows-gnu ones.

It seems I'm still missing some key piece of the puzzle here. If I understand your comment correctly, you say that ";" should be a statement separator for *-windows-msvc targets. But when I build .S files of LLVM 12 using LLVM 10 compiler on Windows on Arm, I do see assembler dropping everything after ";" -- i.e., at least LLVM 10's integrated assembler appears to be configured to treat ";" as beginning of a comment, with no apparent statement separator syntax. This is, apparently, configured in AArch64MCAsmInfoMicrosoftCOFF::AArch64MCAsmInfoMicrosoftCOFF .

Am I missing something?

when building LSE builtins. It seems that clang-cl.exe, which, presumably, uses masm parser

A few clarifications here. When clang-cl (for x86) parses assembly, it parses GAS-style .S assembly, including support for GAS macros and all that. MASM (ml.exe/ml64.exe) style assembly is something entirely different, and that's only handled by llvm-ml (which is a work in progress). So there's no real reference for what the MSVC-dialect (triggered by x86*-windows-msvc triples) of GAS assembly would be, as no such thing really exists, so in practice I guess it's just mostly equivalent to the corresponding x86*-windows-gnu ones.

It seems I'm still missing some key piece of the puzzle here. If I understand your comment correctly, you say that ";" should be a statement separator for *-windows-msvc targets.

The comment was about *-windows-msvc targets in general, in particular about x86 targets.

But when I build .S files of LLVM 12 using LLVM 10 compiler on Windows on Arm, I do see assembler dropping everything after ";" -- i.e., at least LLVM 10's integrated assembler appears to be configured to treat ";" as beginning of a comment, with no apparent statement separator syntax. This is, apparently, configured in AArch64MCAsmInfoMicrosoftCOFF::AArch64MCAsmInfoMicrosoftCOFF .

Yes, for the aarch64-windows-msvc target, currently semicolon is set as both comment and separator, making no actual char available as separator symbol. I'd suggest changing the comment char to '//' for aarch64-windows-msvc, like it is for aarch64-windows-gnu and ELF targets. I initially tried to do this, but was suggested to only change the -gnu target in the review I linked before. At the time I shrugged and just went with the suggestion, but at the current state I'd suggest revisiting this decision, as there's a clear benefit for making it match the other platforms, and there is no reference for what GAS assembly for aarch64 msvc targets should look like anyway.

And, looking at AArch64MCAsmInfoMicrosoftCOFF, I think we need to do the same for MSVC COFF as for GNU COFF in https://reviews.llvm.org/D36366 -- use "//" for comment, which should return ";" to being statement separator.

...

Yes, for the aarch64-windows-msvc target, currently semicolon is set as both comment and separator, making no actual char available as separator symbol. I'd suggest changing the comment char to '//' for aarch64-windows-msvc, like it is for aarch64-windows-gnu and ELF targets. I initially tried to do this, but was suggested to only change the -gnu target in the review I linked before. At the time I shrugged and just went with the suggestion, but at the current state I'd suggest revisiting this decision, as there's a clear benefit for making it match the other platforms, and there is no reference for what GAS assembly for aarch64 msvc targets should look like anyway.

Agree. Would you please post a patch for this?

@compnerd , fyi, since you requested GNU specialization in https://reviews.llvm.org/D36366 .