This is an archive of the discontinued LLVM Phabricator instance.

[MC][AsmParser] Diagnose improperly nested .cfi frames
ClosedPublic

Authored by jroelofs on Jul 13 2023, 4:03 PM.

Details

Summary

This showed up when simplifying some large testcase, where the cfi directives became out of sync with the proc's they enclose.

rdar://111459507

Diff Detail

Event Timeline

jroelofs created this revision.Jul 13 2023, 4:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2023, 4:03 PM
jroelofs requested review of this revision.Jul 13 2023, 4:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2023, 4:03 PM
jroelofs edited the summary of this revision. (Show Details)Jul 13 2023, 4:03 PM

on the subject of is there any way to diagnose:

  • if we encounter either a linker-visible or global symbol after a cfi_startproc but before the next cfi_endproc [in the same section], that would seem to indicate bad nesting?
  • I suppose that there could be some weird code-gen that multiplexed the output between sections, so seeing a section switch before the closing cfi_endproc is not sufficient
  • I suppose that there could be some weird code-gen that multiplexed the output between sections, so seeing a section switch before the closing cfi_endproc is not sufficient

I tried this one and found it broke a lot of JIT tests, as well as some cases with inline asm.

  • if we encounter either a linker-visible or global symbol after a cfi_startproc but before the next cfi_endproc [in the same section], that would seem to indicate bad nesting?

I'll give this one a shot. I suspect it may also trip up on the same sorts of issues.

D153167 is related. Due to Mach-O's .subsections_via_symbols mechanism, non-private labels cannot appear between .cfi_startproc/.cfi_endproc. I think AArch64/cfi-bad-nesting.s deserves a getContext().reportError.

jroelofs updated this revision to Diff 558108.Nov 15 2023, 10:12 AM

Address review feedback from @iains and @MaskRay

jroelofs retitled this revision from AArch64: don't crash when .cfi_startproc/.cfi_endproc are improperly nested to [MC][AsmParse] Diagnose improperly nested .cfi frames.
jroelofs edited the summary of this revision. (Show Details)
jroelofs edited the summary of this revision. (Show Details)
MaskRay accepted this revision.Thu, Nov 16, 11:22 PM

Looks great!

llvm/lib/MC/MCParser/AsmParser.cpp
1953

Add a test for isWeakExternal?

llvm/test/MC/MachO/AArch64/cfi-bad-nesting.s
1 ↗(On Diff #558108)

We could also use test/MC/MachO/ since this is generic, not specific to AArch64/

This revision is now accepted and ready to land.Thu, Nov 16, 11:22 PM
MaskRay retitled this revision from [MC][AsmParse] Diagnose improperly nested .cfi frames to [MC][AsmParser] Diagnose improperly nested .cfi frames.Thu, Nov 16, 11:22 PM

[MC][AsmParse] Diagnose improperly nested .cfi frames

Fixed a typo in the title.

jroelofs added inline comments.Fri, Nov 17, 8:57 AM
llvm/lib/MC/MCParser/AsmParser.cpp
1953

I realize now that that is a COFF feature, and .subsections_via_symbols is a MachO thing. I'll leave this out instead.

llvm/test/MC/MachO/AArch64/cfi-bad-nesting.s
1 ↗(On Diff #558108)

Sounds good. I'll add the appropriate REQUIRES: too.

This revision was landed with ongoing or failed builds.Fri, Nov 17, 9:45 AM
This revision was automatically updated to reflect the committed changes.

this is breaking runtimes builds of aarch64 builtins, e.g.

[294/404] Building ASM object CMakeFiles\clang_rt.builtins-aarch64.dir\outline_atomic_helpers.dir\outline_atomic_cas16_1.S.obj
FAILED: CMakeFiles/clang_rt.builtins-aarch64.dir/outline_atomic_helpers.dir/outline_atomic_cas16_1.S.obj 
C:\b\s\w\ir\cache\builder\src\third_party\llvm-build\Release+Asserts\.\bin\clang-cl.exe -target aarch64-pc-windows-msvc  /nologo -DHAS_ASM_LSE -DL_cas -DMODEL=1 -DSIZE=16 -IC:\b\s\w\ir\cache\builder\src\third_party\llvm\compiler-rt\lib\builtins -fno-lto /Zl -fno-builtin -DCOMPILER_RT_HAS_FLOAT16 /showIncludes /FoCMakeFiles\clang_rt.builtins-aarch64.dir\outline_atomic_helpers.dir\outline_atomic_cas16_1.S.obj /FdCMakeFiles\clang_rt.builtins-aarch64.dir\clang_rt.builtins-aarch64.pdb -c -- C:\b\s\w\ir\cache\builder\src\third_party\llvm-build\Release+Asserts\runtimes\builtins-aarch64-pc-windows-msvc-bins\outline_atomic_helpers.dir\outline_atomic_cas16_1.S
C:\\b\\s\\w\\ir\\cache\\builder\\src\\third_party\\llvm-build\\Release+Asserts\\runtimes\\builtins-aarch64-pc-windows-msvc-bins\\outline_atomic_helpers.dir\\outline_atomic_cas16_1.S:145:131: error: non-private labels cannot appear between .cfi_startproc / .cfi_endproc pairs
 .text ; .balign 16 ; .globl __aarch64_cas16_relax ; .def __aarch64_cas16_relax ; .scl 2 ; .type 32 ; .endef ; ; .cfi_startproc ; __aarch64_cas16_relax: ;
                                                                                                                                  ^
C:\\b\\s\\w\\ir\\cache\\builder\\src\\third_party\\llvm-build\\Release+Asserts\\runtimes\\builtins-aarch64-pc-windows-msvc-bins\\outline_atomic_helpers.dir\\outline_atomic_cas16_1.S:145:114: error: previous .cfi_startproc was here
 .text ; .balign 16 ; .globl __aarch64_cas16_relax ; .def __aarch64_cas16_relax ; .scl 2 ; .type 32 ; .endef ; ; .cfi_startproc ; __aarch64_cas16_relax: ;

this is at https://github.com/llvm/llvm-project/blob/6a126e279dedc9fb8c204d29cfc227b0652ffe6c/compiler-rt/lib/builtins/aarch64/lse.S#L145 (need to follow a bunch of macros). the builtins should be cleaned up first before landing this

Ok. I'll revert.

Addressed in 7939ce39dac0, and re-landed in 4323da926f12

This breaks building libffi for 32 bit arm; the following snippet does fallthrough from one global function to another, while sharing one single CFI region for all of them: https://github.com/libffi/libffi/blob/master/src/arm/sysv.S#L139-L213

Based on this patch, the code in libffi looks like it would be invalid and intentionally should fail to compile.

However, I noticed a different surprising detail here; it looks like this check fails to trigger on Linux (ELF in general I presume?) targets, while it does trigger on MachO and COFF. This can be tested with something like this:

.globl ffi_call_VFP
ffi_call_VFP:
.cfi_startproc
nop
.globl ffi_call_SYSV
ffi_call_SYSV:
nop
.cfi_endproc
$ clang -target aarch64-linux-gnu -c fallthrough.s 
$ clang -target aarch64-apple-darwin -c fallthrough.s 
fallthrough.s:7:1: error: non-private labels cannot appear between .cfi_startproc / .cfi_endproc pairs 
ffi_call_SYSV:
^
fallthrough.s:3:2: error: previous .cfi_startproc was here
.cfi_startproc 
^
$ clang -target aarch64-windows-gnu -c fallthrough.s 
fallthrough.s:7:1: error: non-private labels cannot appear between .cfi_startproc / .cfi_endproc pairs
ffi_call_SYSV:
^
fallthrough.s:3:2: error: previous .cfi_startproc was here
.cfi_startproc
^

For cross reference, I filed an issue with libffi at https://github.com/libffi/libffi/issues/807 about this.

This breaks building libffi for 32 bit arm; the following snippet does fallthrough from one global function to another, while sharing one single CFI region for all of them: https://github.com/libffi/libffi/blob/master/src/arm/sysv.S#L139-L213

Based on this patch, the code in libffi looks like it would be invalid and intentionally should fail to compile.

I may need to restrict this to just MachO, and possibly only when .subsections_via_symbols is present. If this is blocking you, please feel free to revert.

However, I noticed a different surprising detail here; it looks like this check fails to trigger on Linux (ELF in general I presume?) targets, while it does trigger on MachO and COFF. This can be tested with something like this:

.globl ffi_call_VFP
ffi_call_VFP:
.cfi_startproc
nop
.globl ffi_call_SYSV
ffi_call_SYSV:
nop
.cfi_endproc
$ clang -target aarch64-linux-gnu -c fallthrough.s 
$ clang -target aarch64-apple-darwin -c fallthrough.s 
fallthrough.s:7:1: error: non-private labels cannot appear between .cfi_startproc / .cfi_endproc pairs 
ffi_call_SYSV:
^
fallthrough.s:3:2: error: previous .cfi_startproc was here
.cfi_startproc 
^
$ clang -target aarch64-windows-gnu -c fallthrough.s 
fallthrough.s:7:1: error: non-private labels cannot appear between .cfi_startproc / .cfi_endproc pairs
ffi_call_SYSV:
^
fallthrough.s:3:2: error: previous .cfi_startproc was here
.cfi_startproc
^

That is really weird. I'll investigate this week.

https://github.com/llvm/llvm-project/issues/72802

For cross reference, I filed an issue with libffi at https://github.com/libffi/libffi/issues/807 about this.

Thank you!

This breaks building libffi for 32 bit arm; the following snippet does fallthrough from one global function to another, while sharing one single CFI region for all of them: https://github.com/libffi/libffi/blob/master/src/arm/sysv.S#L139-L213

Based on this patch, the code in libffi looks like it would be invalid and intentionally should fail to compile.

I may need to restrict this to just MachO, and possibly only when .subsections_via_symbols is present. If this is blocking you, please feel free to revert.

Thanks! I pushed a revert now, and thanks for considering restricting the scope of this error!

However, I noticed a different surprising detail here; it looks like this check fails to trigger on Linux (ELF in general I presume?) targets, while it does trigger on MachO and COFF.

That is really weird. I'll investigate this week.

https://github.com/llvm/llvm-project/issues/72802

Thanks, much appreciated!

hans added a subscriber: hans.Thu, Nov 23, 4:56 AM

We're still having issues (https://crbug.com/1504532) due to this, because we're building some compiler-rt code from llvm 17 with clang from head.

That's something for us to figure out, but it makes we wonder if perhaps this diagnostic should be a warning rather than a hard error by default? For example, consider someone using clang 18 to build older llvm version for bisection or similar.

hans added a comment.Fri, Dec 1, 8:17 AM

Ping on thoughts about this being a warning instead?