[JITLink] Enable exception handling for ELF.

Authored by lhames on Jan 24 2021, 8:14 PM.

Description

Adds the EHFrameSplitter and EHFrameEdgeFixer passes to the default JITLink
pass pipeline for ELF/x86-64, and teaches EHFrameEdgeFixer to handle some
new pointer encodings.

Together these changes enable exception handling (at least for the basic
cases that I've tested so far) for ELF/x86-64 objects loaded via JITLink.

Event Timeline

Hi!

I noticed that ELF_ehframe_basic.s fails for me when llvm-jitlink is compiled with gcc. The CHECKs in the test pass, but llvm-jitlink ends with:

Starting link phase 2 for graph /data/repo/master/llvm/build-all-bbigcc/test/ExecutionEngine/JITLink/X86/Output/ELF_ehframe_basic.s.tmp
JIT session error: Symbols not found: [ _ZTIi ]
/data/repo/master/llvm/build-all-bbigcc/bin/llvm-jitlink: Failed to materialize symbols: { (main, { DW.ref.__gxx_personality_v0, foo, main }) }

It doesn't happen when I compile with clang.
I know nothing about llvm-jitlink, I just saw this when building check-all.

Is this something you've seen before, or can make sense of?

Btw, we're using gcc 9.3.0.

Hi Nico,

Are you able to investigate this further? The logs contain:

$ "c:\src\llvm-project\out\gn\bin\llvm-jitlink.exe" "-debug-only=jitlink"
"-define-abs" "bar=0x01" "-noexec"
"C:\src\llvm-project\out\gn\obj\llvm\test\ExecutionEngine\JITLink\X86\Output\ELF_ehframe_basic.s.tmp"
note: command had no output on stdout or stderr
error: command failed with exit status: 1

But no further explanation. I've not seen this failure mode before: every
llvm-jitlink failure should have at least some output associated with it.

Regards,
Lang.

Hi Mikael,

Thanks for the heads up. This looks like a missing RTTI symbol. Since this
test doesn't actually execute anything, I'll fix it by adding a bogus
definition of _ZTIi to the test itself.

  • Lang.
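
As a rough illustration of why a bogus definition is enough here: the
"Symbols not found" error lists symbols that are referenced but that no
available definition provides. The toy model below treats this as a set
difference. It is not JITLink's actual lookup code; the helper name
unresolved and the symbol sets are hypothetical. Only _ZTIi (the
Itanium-mangled name of the typeinfo object for int) and bar (from the
-define-abs option in the logged command) come from the messages above.

```python
def unresolved(referenced, *definition_sets):
    """Return, sorted, the referenced symbols that no definition set provides."""
    defined = set().union(*definition_sets)
    return sorted(set(referenced) - defined)


# Illustrative symbol sets only; the real test may reference other symbols.
referenced = {"_ZTIi", "bar"}   # assumed external references
absolute_defs = {"bar"}         # e.g. supplied via -define-abs bar=0x01
test_local_defs = {"_ZTIi"}     # the bogus definition added to the test

print(unresolved(referenced, absolute_defs))                   # ['_ZTIi']
print(unresolved(referenced, absolute_defs, test_local_defs))  # []
```

Because the test is run with -noexec, the address behind the bogus
definition is never used; it only has to exist so the lookup succeeds.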

Hi Nico, Mikael,

I've re-committed with a fix for the missing _ZTIi symbol in cda4d3d37f14.
I believe this will fix the issue on the gcc bots.

Nico -- I'm not sure whether this will fix the issue you saw. I've not seen
llvm-jitlink fail without output like that. If it fails with the new commit
too I could use some help debugging it. I've gated the test with REQUIRES:
asserts, but I wonder whether that's sufficient to guarantee that
LLVM_DEBUG works? If the build doesn't include debugging info that may be
the issue.

  • Lang.

Thanks Nico,

I'll ask around and see if I can get someone with a Windows box to
reproduce this.

  • Lang.

Ok -- that's the expected output. It seems like the actual bug is in the
test infrastructure / environment on Windows?

I'd prefer not to revert -- I can't investigate locally and don't want to
block development on this feature. I'll just disable this on Windows for
now.

  • Lang.

Test disabled on Windows in 236b0d040786.

Nico -- Does the test appear to fail for you if you run it under lit,
despite producing output?

  • Lang.

Thanks, works for me now!

Hi Nico,

Is it expected that it exits with 1? If so, maybe you just need to prepend
"not" to the RUN line?

No -- it is expected to exit with code 0. I assumed that the exit code 1
came from FileCheck. If you still have access to that Windows box I would
love to know whether llvm-jitlink itself is returning -1 here, as it will
help track this down.

My best read so far is that something is going wrong with the redirection
and we're not connecting llvm-jitlink's stdout to the pipe to FileCheck.
FileCheck then sees empty input and exits with -1.

If that assessment is correct then the question is why the redirect is
failing. I thought other tests were using similar redirects, but maybe I'm
missing something (e.g. is any special per-directory lit config required to
enable that redirect syntax on Windows?).

  • Lang.
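
One way to tell which stage of a "tool | checker" pipeline is failing is to
run the two stages by hand and look at each exit code separately. The sketch
below does that with Python's subprocess module. It is only an illustration:
run_pipeline is a hypothetical helper, and the two commands are small Python
stand-ins rather than real llvm-jitlink or FileCheck invocations.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer | consumer` and return (producer_rc, consumer_rc)."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Close the parent's copy of the pipe so the consumer sees EOF.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


# Stand-in for a tool that prints the expected text but then fails.
PRODUCER = [sys.executable, "-c",
            "import sys; print('Starting link phase 2'); sys.exit(1)"]
# Stand-in for a checker that succeeds if its input contains a pattern.
CONSUMER = [sys.executable, "-c",
            "import sys; sys.exit(0 if 'link phase' in sys.stdin.read() else 1)"]

print(run_pipeline(PRODUCER, CONSUMER))  # (1, 0)
```

A result of (1, 0) means the checks passed and the producing tool itself
failed, whereas a nonzero second value would point at the checker or at the
redirection between the two.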

I think lit has a bug (maybe due to the py3 migration?) where it claims
that a process's stdout/stderr are empty when they aren't, at least
sometimes, at least on Windows, when that process exits with 1.

Ahh. That'd do it.

Looking more closely at the logs you attached earlier:

JIT session error: Symbols not found: [ __cxa_throw, __cxa_end_catch,
__gxx_personality_v0, __cxa_begin_catch, __cxa_allocate_exception ]

Ok -- there's the reason for the -1: Windows was missing more symbols. I'll
add definitions for these and re-enable the test on Windows.

Thanks very much for the help tracking this down!

  • Lang.