This is an archive of the discontinued LLVM Phabricator instance.

[test] [ExecutionEngine] Skip the ExecutionEngine tests on mingw targets
AbandonedPublic

Authored by mstorsjo on May 5 2023, 2:38 PM.

Details

Summary

These tests used to be supported (somewhat) for mingw targets,
but since c42c67ad60449fe19949f2664c2a5878b3f72b7e, which made
Orc the default JIT engine, these tests are failing. (That commit
made most tests under the ExecutionEngine/MCJIT directory run with
both "lli -jit-kind=mcjit" for explicitly running with MCJIT, and
with just plain "lli" which defaults to Orc.)

Orc doesn't seem to work for mingw targets; most tests fail with
an undefined reference to the "__main" symbol.

While fixing it would be better than just marking the tests as
unsupported, this works as a stopgap measure, unblocking running
the rest of the LLVM tests.

Diff Detail

Event Timeline

mstorsjo created this revision.May 5 2023, 2:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2023, 2:38 PM
mstorsjo requested review of this revision.May 5 2023, 2:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2023, 2:38 PM
lhames added a comment.May 7 2023, 7:02 PM

@mstorsjo I think I see the underlying issue here: https://github.com/llvm/llvm-project/blob/79702f7f593dece7afb67fec03df884d50525b96/llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp#L268

Could you please try this fix instead:

diff --git a/llvm/tools/lli/lli.cpp b/llvm/tools/lli/lli.cpp
index bc1c9cecbb77..fade24c57898 100644
--- a/llvm/tools/lli/lli.cpp
+++ b/llvm/tools/lli/lli.cpp
@@ -824,6 +824,16 @@ loadModule(StringRef Path, orc::ThreadSafeContext TSCtx) {
   return orc::ThreadSafeModule(std::move(M), std::move(TSCtx));
 }
 
+int mingw_noop_main(void) {
+  // Cygwin and Mingw insert calls to __main from the generated main. The
+  // __main function is usually defined in a standard library and responsible
+  // for setting up main's environment (e.g. running static ctors), but in our
+  // case the host executor will have taken care of non-JIT ctors, and ORC will
+  // take care of JIT'd ones. To avoid a missing symbol error we can just
+  // point __main at this noop function.
+  return 0;
+}
+
 int runOrcJIT(const char *ProgName) {
   // Start setting up the JIT environment.
 
@@ -989,6 +999,16 @@ int runOrcJIT(const char *ProgName) {
                                                       Mangle));
   }
 
+  // If this is a Mingw or Cygwin executor then we need to alias __main to
+  // orc_rt_int_void_return_0.
+  if (J->getTargetTriple().isOSCygMing())
+    ExitOnErr(J->getProcessSymbolsJITDylib()->define(
+        orc::absoluteSymbols({
+            { J->mangleAndIntern("__main"),
+              { orc::ExecutorAddr::fromPtr(mingw_noop_main),
+                JITSymbolFlags::Exported } }
+          })));
+
   // Regular modules are greedy: They materialize as a whole and trigger
   // materialization for all required symbols recursively. Lazy modules go
   // through partitioning and they replace outgoing calls with reexport stubs

If that works then I'll commit it.

@mstorsjo I think I see the underlying issue here: https://github.com/llvm/llvm-project/blob/79702f7f593dece7afb67fec03df884d50525b96/llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp#L268

Could you please try this fix instead:

diff --git a/llvm/tools/lli/lli.cpp b/llvm/tools/lli/lli.cpp
index bc1c9cecbb77..fade24c57898 100644
--- a/llvm/tools/lli/lli.cpp
+++ b/llvm/tools/lli/lli.cpp
@@ -824,6 +824,16 @@ loadModule(StringRef Path, orc::ThreadSafeContext TSCtx) {
   return orc::ThreadSafeModule(std::move(M), std::move(TSCtx));
 }
 
+int mingw_noop_main(void) {
+  // Cygwin and Mingw insert calls to __main from the generated main. The
+  // __main function is usually defined in a standard library and responsible
+  // for setting up main's environment (e.g. running static ctors), but in our
+  // case the host executor will have taken care of non-JIT ctors, and ORC will
+  // take care of JIT'd ones. To avoid a missing symbol error we can just
+  // point __main at this noop function.
+  return 0;
+}
+
 int runOrcJIT(const char *ProgName) {
   // Start setting up the JIT environment.
 
@@ -989,6 +999,16 @@ int runOrcJIT(const char *ProgName) {
                                                       Mangle));
   }
 
+  // If this is a Mingw or Cygwin executor then we need to alias __main to
+  // orc_rt_int_void_return_0.
+  if (J->getTargetTriple().isOSCygMing())
+    ExitOnErr(J->getProcessSymbolsJITDylib()->define(
+        orc::absoluteSymbols({
+            { J->mangleAndIntern("__main"),
+              { orc::ExecutorAddr::fromPtr(mingw_noop_main),
+                JITSymbolFlags::Exported } }
+          })));
+
   // Regular modules are greedy: They materialize as a whole and trigger
   // materialization for all required symbols recursively. Lazy modules go
   // through partitioning and they replace outgoing calls with reexport stubs

If that works then I'll commit it.

Thanks, that does indeed solve most of the issue!

In a handful of tests, I'm still seeing failures - I get undefined references to __chkstk_ms. Whenever a function allocates more than 4 KB of stack, the compiler generates a call to this function in the prologue. However this does happen in MSVC mode too (the function just is called __chkstk instead of __chkstk_ms), and as these tests do run successfully in MSVC mode, I'm not quite sure what differs here. I also tried to change the -mtriple= parameter as passed to lli in the test, to the same as in the MSVC case where the test passes, and it still fails in the same way then. Conversely, an MSVC-built lli.exe runs the test just fine even if I set the -mtriple= parameter to that in the mingw case. Any hints on how to look closer into what's going on here? (E.g. any options for just spitting out what code it generated instead of trying to link it, to see if it's a matter of generating the wrong code, or failing to handle some case in the runtime linking?)

Conversely, an MSVC-built lli.exe runs the test just fine even

You could check what symbols this lli.exe exports. It might have both, __chkstk and __chkstk_ms, so that they are resolved from the host process in the MSVC case? Same would be interesting for the MinGW case. I guess it exports neither of them?

Not sure yet what the correct solution would look like. Maybe feed one example into a static linker and see how it deals with that?

One more guess: Could this function possibly reside in a runtime library for MinGW? lli has --orc-runtime but it's not in use implicitly. As far as I know, we only support MSVC. Would MinGW support be interesting? Do we have anything in place? Maybe @sunho knows more about it. Thanks!

Conversely, an MSVC-built lli.exe runs the test just fine even

You could check what symbols this lli.exe exports. It might have both, __chkstk and __chkstk_ms, so that they are resolved from the host process in the MSVC case? Same would be interesting for the MinGW case. I guess it exports neither of them?

MSVC built executables don't export any symbols at all (they don't have a symbol table; they could have DLL-style exported symbols though, but this one doesn't). The MinGW built lli.exe does contain ___chkstk_ms in the symbol table.

Not sure yet what the correct solution would look like. Maybe feed one example into a static linker and see how it deals with that?

Are there any options I could inject here, to make it more verbose about what it does, to inspect the code generation vs linking cases here?

As the -mtriple= option didn't seem to make any difference, I also tried modifying sys::getDefaultTargetTriple() to return an MSVC style triple, but even then, the MinGW built lli.exe fails with references to the MinGW style named ___chckstk_ms symbol. So there's something in the Orc jit linking step that always goes for the MinGW style name here. What do I do to gain insight into what's actually going on in the Orc jit linking step?

One more guess: Could this function possibly reside in a runtime library for MinGW?

The function normally does reside in a runtime library, yes, either libgcc or compiler-rt builtins. However in the MSVC case, the corresponding __chkstk function also is required, and is provided by a toolchain runtime library in the same way. How do I see if the linking step does resolve that symbol, and from where?

lli has --orc-runtime but it's not in use implicitly. As far as I know, we only support MSVC. Would MinGW support be interesting? Do we have anything in place? Maybe @sunho knows more about it. Thanks!

I guess it might be interesting - but as you say, that's not happening implicitly for the MSVC case here either.

Also just to set the scope here; I'm primarily working towards having check-llvm (and similar for other tools) run successfully in MinGW configurations - there are holes to fill in in various components, so I can't take on full the "let's implement everything" role right now; I just want to either flag known failing cases as such, and for trivial cases (like this seems like it would be), fix it by figuring out what is differing between the MSVC and MinGW cases.

lhames added a comment.May 8 2023, 9:35 PM

MSVC built executables don't export any symbols at all (they don't have a symbol table; they could have DLL-style exported symbols though, but this one doesn't). The MinGW built lli.exe does contain ___chkstk_ms in the symbol table.

I think that we should export the symbols from lli. When compiler-rt is linked into JIT'd code we'll pick the definition from there (this is the Right Thing to do if we're trying to model a real link). When it compiler-rt isn't linked into JIT'd code (which it won't be in check-llvm tests) we'll fall back to the version in lli, which should be fine in practice.

Are there any options I could inject here, to make it more verbose about what it does, to inspect the code generation vs linking cases here?

As the -mtriple= option didn't seem to make any difference, I also tried modifying sys::getDefaultTargetTriple() to return an MSVC style triple, but even then, the MinGW built lli.exe fails with references to the MinGW style named ___chckstk_ms symbol. So there's something in the Orc jit linking step that always goes for the MinGW style name here. What do I do to gain insight into what's actually going on in the Orc jit linking step?

I don't think the link step is going to be relevant here: I'm pretty sure the difference is in X86TargetLowering::getStackProbeSymbolName:

// We need a stack probe to conform to the Windows ABI. Choose the right                                                                                                                                                                                                                               
// symbol.                                                                                                                                                                                                                                                                                             
if (Subtarget.is64Bit())
  return Subtarget.isTargetCygMing() ? "___chkstk_ms" : "__chkstk";

You could break on this function and check the return value to verify this.

I think lli takes the triple from the first input file. Does any of your IR have triples in it?

One more guess: Could this function possibly reside in a runtime library for MinGW?

The function normally does reside in a runtime library, yes, either libgcc or compiler-rt builtins. However in the MSVC case, the corresponding __chkstk function also is required, and is provided by a toolchain runtime library in the same way. How do I see if the linking step does resolve that symbol, and from where?

There's -debug-only=orc, but that's going to tell us (very verbosely) that it couldn't find the symbol in any of the places that It was told to look. In lli the places that it will look are (1) the set of input files, then (2) any extra archives, then (3) symbols exported by the process.

lli has --orc-runtime but it's not in use implicitly. As far as I know, we only support MSVC. Would MinGW support be interesting? Do we have anything in place? Maybe @sunho knows more about it. Thanks!

I guess it might be interesting - but as you say, that's not happening implicitly for the MSVC case here either.

Also just to set the scope here; I'm primarily working towards having check-llvm (and similar for other tools) run successfully in MinGW configurations - there are holes to fill in in various components, so I can't take on full the "let's implement everything" role right now; I just want to either flag known failing cases as such, and for trivial cases (like this seems like it would be), fix it by figuring out what is differing between the MSVC and MinGW cases.

Yeah -- Conceptually we want an ORC-runtime-lite embedded in LLVM to handle check-llvm cases (otherwise LLVM testing would depend on compiler-rt). In practice ORC-runtime-lite (especially when it comes to compiler-rt symbols) is usually going to involve either exporting the compile-rt symbol from the executable, pointing the symbol at abort (or something like it) if it's not actually intended to be used, or writing a simple implementation in LLVM's ORC library.

MSVC built executables don't export any symbols at all (they don't have a symbol table; they could have DLL-style exported symbols though, but this one doesn't). The MinGW built lli.exe does contain ___chkstk_ms in the symbol table.

I think that we should export the symbols from lli. When compiler-rt is linked into JIT'd code we'll pick the definition from there (this is the Right Thing to do if we're trying to model a real link). When it compiler-rt isn't linked into JIT'd code (which it won't be in check-llvm tests) we'll fall back to the version in lli, which should be fine in practice.

I see here that this succeeds since it finds an exported __chkstk symbol in kernelbase.dll that happens to be loaded. (Normally, nothing ever imports the __chkstk symbol from another DLL, it's always a statically linked minimal helper function.) So that explains why it works with the __chkstk name but not for __chkstk_ms.

I guess making lli export symbols for them to be found at runtime, might be one solution, let me think about that a bit...

Are there any options I could inject here, to make it more verbose about what it does, to inspect the code generation vs linking cases here?

As the -mtriple= option didn't seem to make any difference, I also tried modifying sys::getDefaultTargetTriple() to return an MSVC style triple, but even then, the MinGW built lli.exe fails with references to the MinGW style named ___chckstk_ms symbol. So there's something in the Orc jit linking step that always goes for the MinGW style name here. What do I do to gain insight into what's actually going on in the Orc jit linking step?

I don't think the link step is going to be relevant here: I'm pretty sure the difference is in X86TargetLowering::getStackProbeSymbolName:

// We need a stack probe to conform to the Windows ABI. Choose the right                                                                                                                                                                                                                               
// symbol.                                                                                                                                                                                                                                                                                             
if (Subtarget.is64Bit())
  return Subtarget.isTargetCygMing() ? "___chkstk_ms" : "__chkstk";

You could break on this function and check the return value to verify this.

I think lli takes the triple from the first input file. Does any of your IR have triples in it?

Yes, that code obviously is what picks the name. The IR doesn't have any triples in it - I'm running into this with e.g. https://github.com/llvm/llvm-project/blob/main/llvm/test/ExecutionEngine/MCJIT/test-loadstore.ll. When running without -jit-kind=mcjit, lli seems to ignore the triple specified with the -mtriple option, and it also doesn't react to changes to sys::getDefaultTargetTriple(). So where is the triple set that is used for code generation, when running that testcase with Orc? (I'm curious about where it picks up the msvc vs mingw difference, if it's not from either of the -mtriple option or sys:: getDefaultTargetTriple().)

lhames added a comment.May 9 2023, 4:35 PM

I see here that this succeeds since it finds an exported __chkstk symbol in kernelbase.dll that happens to be loaded. (Normally, nothing ever imports the __chkstk symbol from another DLL, it's always a statically linked minimal helper function.) So that explains why it works with the __chkstk name but not for __chkstk_ms.

I guess making lli export symbols for them to be found at runtime, might be one solution, let me think about that a bit...

Sounds good. We could be more selective and only add the addresses for a known set of functions to the JIT (this is similar to the "ORC-runtime-lite" idea, but applied to the runtime libraries. It's a bit a hack, but an easy and practical one if it works).

Are there any options I could inject here, to make it more verbose about what it does, to inspect the code generation vs linking cases here?

As the -mtriple= option didn't seem to make any difference, I also tried modifying sys::getDefaultTargetTriple() to return an MSVC style triple, but even then, the MinGW built lli.exe fails with references to the MinGW style named ___chckstk_ms symbol. So there's something in the Orc jit linking step that always goes for the MinGW style name here. What do I do to gain insight into what's actually going on in the Orc jit linking step?

I don't think the link step is going to be relevant here: I'm pretty sure the difference is in X86TargetLowering::getStackProbeSymbolName:

// We need a stack probe to conform to the Windows ABI. Choose the right                                                                                                                                                                                                                               
// symbol.                                                                                                                                                                                                                                                                                             
if (Subtarget.is64Bit())
  return Subtarget.isTargetCygMing() ? "___chkstk_ms" : "__chkstk";

You could break on this function and check the return value to verify this.

I think lli takes the triple from the first input file. Does any of your IR have triples in it?

Yes, that code obviously is what picks the name. The IR doesn't have any triples in it - I'm running into this with e.g. https://github.com/llvm/llvm-project/blob/main/llvm/test/ExecutionEngine/MCJIT/test-loadstore.ll. When running without -jit-kind=mcjit, lli seems to ignore the triple specified with the -mtriple option, and it also doesn't react to changes to sys::getDefaultTargetTriple(). So where is the triple set that is used for code generation, when running that testcase with Orc? (I'm curious about where it picks up the msvc vs mingw difference, if it's not from either of the -mtriple option or sys:: getDefaultTargetTriple().)

Oh -- It's using sys::getProcessTriple() rather than sys::getDefaultTargetTriple(). I think the failure to honor -mtriple should be treated as a bug here -- I'll add a patch to fix this.

The IR doesn't have any triples in it - I'm running into this with e.g. https://github.com/llvm/llvm-project/blob/main/llvm/test/ExecutionEngine/MCJIT/test-loadstore.ll. When running without -jit-kind=mcjit, lli seems to ignore the triple specified with the -mtriple option, and it also doesn't react to changes to sys::getDefaultTargetTriple(). So where is the triple set that is used for code generation, when running that testcase with Orc? (I'm curious about where it picks up the msvc vs mingw difference, if it's not from either of the -mtriple option or sys:: getDefaultTargetTriple().)

Oh -- It's using sys::getProcessTriple() rather than sys::getDefaultTargetTriple().

Ah, I see - thanks, that explains it.

I think the failure to honor -mtriple should be treated as a bug here -- I'll add a patch to fix this.

Maybe, but do note that there's a subtle existing tweak for MCJIT, it uses triples like x86_64-pc-windows-msvc-elf, set up here: https://github.com/llvm/llvm-project/blob/main/llvm/test/lit.cfg.py#L96-L102 If we start honoring that when not using -jit-kind=mcjit, the tests that run with Orc end up doing and testing a totally different thing, I'm afraid...

…or http://45.33.8.238/win/78404/step_11.txt, which might load faster.

I've reverted it in b21909c0fc6adc62c2077037fcc8eec5f7cdbdd7 for now, as the win bots are red due to multiple stacked failures and it'd be good to have a green build before the next regression :)

I think the failure to honor -mtriple should be treated as a bug here -- I'll add a patch to fix this.

Maybe, but do note that there's a subtle existing tweak for MCJIT, it uses triples like x86_64-pc-windows-msvc-elf, set up here: https://github.com/llvm/llvm-project/blob/main/llvm/test/lit.cfg.py#L96-L102 If we start honoring that when not using -jit-kind=mcjit, the tests that run with Orc end up doing and testing a totally different thing, I'm afraid...

Thanks for pointing this out. I'll sink that argument down into the MCJIT tests.

For the new tests: Do Cygwin / MinGW use ELF or COFF by default? COFF support has improved a lot, so if that's the default we should try to just use it.

Maybe, but do note that there's a subtle existing tweak for MCJIT, it uses triples like x86_64-pc-windows-msvc-elf, set up here: https://github.com/llvm/llvm-project/blob/main/llvm/test/lit.cfg.py#L96-L102 If we start honoring that when not using -jit-kind=mcjit, the tests that run with Orc end up doing and testing a totally different thing, I'm afraid...

Thanks for pointing this out. I'll sink that argument down into the MCJIT tests.

For the new tests: Do Cygwin / MinGW use ELF or COFF by default? COFF support has improved a lot, so if that's the default we should try to just use it.

Cygwin and MinGW use COFF, so they should get whatever behavior you have for the MSVC targets.

If you have a tentative patch for sinking that argument into the MCJIT cases only, I could try taking that for a spin (together with reapplying the patch for honoring -mtriple) with both MSVC and MinGW environments before landing it.

mstorsjo abandoned this revision.Jul 28 2023, 1:21 PM

This wasn't needed in the end, the issue was solved differently.