This is an archive of the discontinued LLVM Phabricator instance.

[flang][runtime] Avoid dependency on libc++ for `std::__libcpp_verbose_abort`
ClosedPublic

Authored by mmuetzel on Aug 27 2023, 11:38 AM.

Details

Summary

Changes in libc++ during the development cycle for LLVM 17 lead to the FortranRuntime library depending on libc++.

Trying to build with Flang 17 that was built with clang++ 17 and libc++ 17 (on MinGW) leads to the following linker error:

ld.lld: error: undefined symbol: std::__1::__libcpp_verbose_abort(char const*, ...)
>>> referenced by libFortranRuntime.a(io-api.cpp.obj):(std::__1::__throw_bad_variant_access[abi:v170000]())
>>> referenced by libFortranRuntime.a(io-stmt.cpp.obj)
>>> referenced by libFortranRuntime.a(unit.cpp.obj)

That might be caused by std::get being called on a std::variant in common::visit.

std::__libcpp_verbose_abort is a weak symbol in libc++ that can be optionally replaced by an
alternative definition in user code (see: [1])

Do that to avoid a dependency of the FortranRuntime on libc++.

[1]: https://libcxx.llvm.org/UsingLibcxx.html#overriding-the-default-termination-handler

See also: https://github.com/msys2/MINGW-packages/pull/18002#issuecomment-1694412640

Diff Detail

Event Timeline

mmuetzel created this revision.Aug 27 2023, 11:38 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 27 2023, 11:38 AM
Herald added a subscriber: mstorsjo. · View Herald Transcript
mmuetzel requested review of this revision.Aug 27 2023, 11:38 AM

Unfortunately, I don't think this fix can go in in this form. We can't go on defining C++ standard symbols here. And if we'd be doing this, this is only the libc++ specific symbol, while libstdc++ or STL would have a different symbol.

The main question is where this reference comes from. As far as I'm aware, the flang runtime isn't supposed to rely on the C++ standard library. Where does the reference come from? And why doesn't this turn out to be an issue on other platforms?

I might be misunderstanding how that is supposed to work. But according to this, users are meant to be able to override this specific symbol from libc++:
https://libcxx.llvm.org/UsingLibcxx.html#overriding-the-default-termination-handler

I might be misunderstanding how that is supposed to work. But according to this, users are meant to be able to override this specific symbol from libc++:
https://libcxx.llvm.org/UsingLibcxx.html#overriding-the-default-termination-handler

Right, so this specific symbol is something that user code is allowed to override, when using libc++. I'm quite uneasy in having C++-library-agnostic code defining functions specific for one C++ library implementation. (Yes, admittedly, defining this if using another C++ standard library is probably harmless.)

The main question to me is that, as far as I know, the flang runtime isn't supposed to rely on the C++ runtime. Right?

Yet there is something in the C++ headers that generates this reference to __libcpp_verbose_abort in that case. As start to figuring out where this dependency creeps in - I see in the error message >>> referenced by libFortranRuntime.a(io-api.cpp.obj):(std::__1::__throw_bad_variant_access[abi:v170000]()) (and a couple other object files). So, what produces the calls to __throw_bad_variant_access? I guess this is generated from header inline code. From grepping libc++ headers, I find calls to __throw_bad_variant_access in the <variant> header - and the flang runtime indeed seems to be using <variant>.

So the situation is that the flang runtime isn't supposed to depend on the standard C++ library - yet the flang runtime uses <variant> from the C++ standard library - I guess this is done while hoping that this ends up as all inline code not producing any references to symbols that would need to be linked from the C++ library. But I presume that no C++ standard library gives any guarantees about what things you can or cannot use from headers without actually linking to the C++ library. And the flang runtime should work with any of the C++ standard libraries anyway.

I wonder why this isn't an issue in other cases. Perhaps it will be when more toolchains upgrade to a new enough version of libc++? For those cases, I guess this is a somewhat harmless workaround - but IMO it's just a brittle workaround for the flang runtime using parts of the C++ library while not wanting to link against it.

mmuetzel updated this revision to Diff 553890.Aug 28 2023, 4:29 AM

It looks like there are some protections in place that prohibit the user to add functions to the std namespace that haven't been declared in the standard headers.

I hope that checking whether _LIBCPP_AVAILABILITY_HAS_NO_VERBOSE_ABORT is defined is a good condition to prevent that from happening.

I don't know why this general issue isn't happening in other cases.

Would it help to add a TODO comment that this is just a work-around for the more general issue?

Still fails in the CI with:

[1606/4861] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/io-api.cpp.o
FAILED: tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/io-api.cpp.o
CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/clang++ -DBUILD_EXAMPLES -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildkite-agent/builds/llvm-project/build/tools/flang/runtime -I/var/lib/buildkite-agent/builds/llvm-project/flang/runtime -I/var/lib/buildkite-agent/builds/llvm-project/flang/include -I/var/lib/buildkite-agent/builds/llvm-project/build/tools/flang/include -I/var/lib/buildkite-agent/builds/llvm-project/build/include -I/var/lib/buildkite-agent/builds/llvm-project/llvm/include -isystem /var/lib/buildkite-agent/builds/llvm-project/llvm/../mlir/include -isystem /var/lib/buildkite-agent/builds/llvm-project/build/tools/mlir/include -isystem /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/include -isystem /var/lib/buildkite-agent/builds/llvm-project/llvm/../clang/include -gmlt -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion           -Wcovered-switch-default -Wno-nested-anon-types -fno-lto -O3 -DNDEBUG   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -UNDEBUG -std=c++17  -fno-exceptions -funwind-tables -fno-rtti -MD -MT tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/io-api.cpp.o -MF tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/io-api.cpp.o.d -o tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/io-api.cpp.o -c /var/lib/buildkite-agent/builds/llvm-project/flang/runtime/io-api.cpp
/var/lib/buildkite-agent/builds/llvm-project/flang/runtime/io-api.cpp:1523:11: error: out-of-line definition of '__libcpp_verbose_abort' does not match any declaration in namespace 'std'
void std::__libcpp_verbose_abort(char const* format, ...) {
          ^~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Maybe, I misinterpreted that error.
Why does it reportedly work for MinGW?

I basically copied the example from the libc++ documentation. Does that documentation need an update?

Would it help to add a TODO comment that this is just a work-around for the more general issue?

I'd leave that up to Flang maintainers (I'm not really involved enough to have a say in that) whether this is a reasonable thing to do, or what the right way forward is.

Still fails in the CI with:

Maybe, I misinterpreted that error.
Why does it reportedly work for MinGW?

I basically copied the example from the libc++ documentation. Does that documentation need an update?

There's no guarantee that that CI instance runs with the latest libc++ - it may be running on an older installation of Clang/libc++ that doesn't have __libcpp_verbose_abort yet. Or it might be running on top of libstdc++?

mmuetzel updated this revision to Diff 553893.EditedAug 28 2023, 5:03 AM

Looking at the history of libcxx/include/__verbose_abort the macro _LIBCPP_VERBOSE_ABORT was introduced in the same changeset that made std::__libcpp_verbose_abort a weak symbol (from an inlined function).

That might be the point where that dependency of the FortranRuntime to libc++ first appeared.
Let's see if making this conditional on the same macro pleases the CI.

mmuetzel updated this revision to Diff 553899.Aug 28 2023, 5:21 AM

Seems to have worked.

Now with clang-format.

Adding some more flang people to have a look at this one. The issue is that the flang runtime is using the <variant> header from the C++ standard library, while the flang runtime generally doesn't want to have a link-time dependency on the C++ standard library, as far as I know. (At least when linking, the clang/flang driver that drives the link doesn't link in any C++ standard library for MinGW targets.)

With current libc++ headers, the use of <variant> ends up with an undefined reference to std::__libcpp_verbose_abort, which is something that libc++ itself provides but the user also is allowed to provide their own to override it.

Is it known which files in flang/runtime produce this dependency on __libcpp_verbose_abort()? Or which functions? The runtime never uses std::get() on a variant, so there's no chance of that going wrong.

If the dependency on __libcpp_verbose_abort() arises from just a few code sites, there's a good chance that we could adjust them to avoid the problem.

mmuetzel added a comment.EditedSep 1 2023, 8:57 AM

Is it known which files in flang/runtime produce this dependency on __libcpp_verbose_abort()? Or which functions? The runtime never uses std::get() on a variant, so there's no chance of that going wrong.

If the dependency on __libcpp_verbose_abort() arises from just a few code sites, there's a good chance that we could adjust them to avoid the problem.

I guess one of the places could be here (where _u is a std::variant afaict):
https://github.com/llvm/llvm-project/blob/1c35c1a73907a95ce54b5a0edca513591e2bc069/flang/runtime/io-stmt.h#L113-L122

Is it known which files in flang/runtime produce this dependency on __libcpp_verbose_abort()? Or which functions? The runtime never uses std::get() on a variant, so there's no chance of that going wrong.

If the dependency on __libcpp_verbose_abort() arises from just a few code sites, there's a good chance that we could adjust them to avoid the problem.

I guess one of the places could be here (where _u is a std::variant afaict):
https://github.com/llvm/llvm-project/blob/1c35c1a73907a95ce54b5a0edca513591e2bc069/flang/runtime/io-stmt.h#L113-L122

If so, then that's IMO a bug in whatever C++ compiler or STL that you're using, and a workaround would be justified.

Is there a chance that the call to __libcpp_verbose_abort() might be coming from the use of std::reference_wrapper<>::get() instead? I could imagine that an implementation of that member function might want to check for a bad or uninitialized internal pointer. If so, this usage of std::reference_wrapper<> could easily be changed into plain old pointers. (They are never null, and they would be C++ references (&) if those could be used in a std::variant<>.).

Also, if the problem is indeed originating from the use of std::reference_wrapper<> in io-stmt.h, they could be replaced with Fortran::common::Reference from flang/include/flang/Common/reference.h. I could throw a patch together for you to try if you think it might be of help to you.

Also, if the problem is indeed originating from the use of std::reference_wrapper<> in io-stmt.h, they could be replaced with Fortran::common::Reference from flang/include/flang/Common/reference.h. I could throw a patch together for you to try if you think it might be of help to you.

I'm unfortunately having trouble building LLVM currently. (Likely something local in my build environment.)
Maybe, @MehdiChinoune could test please?

If so, then that's IMO a bug in whatever C++ compiler or STL that you're using, and a workaround would be justified.

The problem showed up originally when trying to build Flang with clang++ and libc++ from LLVM 17-rc3 here:
https://github.com/msys2/MINGW-packages/pull/18002

Here's a patch that removes the usage of std::reference_wrapper from the runtime.

Last login: Fri Aug 25 09:59:14 on ttys000
pklausler@pklausler-mlt ~ % ssh rome1.pgi.net
pklausler@rome1.pgi.net's password: 
Last login: Fri Aug 25 09:58:00 2023 from 10.20.72.168
linux86-64 module support enabled using version 'latest'.
rome1 ~ λ scr


















diff --git a/flang/runtime/io-stmt.h b/flang/runtime/io-stmt.h
index fa432d07a680..128c137dd0f0 100644
--- a/flang/runtime/io-stmt.h
+++ b/flang/runtime/io-stmt.h
@@ -16,6 +16,7 @@
 #include "format.h"
 #include "internal-unit.h"
 #include "io-error.h"
+#include "flang/Common/reference.h"
 #include "flang/Common/visit.h"
 #include "flang/Runtime/descriptor.h"
 #include "flang/Runtime/io-api.h"
@@ -200,39 +201,39 @@ public:
   }
 
 private:
-  std::variant<std::reference_wrapper<OpenStatementState>,
-      std::reference_wrapper<CloseStatementState>,
-      std::reference_wrapper<NoopStatementState>,
-      std::reference_wrapper<
+  std::variant<common::Reference<OpenStatementState>,
+      common::Reference<CloseStatementState>,
+      common::Reference<NoopStatementState>,
+      common::Reference<
           InternalFormattedIoStatementState<Direction::Output>>,
-      std::reference_wrapper<
+      common::Reference<
           InternalFormattedIoStatementState<Direction::Input>>,
-      std::reference_wrapper<InternalListIoStatementState<Direction::Output>>,
-      std::reference_wrapper<InternalListIoStatementState<Direction::Input>>,
-      std::reference_wrapper<
+      common::Reference<InternalListIoStatementState<Direction::Output>>,
+      common::Reference<InternalListIoStatementState<Direction::Input>>,
+      common::Reference<
           ExternalFormattedIoStatementState<Direction::Output>>,
-      std::reference_wrapper<
+      common::Reference<
           ExternalFormattedIoStatementState<Direction::Input>>,
-      std::reference_wrapper<ExternalListIoStatementState<Direction::Output>>,
-      std::reference_wrapper<ExternalListIoStatementState<Direction::Input>>,
-      std::reference_wrapper<
+      common::Reference<ExternalListIoStatementState<Direction::Output>>,
+      common::Reference<ExternalListIoStatementState<Direction::Input>>,
+      common::Reference<
           ExternalUnformattedIoStatementState<Direction::Output>>,
-      std::reference_wrapper<
+      common::Reference<
           ExternalUnformattedIoStatementState<Direction::Input>>,
-      std::reference_wrapper<ChildFormattedIoStatementState<Direction::Output>>,
-      std::reference_wrapper<ChildFormattedIoStatementState<Direction::Input>>,
-      std::reference_wrapper<ChildListIoStatementState<Direction::Output>>,
-      std::reference_wrapper<ChildListIoStatementState<Direction::Input>>,
-      std::reference_wrapper<
+      common::Reference<ChildFormattedIoStatementState<Direction::Output>>,
+      common::Reference<ChildFormattedIoStatementState<Direction::Input>>,
+      common::Reference<ChildListIoStatementState<Direction::Output>>,
+      common::Reference<ChildListIoStatementState<Direction::Input>>,
+      common::Reference<
           ChildUnformattedIoStatementState<Direction::Output>>,
-      std::reference_wrapper<
+      common::Reference<
           ChildUnformattedIoStatementState<Direction::Input>>,
-      std::reference_wrapper<InquireUnitState>,
-      std::reference_wrapper<InquireNoUnitState>,
-      std::reference_wrapper<InquireUnconnectedFileState>,
-      std::reference_wrapper<InquireIOLengthState>,
-      std::reference_wrapper<ExternalMiscIoStatementState>,
-      std::reference_wrapper<ErroneousIoStatementState>>
+      common::Reference<InquireUnitState>,
+      common::Reference<InquireNoUnitState>,
+      common::Reference<InquireUnconnectedFileState>,
+      common::Reference<InquireIOLengthState>,
+      common::Reference<ExternalMiscIoStatementState>,
+      common::Reference<ErroneousIoStatementState>>
       u_;
 };

The quoted patch doesn't make any difference.

I tested compiling flang/runtime/edit-output.cpp with -S -emit-llvm, and there are references to std::__1::__throw_bad_variant_access in e.g. Fortran::runtime::io::EditIntegerOutput<1>(Fortran::runtime::io::IoStatementState&, Fortran::runtime::io::DataEdit const&, Fortran::common::HostSignedIntTypeHelper<(8)*(1)>::type). Didn't try to look up which use in the runtime source this maps to though.

The quoted patch doesn't make any difference.

I tested compiling flang/runtime/edit-output.cpp with -S -emit-llvm, and there are references to std::__1::__throw_bad_variant_access in e.g. Fortran::runtime::io::EditIntegerOutput<1>(Fortran::runtime::io::IoStatementState&, Fortran::runtime::io::DataEdit const&, Fortran::common::HostSignedIntTypeHelper<(8)*(1)>::type). Didn't try to look up which use in the runtime source this maps to though.

Thanks for checking. It's too bad it didn't clear it up. I'll look at the integer output formatting routine and see whether anything stands out.

I might have misunderstood your prior comment. But didn't you write that calling std::get on a std::variant would lead to this function being pulled in?
Isn't that the case with the template I pointed to earlier?
In io-stmt.h:

template <typename A> A *get_if() const {
  return common::visit(
      [](auto &x) -> A * {
        if constexpr (std::is_convertible_v<decltype(x.get()), A &>) {
          return &x.get();
        }
        return nullptr;
      },
      u_);
}

where u_ is a std::variant.
IIUC, common::visit is from flang/Common/visit.h:

template <typename VISITOR, typename... VARIANT>
inline auto visit(VISITOR &&visitor, VARIANT &&...u)
    -> decltype(visitor(std::get<0>(std::forward<VARIANT>(u))...)) {
  using Result = decltype(visitor(std::get<0>(std::forward<VARIANT>(u))...));
  if constexpr (sizeof...(u) == 1) {
    static constexpr std::size_t high{
        (std::variant_size_v<std::decay_t<decltype(u)>> * ...) - 1};
    return Log2VisitHelper<0, high, Result>(std::forward<VISITOR>(visitor),
        u.index()..., std::forward<VARIANT>(u)...);
  } else {
    // TODO: figure out how to do multiple variant arguments
    return ::std::visit(
        std::forward<VISITOR>(visitor), std::forward<VARIANT>(u)...);
  }
}

Wouldn't that call std::get on a std::variant?

Wouldn't that call std::get on a std::variant?

Yes, but only as the argument of a decltype(), so it should not matter. Perhaps that's the root of the problem.

Perhaps replacing common::visit with std::visit might dodge the problem, although it might raise others.

mmuetzel edited the summary of this revision. (Show Details)Sep 2 2023, 2:17 AM

I updated the summary to hopefully better describe the information (or assumptions) that we found.

With that in mind, is the proposed change a reasonable workaround?

I updated the summary to hopefully better describe the information (or assumptions) that we found.

With that in mind, is the proposed change a reasonable workaround?

It is, if using std::visit() doesn't dodge the problem in your build environment.

Re-set my build environment and finally managed to get to that point locally.

Replacing common::visit with std::visit in IoStatementState::get_if doesn't make a difference for me. Still the same error when trying to link to the FortranRuntime:

ld.lld: error: undefined symbol: std::__1::__libcpp_verbose_abort(char const*, ...)
>>> referenced by libFortranRuntime.a(io-api.cpp.obj):(std::__1::__throw_bad_variant_access[abi:v170000]())
>>> referenced by libFortranRuntime.a(io-stmt.cpp.obj)
>>> referenced by libFortranRuntime.a(unit.cpp.obj)
flang: error: linker command failed with exit code 1 (use -v to see invocation)

Could this still be considered to be included in LLVM 17?

With the scheduled date for Phabricator being set to read-only approaching (Oct-1 afaict), should I open a PR on GitHub with the proposed change?

@klausler: Sorry for bothering. Are there still open points before this can be considered for merging?

@klausler: Sorry for bothering. Are there still open points before this can be considered for merging?

No.

klausler accepted this revision.Oct 20 2023, 10:15 AM
This revision is now accepted and ready to land.Oct 20 2023, 10:15 AM

@klausler: Sorry for bothering. Are there still open points before this can be considered for merging?

No.

Thanks for accepting the changes.