This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][mingw] Fix build for aarch64 target
ClosedPublic

Authored by alvinhochun on Nov 1 2022, 9:26 AM.

Details

Summary

This patch changes AArch64 + __GNUC__ to use __sync builtins to
implement internal atomic macros just like for Unix, because mingw-w64
is missing some of the intrinsics which the MSVC codepath is using.

Then some remaining intel-only functions are removed from dllexport to
fix linking.

This should fix https://github.com/llvm/llvm-project/issues/56349.

Diff Detail

Event Timeline

alvinhochun created this revision.Nov 1 2022, 9:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2022, 9:26 AM
alvinhochun requested review of this revision.Nov 1 2022, 9:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2022, 9:26 AM

Adding two more reviewers from MSVC who have recently worked on OpenMP.

natgla added inline comments.Nov 1 2022, 1:36 PM
openmp/runtime/src/kmp_os.h
621

For MSVC we compiled libomp only with MSVC, so the patch is a bit different. @branh will start upstreaming the changes later this month, they will be complimentary to this change.
Am I right that this patch will fix the issues only if compiled with gcc?
looks good to me.

FWIW I tested this patch for a mingw/aarch64 environment, and at least a very small testcase does seem to run as expected.

openmp/runtime/src/dllexports
1265

@natgla - What about this change - was this something which was missed/forgotten and/or just not yet upstreamed from your work on openmp on MSVC/arm64?

openmp/runtime/src/kmp_os.h
621

This patch fixes issues if compiled with clang in mingw mode - gcc doesn't support windows on aarch64 (yet).

natgla added inline comments.Nov 1 2022, 1:59 PM
openmp/runtime/src/dllexports
1265

this part wasn't broken for us - because libomp140 is still based on LLVM 11.

openmp/runtime/src/kmp_os.h
621

Is this: "defined(__ GNUC __)" for mingw then?

branh added inline comments.Nov 1 2022, 2:03 PM
openmp/runtime/src/dllexports
1265

We haven't yet implemented OpenMP 5.1 on MSVC. We're finishing up OpenMP 3.1 support now.

For OpenMP 3.1, we did change this file to allow the various flavors of __kmpc_atomic_<type>_<opt>_cpt in the OpenMP 3.1 section on arm64. LLVM inlines the atomic lock acquisition/etc. instead, and so doesn't need these calls, but we haven't implemented that in MSVC yet.

It looks like your change is *removing* declarations for certain _cpt and _cas functions on arm64. If you expected us to need these functions on MSVC/arm64, wouldn't the current state be what we need?

mstorsjo added inline comments.Nov 1 2022, 2:07 PM
openmp/runtime/src/dllexports
1265

Ah, I see, these functions were added after that patch (4fb0aaf03381473ec8af727edb4b5d59b64b0d60 / D101173) was upstreamed.

1265

Currently, linking of the OpenMP DLL fails, since this list declares that these functions are to be exported - but since there's no implementation of them for Windows/ARM64, linking fails. This part of the patch excludes them from being exported.

If it's intended to implement them later, I guess this should be a separate ifdef block, with a comment saying that this just is a temporary measure and that they are expected to be implemented later.

openmp/runtime/src/kmp_os.h
621

Yes, that's the correct compiler define to look for; clang identifies itself both as __GNUC__ and __clang__ on most platforms (except for MSVC targets) - it does that for mingw targets too.

natgla added inline comments.Nov 1 2022, 2:18 PM
openmp/runtime/src/kmp_os.h
621

For clang most of this file is using KMP_COMPILER_CLANG

mstorsjo added inline comments.Nov 1 2022, 2:20 PM
openmp/runtime/src/kmp_os.h
621

Sure - but that would limit the new codepaths to clang only, right? The used __sync_* intrinsics are GCC intrinsics that Clang also supports. If someone was to build this code for Windows on ARM64 with a future GCC that supports this target, this would be the right codepath to use in that case too (there's nothing clang-specific that GCC can't handle here).

Then again I don't know if this openmp implementation expects to support GCC overall at all?

natgla added inline comments.Nov 1 2022, 2:28 PM
openmp/runtime/src/kmp_os.h
621

that I don't know, will leave for other reviewers to comment on

alvinhochun edited the summary of this revision. (Show Details)

Changed to make AArch64 + Windows + GNUC reuse existing macros implemented with
__sync builtins instead of making a copy of them. This should be functionally
equivalent because mingw-w64 implements a lot of the Interlocked* functions
with the __sync builtins anyway.

alvinhochun added inline comments.
openmp/runtime/src/dllexports
1265

I excluded these functions based on D110109 (which implements them) stating they are x86/x86_64-only.

natgla added a comment.Nov 2 2022, 2:51 PM

FWIW I tested this patch for a mingw/aarch64 environment, and at least a very small testcase does seem to run as expected.

Did you try more tests, with different atomic pragmas?

natgla accepted this revision.Nov 2 2022, 2:53 PM
This revision is now accepted and ready to land.Nov 2 2022, 2:53 PM

FWIW I tested this patch for a mingw/aarch64 environment, and at least a very small testcase does seem to run as expected.

Did you try more tests, with different atomic pragmas?

Not yet; I’m working on setting up so I can run the lit tests (check-openmp) though.

FWIW I tested this patch for a mingw/aarch64 environment, and at least a very small testcase does seem to run as expected.

Did you try more tests, with different atomic pragmas?

Not yet; I’m working on setting up so I can run the lit tests (check-openmp) though.

I've finally managed to somehow cobble things together so I can run the lit tests for this configuration: it mostly seems to work, but there are a couple failures that I probably should look into (although I don't know if they're related to what's being changed here or whether it's a later issue.

Anyway, with x86_64 clang-cl/msvc mode, there's 4 legitimate test failures among the tests; with x86_64 mingw there's those 4 plus yet another one failing. With aarch64 mingw there's 9 more failing tests (but still 180 tests passing).

natgla added a comment.Nov 7 2022, 7:16 AM

I've finally managed to somehow cobble things together so I can run the lit tests for this configuration: it mostly seems to work, but there are a couple failures that I probably should look into (although I don't know if they're related to what's being changed here or whether it's a later issue.

Anyway, with x86_64 clang-cl/msvc mode, there's 4 legitimate test failures among the tests; with x86_64 mingw there's those 4 plus yet another one failing. With aarch64 mingw there's 9 more failing tests (but still 180 tests passing).

We never tested this configuration. I'd suggest to look at the additional failures, and if they are related to atomics, to investigate them (otherwise consider them to be a different issue). If you can share all logs, I'd be curious to look at the results too (it won't save you any time though)

I've finally managed to somehow cobble things together so I can run the lit tests for this configuration: it mostly seems to work, but there are a couple failures that I probably should look into (although I don't know if they're related to what's being changed here or whether it's a later issue.

Anyway, with x86_64 clang-cl/msvc mode, there's 4 legitimate test failures among the tests; with x86_64 mingw there's those 4 plus yet another one failing. With aarch64 mingw there's 9 more failing tests (but still 180 tests passing).

We never tested this configuration. I'd suggest to look at the additional failures, and if they are related to atomics, to investigate them (otherwise consider them to be a different issue). If you can share all logs, I'd be curious to look at the results too (it won't save you any time though)

I managed to sort out all the test failures - many of them were caused by using slight mismatches in version between the sources to build libomp.dll and for running tests, some caused by using a slightly older version of clang to build.

I'll go ahead and push this one then, and send patches for other issues I've sorted out. I'll be happy to CC you on those patches too.

This revision was automatically updated to reflect the committed changes.

I managed to sort out all the test failures - many of them were caused by using slight mismatches in version between the sources to build libomp.dll and for running tests, some caused by using a slightly older version of clang to build.

Sorry, I realized I had only been testing it successfully in a Debug build - when built in Release mode there still are a couple failures that I need to sort out. Anyway, I'll let you know if I find anything relevant, and post patches when I have it sorted out.

I managed to sort out all the test failures - many of them were caused by using slight mismatches in version between the sources to build libomp.dll and for running tests, some caused by using a slightly older version of clang to build.

Sorry, I realized I had only been testing it successfully in a Debug build - when built in Release mode there still are a couple failures that I need to sort out. Anyway, I'll let you know if I find anything relevant, and post patches when I have it sorted out.

I've sorted out these issues now - with one upcoming patch I'll send in a moment (plus the rest of the ones that are in review), all tests pass in the mingw/aarch64 configuration.

There's one test, worksharing/for/omp_for_schedule_guided.c, which seems to be fail spuriously though (both on aarch64, and on arm, which requires another patch), but mostly the tests do pass. I'm not sure if that one flaky tests do indicate potential issues with atomics somewhere, or what it is, since it's rather hard to debug when it works 4 times out of 5.