This is an archive of the discontinued LLVM Phabricator instance.

[Clang][OpenMP] Fixed an issue that clang crashed when compiling OpenMP program in device only mode without host IR
ClosedPublic

Authored by tianshilei1992 on Jan 16 2021, 7:00 PM.

Details

Summary

D94745 rewrites the deviceRTLs using OpenMP and compiles it by directly
calling the device compilation. clang crashes because entry in
OffloadEntriesDeviceGlobalVar is unintialized. Current design supposes the
device compilation can only be invoked after host compilation with the host IR
such that clang can initialize OffloadEntriesDeviceGlobalVar from host IR.
This avoids us using device compilation directly, especially when we only have
code wrapped into declare target which are all device code. The same issue
also exists for OffloadEntriesInfoManager.

In this patch, we simply initialized an entry if it is not in the maps. Not sure
we need an option to tell the device compiler that it is invoked standalone.

Diff Detail

Event Timeline

tianshilei1992 created this revision.Jan 16 2021, 7:00 PM
tianshilei1992 requested review of this revision.Jan 16 2021, 7:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2021, 7:00 PM

Also updated for registerTargetRegionEntryInfo

tianshilei1992 edited the summary of this revision. (Show Details)Jan 17 2021, 11:20 AM
tianshilei1992 added a project: Restricted Project.

Here is a minimal reproducer for the issue even without standalone device compilation (and I can build the situation in various other ways as well): https://godbolt.org/z/T1h9b5
Can you add it as a test, I am also curious how the IR looks like with this patch.

Added a new test case for device only compilation

ABataev added inline comments.Jan 18 2021, 4:23 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

I would add a chack that to auxiliary device was specified. And if it was specified, it means this is not device-only mode and still need to emit an error.

jdoerfert accepted this revision.Jan 18 2021, 2:17 PM

LGTM, one nit in the test below.

clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

No it doesn't. There is nothing wrong with https://godbolt.org/z/T1h9b5, and as I said before, I can build the situation in various other ways as well, some of which will be outside of the users control. A global can exist in the host/device code only.

clang/test/OpenMP/declare_target_device_only_compilation.cpp
14

Run this as regular target offloading as well. no -fopenmp-is-device necessary, at least not in https://godbolt.org/z/T1h9b5.

This revision is now accepted and ready to land.Jan 18 2021, 2:17 PM

Oh, and fix the test. If you have -verify and no errors you need // expected-no-diagnostics

clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

I'm not saying that this is wrong. This code was used to check that the compiler works correctly and it just allows developer to understand that there is a problem with the compiler if it misses something and there is a difference between host and device codegens. If we don't want to emit an error here, still would be good to have something like an assert to be sure that the host/device codegens are synced.

tianshilei1992 marked 3 inline comments as done.Jan 18 2021, 6:28 PM
tianshilei1992 added inline comments.
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

That check still doesn't work for the test case provided by @jdoerfert because host IR doesn't contain that global in the offload info.

tianshilei1992 marked an inline comment as done.Jan 18 2021, 6:29 PM

Fixed the test case and rebased

Removed the useless assertion

jdoerfert added inline comments.Jan 18 2021, 8:59 PM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

As @tianshilei1992 says, my test case does show how this can never be an assertion/warning even for regular host+device compliation. There is no guarantee a host version exists, or a device one does. We need to gracefully allow either.

ABataev added inline comments.Jan 19 2021, 2:50 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

So, this is caused by the nohost variant selector, right? In this case we don't emit it for the host and don't have corresponding entry in the metadata?

tianshilei1992 added inline comments.Jan 19 2021, 8:14 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

Correct.

jdoerfert added inline comments.Jan 19 2021, 8:17 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

FWIW, I can also use ifdef, or various other context selectors to achieve the same effect.

ABataev added inline comments.Jan 19 2021, 8:19 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

Ok, now I see. Is it possible to distinguish this corner case from the general one, where the variable is available for both, host and device, or it will require some extra work/design? If it is easy to differentiate these 2 cases, it would be good to keep something like an error message/assert for the general case and remove the restriction for this corner case. Otherwise, proceed with the current solution.

2944–2948

Ok, then go ahead with this solution.

tianshilei1992 marked 6 inline comments as done.Jan 19 2021, 11:29 AM
tianshilei1992 added inline comments.
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

I'm thinking we can probably set a new option to invoke device only compilation, like --cuda-device-only we already had for CUDA. However, the thing is, unlike CUDA, we cannot tell which parts are kernels implicitly w/o host compilation. So if we provide the option, we might end up with only recognizing code in declare target or all its variant.

tianshilei1992 marked an inline comment as done.Jan 19 2021, 11:30 AM
ABataev added inline comments.Jan 19 2021, 11:39 AM
clang/lib/CodeGen/CGOpenMPRuntime.cpp
2944–2948

I think the host part is not really required here. IIRC, it is used currently only to check that the host code and the device code are synced and device part matches the host codegen. If we need device-only codegen, we can just turn off this check. At least, you can try.