This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Allow exceptions in target regions when offloading to GPUs
ClosedPublic

Authored by AntonRydahl on Jun 27 2023, 2:46 PM.

Details

Summary

The motivation for this patch is that many code bases use exception handling. As GPUs are not expected to support exception handling in the near future, we can experiment with compiling the code for GPU targets anyway. This will
allow us to run the code, as long as no exception is thrown.

The overall idea is very simple:

  • If a throw expression is compiled to AMDGCN or NVPTX, it is replaced with a trap during code generation.
  • If a try/catch statement is compiled to AMDGCN or AMDHSA, we ganerate code for the try statement as if it were a basic block.

While I have almost not modified any code, I have made tests that verify that we still do not allow compilation of exceptions, unless the user explicitly enables -fcxx-exceptions or -fexceptions.

Please let me know what you think of this patch and if the warnings could be improved.

Example

With this patch, the compilation of the following example

{C++}
#include <iostream>
#pragma omp declare target
int gaussian_sum(int a,int b){
	if ((a + b) % 2 == 0) {throw -1;};
	return (a+b) * ((a+b)/2);
}
#pragma omp end declare target

int main(void) {
	int gauss = 0;
	#pragma omp target map(from:gauss)
	{
		try {
			gauss = gaussian_sum(1,100);
		}
		catch (int e){
			gauss = e;
		}
	}
	std::cout << "GaussianSum(1,100)="<<gauss<<std::endl;
        #pragma omp target map(from:gauss)
        {
                try {
                     	gauss = gaussian_sum(1,101);
                }
                catch (int e){
                        gauss = e;
                }
        }
	std::cout << "GaussianSum(1,101)="<<gauss<<std::endl;
	return (gauss > 1) ? 0 : 1;
}

with offloading to gfx906 results in

{bash}
./bin/target_try_minimal_fail      
GaussianSum(1,100)=5050
AMDGPU fatal error 1: Received error in queue 0x155555506000: HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception.
zsh: abort (core dumped)

Issues

The patch make 11 of the tests from clang/test/OpenMP fail. It seems to be related only with the values of Opts.Exceptions and Opts.CXXExceptions. I have tested that this change alone breaks the aforementioned tests. It would
be nice if somebody with a better knowledge of Clang and NVPTX would help me understand to what degree I have done something wrong, and when, if in any cases, it would be allowed to modify the tests.

Diff Detail

Event Timeline

AntonRydahl created this revision.Jun 27 2023, 2:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 2:46 PM
AntonRydahl requested review of this revision.Jun 27 2023, 2:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 2:46 PM
AntonRydahl edited the summary of this revision. (Show Details)Jun 27 2023, 2:48 PM
jdoerfert added inline comments.Jun 27 2023, 8:27 PM
clang/include/clang/Basic/DiagnosticCommonKinds.td
437

Check the style of other messages, they have a single sentence and start with a lower case letter.
Also, the explanation doesn't make sense for users. They don't know about traps, basic blocks, etc.
Maybe:
target '%0' does not support exception handling; 'throw' is assumed to be never reached
target '%0' does not support exception handling; 'catch' block is ignored
and nothing for try.
Finally, these need a group so users can also disable them.

clang/lib/CodeGen/CGException.cpp
12

unrelated, please add back

460–461

You should be able to return after the EmitTrap to simplify the code.

625

Check the LLVM style guide, or the surrounding code, for variable naming.

clang/lib/Frontend/CompilerInvocation.cpp
3837 ↗(On Diff #535148)

I don't get this. Is this needed?

clang/test/OpenMP/target_throw_message.cpp
6 ↗(On Diff #535148)

It is very suspicious that you cannot use cc1 and verify. Also, various options are odd.
I don't think we want the driver here, but we want cc1. We need to find out why this is not working.

Updated the tests to use %clang_cc1 -verify. There are still things that need to be improved.

AntonRydahl marked 2 inline comments as done.Aug 2 2023, 4:20 PM
AntonRydahl added inline comments.
clang/lib/CodeGen/CGException.cpp
625

Is it better now? Variables have to use camel case and start with an uppercase character, right?

clang/lib/Frontend/CompilerInvocation.cpp
3837 ↗(On Diff #535148)

The way it is right now before this patch, we are overwriting the users command line arguments. So -fexceptions and fcxx-exceptions will have no effect on the device. But we could introduce a new command line option instead.

AntonRydahl added inline comments.Aug 2 2023, 4:23 PM
clang/include/clang/Basic/DiagnosticCommonKinds.td
437

Is the group name ok? Now the warnings can be enabled with -Wopenmp-target-exception and disabled with -Wno-openmp-target-exception.

jdoerfert retitled this revision from Allowing exception handling in OpenMP target regions when offloading to AMDGCN or NVPTX targets to [OpenMP] Allow exceptions in target regions when offloading to GPUs.Aug 2 2023, 5:39 PM

Avoided changing clang/lib/Frontend/CompilerInvocation.cpp by performing checks for OpenMP offloading to GPU devices in Sema instead.

Rebased on upstream LLVM to check if flang CI is still affected by this
commit.

AntonRydahl marked an inline comment as done.Aug 18 2023, 10:00 AM

CI tests failed on the built-bot, but that was not related to this patch. I patched the file that made the built-bot fail and rebased this differential after the patch.

I think this is basically done, one question though.

clang/test/OpenMP/amdgpu_exceptions.cpp
12

Can we use /dev/null? Do other tests use it? I would expect -analyze or sth instead.

jhuber6 added inline comments.Aug 19 2023, 1:01 PM
clang/test/OpenMP/amdgpu_exceptions.cpp
12

There's other tests, but I think that requires the shell or linux.

Moved emission of warnings from CodeGen to Sema to allow using -verify -analyze.

AntonRydahl marked 2 inline comments as done.Aug 28 2023, 1:01 PM
AntonRydahl added inline comments.
clang/test/OpenMP/amdgpu_exceptions.cpp
12

Thanks for the good point! It did not work with -analyze because I emitted the warnings during code generation. Now the warnings are emitted during semantic analysis, and -analyze can replace -o - &> /dev/null.

jdoerfert accepted this revision.Aug 28 2023, 1:03 PM

LG, see below.

clang/lib/Sema/SemaExprCXX.cpp
867–872
clang/lib/Sema/SemaStmt.cpp
4474–4479
This revision is now accepted and ready to land.Aug 28 2023, 1:03 PM
AntonRydahl marked 2 inline comments as done.

Replaced instances of const llvm::Triple T with const llvm::Triple &T.

AntonRydahl marked 2 inline comments as done.Aug 28 2023, 1:34 PM
AntonRydahl added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
867–872

Good point! They should of course both be aliases.

Rebased on main to see if libc++ CI is still failing.

The libcxx tests are always broken randomly in my experience. I wouldn't worry about it.

The libcxx tests are always broken randomly in my experience. I wouldn't worry about it.

Thanks a bunch! I did not know that. Shall I just land the patch?

AntonRydahl reopened this revision.Aug 29 2023, 9:38 AM
This revision is now accepted and ready to land.Aug 29 2023, 9:38 AM

After landing this patch, two of the unit tests broke aarch64 and Windows build bots. To avoid that, // REQUIRES: was added to the unit tests to ensure that they only run when the target triple is supported.

This revision was landed with ongoing or failed builds.Aug 29 2023, 3:07 PM
This revision was automatically updated to reflect the committed changes.
AntonRydahl reopened this revision.Aug 29 2023, 4:32 PM
This revision is now accepted and ready to land.Aug 29 2023, 4:32 PM

Updating the line numbers in the unit tests to match the addition of // REQUIRES:

This revision was landed with ongoing or failed builds.Aug 30 2023, 9:39 AM
This revision was automatically updated to reflect the committed changes.
aeubanks added inline comments.
clang/test/OpenMP/amdgpu_exceptions.cpp
10

I believe tests using -analyze need REQUIRES: staticanalyzer

hans added a subscriber: hans.Aug 31 2023, 12:38 AM
hans added inline comments.
clang/test/OpenMP/amdgpu_exceptions.cpp
10

Right, otherwise these fail in builds confiugured with -DCLANG_ENABLE_STATIC_ANALYZER=OFF.

Adding the requirement in 1968f0d7981df2d508c7c862d875b115837208b3.

Thanks a lot for pointing this out and for fixing my mistakes, Arthur and Hans. I am sorry for the inconvenience!

thakis added a subscriber: thakis.Sep 11 2023, 2:13 PM
thakis added inline comments.
clang/test/OpenMP/amdgpu_throw_trap.cpp
4

This test fails if X86 isn't in LLVM_TARGETS_TO_BUILD and the host system is some non-x86 system (e.g. arm64).

(This is the only test in check-clang that fails then.)

Should this test grow a REQUIRES: x86-registered-target? Should it use %itanium_abi_triple instead of x86_64-pc-linux-gnu? (It seems to pass when replacing x86_64-pc-linux-gnu with %itanium_abi_triple on my arm mac.)

aeubanks added inline comments.Sep 11 2023, 2:30 PM
clang/test/OpenMP/amdgpu_throw_trap.cpp
4

added x86-registered-target in 238a1ef44f4f2361205e538b3cb7ebc5ec70894d

thakis added inline comments.Sep 11 2023, 2:32 PM
clang/test/OpenMP/amdgpu_throw_trap.cpp
4

Is that better than %itanium_abi_triple?

Thanks for fixing my tests! This is the second time I messed up the requirements. I am very sorry about that!

aeubanks added inline comments.Sep 11 2023, 2:57 PM
clang/test/OpenMP/amdgpu_throw_trap.cpp
4

I was worried about LLVM failing if the calculated %itanium_abi_triple wasn't supported in that build of LLVM, but TIL that clang/LLVM can handle triples it doesn't recognize all the way until the codegen phase. But IIUC optimizations can change depending on whether or not LLVM recognizes the triple so it's still a little inconsistent.

so yeah %itanium_abi_triple would probably work, but it seems susceptible to configuration differences

thakis added inline comments.Sep 11 2023, 3:03 PM
clang/test/OpenMP/amdgpu_throw_trap.cpp
4

In return, the test then runs on arm machines if you have the x86 target disabled.

But I'm happy as-is too, and to be honest with x86 target disabled so many tests don't run that it can't _really_ be used for development anyways 😛