This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Protect unrecogonized CUDA error code
ClosedPublic

Authored by ye-luo on Sep 19 2020, 8:31 PM.

Details

Summary

If an error code can not be recognized by cuGetErrorString, errStr remains null and causes crashing at DP() printing.
Protect this case.

Diff Detail

Event Timeline

ye-luo created this revision.Sep 19 2020, 8:31 PM
ye-luo requested review of this revision.Sep 19 2020, 8:31 PM
ye-luo updated this revision to Diff 293001.Sep 19 2020, 8:44 PM

I've noticed this happens at the end of the program when Libomptarget is being unregistered. It'd be nice to know what causes this but checking the case is definitely better than crashing.

The root cause is a known issue and I put up a bug report to track the status.
https://bugs.llvm.org/show_bug.cgi?id=47595
Anyway, this patch should be sufficient for users at the moment.

jhuber6 accepted this revision.Sep 21 2020, 8:48 AM

LGTM

This revision is now accepted and ready to land.Sep 21 2020, 8:48 AM

I also suggest to check nullptr before using errStr.

I also suggest to check nullptr before using errStr.

I think the errStr is only NULL on when it returns CUDA_FAILURE so it would be redundant.

tianshilei1992 accepted this revision.Sep 21 2020, 9:04 AM

I also suggest to check nullptr before using errStr.

I think the errStr is only NULL on when it returns CUDA_FAILURE so it would be redundant.

You're right. I double checked the CUDA document and it supports your comment.

Hold a second. I'm exploring a bit more in the error message.

ye-luo updated this revision to Diff 293197.Sep 21 2020, 9:33 AM

After a bit more experiment, the return status of cuGetErrorString can be more than CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE.
In this particular case when the CUDA is deinitialized, the error code cannot be translated by cuGetErrorString any more.
So now only print errStr with CUDA_SUCCESS.
Treat CUDA_ERROR_INVALID_VALUE different from generic !=CUDA_SUCCESS

ye-luo updated this revision to Diff 293207.Sep 21 2020, 9:47 AM

Should be good to go now.

jhuber6 accepted this revision.Sep 21 2020, 9:49 AM

Should be a good solution for now.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 21 2020, 10:43 AM