This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Fix support for device as host
ClosedPublic

Authored by jdenny on Feb 27 2021, 10:33 AM.

Details

Summary

Without this patch, when the offload device is set to
omp_get_initial_device(), the runtime fails with an error diagnostic
when entering target regions or target data regions.

However, OpenMP 5.1, sec. 2.14.5 "target Construct", "Restrictions",
p. 203, L3-5 states:

The device clause expression must evaluate to a non-negative integer
value that is less than or equal to the value of
omp_get_num_devices().

Sec. 3.7.7 "omp_get_initial_device", p. 412, L2-3 states:

The value of the device number is the value returned by the
omp_get_num_devices routine.

Similarly, OpenMP 5.0, sec. 2.12.5 "target Construct", "Restrictions",
p. 174 L30-32 states:

The device clause expression must evaluate to a non-negative integer
value less than the value of omp_get_num_devices() or to the value
of omp_get_initial_device().

This patch fixes this behavior by changing the runtime to behave as if
offloading is disabled whenever it finds the offload device (either
from a device clause or the default device) is set to the host
device. In the case of mandatory offloading when
omp_get_num_devices() == 0, it incorporates the behavior proposed
for OpenMP 5.2 in OpenMP spec github issue 2669.

Diff Detail

Event Timeline

jdenny created this revision.Feb 27 2021, 10:33 AM
jdenny requested review of this revision.Feb 27 2021, 10:33 AM
grokos added a comment.EditedMar 1 2021, 9:16 AM

This is a correct observation. The host device is not a target device, so there is no offloading available.

However, in all __tgt_target_* functions we call CheckDeviceAndCtors() which in turn calls device_is_ready(). The latter checks whether the device ID is greater than or equal to the number of available devices. If it is, then it returns false to CheckDeviceAndCtors() which ultimately returns OFFLOAD_FAIL to the caller __tgt_target_* function. Why doesn't the existing implementation work? What extra checks is this patch adding?

jdenny added a comment.Mar 1 2021, 9:42 AM

In every case where this patch inserts a check, there is code immediately following it that calls HandleTargetOutcome(false, loc);, usually because CheckDeviceAndCtors(device_id) != OFFLOAD_SUCCESS. The inserted check prevents that from code from executing. Is that what you're asking?

grokos accepted this revision.Mar 1 2021, 10:01 AM

OK, clear. You are trying to avoid the call to HandleTargetOutcome(false,...) which erroneously says that offloading has failed. LGTM.

This revision is now accepted and ready to land.Mar 1 2021, 10:01 AM

If OMP_TARGET_OFFLOAD=mandatory it will not fail but just return

jdenny added a comment.Mar 1 2021, 2:01 PM

If OMP_TARGET_OFFLOAD=mandatory it will not fail but just return

Are you suggesting that's incorrect behavior when offloading to host?

OpenMP 5.1 sec. 6.17 "OMP_TARGET_OFFLOAD" says:

The mandatory value specifies that program execution is terminated if a device construct or
device memory routine is encountered and the device is not available or is not supported by the
implementation.

I assumed that host is always considered available and supported.

This is where we are heading towards in OpenMP spec
The mandatory value specifies that program execution is terminated if a device construct or
device memory routine is encountered and the device is not available or is not supported by the
implementation. or if omp_get_num_devices() would return 0.

jdenny added a comment.Mar 1 2021, 3:04 PM

Thanks for pointing that out (OpenMP spec issue 2669). I wasn't aware it was changing. I'll wait until that settles.

jdenny updated this revision to Diff 327441.Mar 2 2021, 6:58 AM

In anticipation of the spec change, I've updated the patch to check the omp_get_num_devices() == 0 case. A few questions:

  • Does it match your understanding of what's being proposed? I realize it's possible that will change, so I'm not suggesting we push this yet.
  • Is it compatible with OpenMP 5.1 or 5.0? Should we skip the check in those cases?
  • The modified functions all have a common prologue. In a separate patch, I think that most of that should be moved to CheckDeviceAndCtors. Agreed?
jdenny added a comment.Mar 2 2021, 7:00 AM

One more question: How do I test the new check? For cuda offload, I can set CUDA_VISIBLE_DEVICES to an empty string, but I haven't found a general way.

grokos added a comment.Mar 3 2021, 9:02 AM
  • The modified functions all have a common prologue. In a separate patch, I think that most of that should be moved to CheckDeviceAndCtors. Agreed?

Sure!

One more question: How do I test the new check? For cuda offload, I can set CUDA_VISIBLE_DEVICES to an empty string, but I haven't found a general way.

I don't think there is a general way to do this. An ugly hack (just for testing purposes) would be to move all plugin libraries (i.e. libomptarget.rtl.x86_64.so, libomptarget.rtl.cuda.so etc) to a directory where libomptarget cannot find them, then it will appear as if no supported devices are available in the system.

jdenny updated this revision to Diff 327889.Mar 3 2021, 12:39 PM
jdenny edited the summary of this revision. (Show Details)
  • Rebased.
  • Added test for mandatory offload with omp_get_num_devices() == 0.
  • Added comments pointing to OpenMP spec issue 2669.
  • The modified functions all have a common prologue. In a separate patch, I think that most of that should be moved to CheckDeviceAndCtors. Agreed?

Sure!

Will do after this one.

One more question: How do I test the new check? For cuda offload, I can set CUDA_VISIBLE_DEVICES to an empty string, but I haven't found a general way.

I don't think there is a general way to do this. An ugly hack (just for testing purposes) would be to move all plugin libraries (i.e. libomptarget.rtl.x86_64.so, libomptarget.rtl.cuda.so etc) to a directory where libomptarget cannot find them, then it will appear as if no supported devices are available in the system.

Good to know for manual testing. For the test suite, I've added testing just for cuda.

Thanks.

This revision was landed with ongoing or failed builds.Mar 4 2021, 9:13 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2021, 9:13 AM