This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][deviceRTLs] Fix wrong return value of `__kmpc_is_spmd_exec_mode`
ClosedPublic

Authored by tianshilei1992 on Oct 15 2021, 12:33 PM.

Details

Summary

D110279 introduced a bug to the device runtime. In __kmpc_parallel_51, we detect
whether we are already in parallel region by __kmpc_parallel_level() > __kmpc_is_spmd_exec_mode().
It is based on the assumption that:

  • In SPMD mode, parallel level is initialized to 1.
  • In generic mode, parallel level is initialized to 0.
  • __kmpc_is_spmd_exec_mode returns 1 for SPMD mode, 0 otherwise.

Because the return value type of __kmpc_is_spmd_exec_mode is int8_t, there
was an implicit cast from bool to int8_t. We can make sure it is either 0 or
1 since C++14. In D110279, the return value is the result of an and operation,
which is 2 in SPMD mode. This breaks the assumption in __kmpc_parallel_51.

Diff Detail

Event Timeline

tianshilei1992 requested review of this revision.Oct 15 2021, 12:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2021, 12:33 PM
carlo.bertolli accepted this revision.Oct 15 2021, 1:02 PM

LGTM.

Thanks for fixing this. It would be nice if you could add a comment where it is used at the beginning of parallel_51, stating what are the initial value of parallelLevel (kmpc_parallel_level) and those of kmpc_is_spmd_exec_mode in the two cases of generic and spmd modes (assuming generic spmd mode is handled as a spmd). Something like:

  • In SPMD mode, parallelLevel (queried by kmpc_parallel_level()) is set to 1 at kernel initialization. kmpc_is_spmd_exec_mode returns 1: we will need to serialize a nested parallel region only when we increment parallelLevel by one (value will be 2) when encountering the nested region.
  • In generic mode, parallelLevel is set to 0 at kernel initialization. kmpc_Is_spmd_exec_mode returns 0 in this case: we will need to serialized a nested parallel region only when we increment parallelLevel by one (value will be 1) when encountering the nested region.
This revision is now accepted and ready to land.Oct 15 2021, 1:02 PM
dpalermo accepted this revision.Oct 15 2021, 3:16 PM
dpalermo added a subscriber: dpalermo.

I applied this change to see if it fixed the following (since this routine is called when deciding to serialize nested parallel regions):

https://bugs.llvm.org/show_bug.cgi?id=52169
Bug 52169 - [OpenMP] Wrong answers with nested 'omp parallel for'

...but I am still seeing the same behavior/wrong answers.

This patch looks good to me. There is just something else remaining that is still causing a problem with nested parallel.

Are they still broken before D110279?

update comment