This is an archive of the discontinued LLVM Phabricator instance.

Paths

clang/lib/CodeGen/
  CGOpenMPRuntime.h
  CGOpenMPRuntime.cpp
clang/test/OpenMP/
  bug60602.cpp
  distribute_codegen.cpp
  distribute_firstprivate_codegen.cpp
  distribute_lastprivate_codegen.cpp
  distribute_parallel_for_codegen.cpp
  distribute_parallel_for_firstprivate_codegen.cpp
  distribute_parallel_for_if_codegen.cpp
  distribute_parallel_for_lastprivate_codegen.cpp
  distribute_parallel_for_num_threads_codegen.cpp
  distribute_parallel_for_private_codegen.cpp
  distribute_parallel_for_proc_bind_codegen.cpp
  distribute_parallel_for_simd_codegen.cpp
  distribute_parallel_for_simd_firstprivate_codegen.cpp
  distribute_parallel_for_simd_if_codegen.cpp
  distribute_parallel_for_simd_lastprivate_codegen.cpp
  distribute_parallel_for_simd_num_threads_codegen.cpp
  distribute_parallel_for_simd_private_codegen.cpp
  distribute_parallel_for_simd_proc_bind_codegen.cpp
  distribute_private_codegen.cpp
  distribute_simd_codegen.cpp
  distribute_simd_firstprivate_codegen.cpp
  distribute_simd_lastprivate_codegen.cpp
  distribute_simd_private_codegen.cpp
  distribute_simd_reduction_codegen.cpp
  nvptx_SPMD_codegen.cpp
  nvptx_lambda_capturing.cpp
  nvptx_target_parallel_num_threads_codegen.cpp
  reduction_implicit_map.cpp
  target_codegen_global_capture.cpp
  target_map_member_expr_codegen.cpp
  target_ompx_dyn_cgroup_mem_codegen.cpp
  target_parallel_num_threads_codegen.cpp
  target_teams_codegen.cpp
  target_teams_distribute_codegen.cpp
  target_teams_distribute_collapse_codegen.cpp
  target_teams_distribute_dist_schedule_codegen.cpp
  target_teams_distribute_firstprivate_codegen.cpp
  target_teams_distribute_lastprivate_codegen.cpp
  target_teams_distribute_parallel_for_codegen.cpp
  target_teams_distribute_parallel_for_collapse_codegen.cpp
  target_teams_distribute_parallel_for_dist_schedule_codegen.cpp
  target_teams_distribute_parallel_for_firstprivate_codegen.cpp
  target_teams_distribute_parallel_for_if_codegen.cpp
  target_teams_distribute_parallel_for_lastprivate_codegen.cpp
  target_teams_distribute_parallel_for_order_codegen.cpp
  target_teams_distribute_parallel_for_private_codegen.cpp
  target_teams_distribute_parallel_for_proc_bind_codegen.cpp
  target_teams_distribute_parallel_for_reduction_codegen.cpp
  target_teams_distribute_parallel_for_schedule_codegen.cpp
  target_teams_distribute_parallel_for_simd_codegen.cpp
  target_teams_distribute_parallel_for_simd_collapse_codegen.cpp
  target_teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp
  target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
  target_teams_distribute_parallel_for_simd_if_codegen.cpp
  target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp
  target_teams_distribute_parallel_for_simd_private_codegen.cpp
  target_teams_distribute_parallel_for_simd_proc_bind_codegen.cpp
  target_teams_distribute_parallel_for_simd_reduction_codegen.cpp
  target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
  target_teams_distribute_private_codegen.cpp
  target_teams_distribute_reduction_codegen.cpp
  target_teams_generic_loop_codegen-1.cpp
  target_teams_generic_loop_collapse_codegen.cpp
  target_teams_generic_loop_depend_codegen.cpp
  target_teams_generic_loop_if_codegen.cpp
  target_teams_generic_loop_order_codegen.cpp
  target_teams_generic_loop_private_codegen.cpp
  target_teams_generic_loop_reduction_codegen.cpp
  target_teams_generic_loop_uses_allocators_codegen.cpp
  target_teams_map_codegen.cpp
  target_teams_num_teams_codegen.cpp
  teams_codegen.cpp
  teams_distribute_codegen.cpp
  teams_distribute_collapse_codegen.cpp
  teams_distribute_dist_schedule_codegen.cpp
  teams_distribute_firstprivate_codegen.cpp
  teams_distribute_lastprivate_codegen.cpp
  teams_distribute_parallel_for_codegen.cpp
  teams_distribute_parallel_for_collapse_codegen.cpp
  teams_distribute_parallel_for_copyin_codegen.cpp
  teams_distribute_parallel_for_dist_schedule_codegen.cpp
  teams_distribute_parallel_for_firstprivate_codegen.cpp
  teams_distribute_parallel_for_if_codegen.cpp
  teams_distribute_parallel_for_lastprivate_codegen.cpp
  teams_distribute_parallel_for_num_threads_codegen.cpp
  teams_distribute_parallel_for_private_codegen.cpp
  teams_distribute_parallel_for_proc_bind_codegen.cpp
  teams_distribute_parallel_for_reduction_codegen.cpp
  teams_distribute_parallel_for_schedule_codegen.cpp
  teams_distribute_parallel_for_simd_codegen.cpp
  teams_distribute_parallel_for_simd_collapse_codegen.cpp
  teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp
  teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
  teams_distribute_parallel_for_simd_if_codegen.cpp
  teams_distribute_parallel_for_simd_lastprivate_codegen.cpp
  teams_distribute_parallel_for_simd_num_threads_codegen.cpp
  teams_distribute_parallel_for_simd_private_codegen.cpp
  teams_distribute_parallel_for_simd_proc_bind_codegen.cpp
  teams_distribute_parallel_for_simd_reduction_codegen.cpp
  teams_distribute_parallel_for_simd_schedule_codegen.cpp
  teams_distribute_private_codegen.cpp
  teams_distribute_reduction_codegen.cpp
  teams_distribute_simd_codegen.cpp
  teams_distribute_simd_collapse_codegen.cpp
  teams_distribute_simd_dist_schedule_codegen.cpp
  teams_distribute_simd_firstprivate_codegen.cpp
  teams_distribute_simd_lastprivate_codegen.cpp
  teams_distribute_simd_private_codegen.cpp
  teams_distribute_simd_reduction_codegen.cpp
  teams_firstprivate_codegen.cpp
  teams_generic_loop_codegen-1.cpp
  teams_generic_loop_collapse_codegen.cpp
  teams_generic_loop_private_codegen.cpp
  teams_generic_loop_reduction_codegen.cpp
  teams_private_codegen.cpp
openmp/libomptarget/test/offloading/
  thread_limit.c

Differential D158381

[OpenMP] Properly set static thread limit (w/o analysis)
ClosedPublic

Authored by jdoerfert on Aug 20 2023, 8:03 PM.


Details

Reviewers

ABataev
tianshilei1992
JonChesterfield
ye-luo
jhuber6
josemonsalve2

Commits

rGc5488c8dcc8c: [OpenMP] Properly set static thread limit (w/o analysis)

Summary

We used to have two separate implementations to derive the number of
threads used in a target region. This led us to sometimes miss
user-provided thread bounds (num_threads or thread_limit) when we
looked for "constant default values". If we might miss the
presence of those bounds, we cannot set the thread_limit statically,
since the runtime will try to honor user input rather than cap it at the
"preferred default". This patch replaces the secondary implementation
with the primary one, run in a mode that does not emit code but only
looks for the presence, and potentially upper bounds, of thread-limiting
clauses.

The runtime test would not pass without this rewrite as we missed some
clauses, set the static limit on the device to the preferred value, but
then violated that value at runtime.
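
The decision described above can be sketched in isolation. This is only an illustration of one reading of the summary, not the code in CGOpenMPRuntime.cpp: `Clause`, `TargetRegion`, and `staticThreadLimit` are hypothetical names, and nested constructs are ignored. A static limit is returned only when it cannot be exceeded by user input.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>
#include <optional>

// Hypothetical stand-in for a thread-limiting clause; not Clang's AST.
struct Clause {
  bool Present = false;
  std::optional<std::uint32_t> Constant; // argument, if a compile-time constant
};

// Hypothetical summary of the clauses found on one target region.
struct TargetRegion {
  Clause NumThreads;
  Clause ThreadLimit;
};

// Returns a thread count that is safe to set statically, or std::nullopt
// when user input could exceed any value known at compile time.
std::optional<std::uint32_t> staticThreadLimit(const TargetRegion &R,
                                               std::uint32_t PreferredDefault) {
  // No user bound at all: the preferred default can be used as the cap.
  if (!R.NumThreads.Present && !R.ThreadLimit.Present)
    return PreferredDefault;

  // Otherwise take the smallest constant bound among the clauses present.
  std::optional<std::uint32_t> Bound;
  for (const Clause *C : {&R.NumThreads, &R.ThreadLimit})
    if (C->Present && C->Constant)
      Bound = Bound ? std::min(*Bound, *C->Constant) : *C->Constant;

  // Clauses present but none constant: nothing can be set statically.
  return Bound;
}
```

The failure mode from the summary corresponds to overlooking a clause and taking the first branch, which caps the kernel at the preferred default even though the runtime later honors a larger user-provided value.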

Fixes: https://github.com/llvm/llvm-project/issues/64845

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jdoerfert created this revision. Aug 20 2023, 8:03 PM

Herald added a project: Restricted Project. Aug 20 2023, 8:03 PM

Herald added subscribers: mattd, asavonic, guansong and 2 others.

jdoerfert requested review of this revision. Aug 20 2023, 8:03 PM

Herald added subscribers: jplehr, sstefan1. Aug 20 2023, 8:03 PM

Harbormaster completed remote builds in B253753: Diff 551882. Aug 20 2023, 8:04 PM

jdoerfert added a child revision: D158382: [OpenMP] Use default grid value for static grid size. Aug 20 2023, 8:06 PM

A few nits. Lots of code but it's mostly just shuffling around stuff that exists. Probably fine given the tests pass.

clang/lib/CodeGen/CGOpenMPRuntime.cpp
6007	Why is this signed but the threads are unsigned now?
6011	The LLVM style uses `/*Value=*/` for inline comments https://llvm.org/docs/CodingStandards.html#comment-formatting
6326	Favor C++ style casts?
6329	Should probably be `UINT32_MAX`
6377	Ditto

This revision is now accepted and ready to land. Aug 21 2023, 11:49 AM

tianshilei1992 added inline comments. Aug 21 2023, 11:56 AM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
6008	to have an unsigned value assigned with -1 is really a bad idea

jhuber6 added inline comments. Aug 21 2023, 11:58 AM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
6008	`-1` is well defined for unsigned as `0xffffffff`, but it's far clearer to use `UINT32_MAX` like I suggested.

tianshilei1992 added inline comments. Aug 21 2023, 11:58 AM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
6008	even `~0U` is much better

jdoerfert added inline comments. Aug 21 2023, 12:58 PM

clang/lib/CodeGen/CGOpenMPRuntime.cpp
6007	I did not touch the team stuff yet. Same rewrite is necessary there, but not required right now.
6008	I can do `~0U` but the point is really that we look for the -1 later on. I don't do arithmetic, just comparisons.
6326	like `reinterpret_cast<uint32_t>(...)`? sure.
6329	I figured this way it's more obvious.
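
For reference, the three spellings discussed in the 6008 thread denote the same 32-bit value, and a sentinel that is only ever compared, never used in arithmetic, behaves the same whichever one is chosen. A minimal sketch follows; `NoStaticLimit` and `hasStaticLimit` are hypothetical names, not identifiers from the patch.

```cpp
#include <cstdint>

// Hypothetical sentinel meaning "no static limit known".
constexpr std::uint32_t NoStaticLimit = UINT32_MAX;

// Converting -1 to a 32-bit unsigned type wraps modulo 2^32.
static_assert(static_cast<std::uint32_t>(-1) == UINT32_MAX,
              "-1 converts to the maximum uint32_t value");
static_assert(static_cast<std::uint32_t>(-1) == 0xffffffffu,
              "which is 0xffffffff");
// ~0U has type unsigned int, so it matches only where that type is 32 bits.
static_assert(sizeof(unsigned) != 4 || ~0U == UINT32_MAX,
              "~0U equals UINT32_MAX for 32-bit unsigned int");

// The sentinel is only compared against, as described in the reply above.
constexpr bool hasStaticLimit(std::uint32_t Limit) {
  return Limit != NoStaticLimit;
}
```

The spellings differ only in readability and in how they behave if the variable's type later changes width, which is the substance of the reviewers' preference for `UINT32_MAX`.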