This is an archive of the discontinued LLVM Phabricator instance.

Paths

clang/
  lib/CodeGen/
    CGOpenMPRuntime.cpp
  test/OpenMP/
    align_clause_codegen.cpp
    atomic_compare_codegen.cpp
    bug60602.cpp
    distribute_codegen.cpp
    distribute_firstprivate_codegen.cpp
    distribute_lastprivate_codegen.cpp
    distribute_parallel_for_codegen.cpp
    distribute_parallel_for_firstprivate_codegen.cpp
    distribute_parallel_for_if_codegen.cpp
    distribute_parallel_for_lastprivate_codegen.cpp
    distribute_parallel_for_num_threads_codegen.cpp
    distribute_parallel_for_private_codegen.cpp
    distribute_parallel_for_proc_bind_codegen.cpp
    distribute_parallel_for_simd_codegen.cpp
    distribute_parallel_for_simd_firstprivate_codegen.cpp
    distribute_parallel_for_simd_if_codegen.cpp
    distribute_parallel_for_simd_lastprivate_codegen.cpp
    distribute_parallel_for_simd_num_threads_codegen.cpp
    distribute_parallel_for_simd_private_codegen.cpp
    distribute_parallel_for_simd_proc_bind_codegen.cpp
    distribute_private_codegen.cpp
    distribute_simd_codegen.cpp
    distribute_simd_firstprivate_codegen.cpp
    distribute_simd_lastprivate_codegen.cpp
    distribute_simd_private_codegen.cpp
    distribute_simd_reduction_codegen.cpp
    for_non_rectangular_codegen.c
    irbuilder_nested_openmp_parallel_empty.c
    nested_loop_codegen.cpp
    nvptx_lambda_capturing.cpp
    nvptx_target_parallel_proc_bind_codegen.cpp
    nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
    reduction_compound_op.cpp
    reduction_implicit_map.cpp
    target_codegen_global_capture.cpp
    target_has_device_addr_codegen.cpp
    target_has_device_addr_codegen_01.cpp
    target_map_codegen_03.cpp
    target_map_codegen_hold.cpp
    target_map_deref_array_codegen.cpp
    target_map_member_expr_codegen.cpp
    target_offload_mandatory_codegen.cpp
    target_ompx_dyn_cgroup_mem_codegen.cpp
    target_parallel_codegen.cpp
    target_parallel_for_codegen.cpp
    target_parallel_for_simd_codegen.cpp
    target_parallel_if_codegen.cpp
    target_parallel_num_threads_codegen.cpp
    target_task_affinity_codegen.cpp
    target_teams_codegen.cpp
    target_teams_distribute_codegen.cpp
    target_teams_distribute_collapse_codegen.cpp
    target_teams_distribute_dist_schedule_codegen.cpp
    target_teams_distribute_firstprivate_codegen.cpp
    target_teams_distribute_lastprivate_codegen.cpp
    target_teams_distribute_parallel_for_codegen.cpp
    target_teams_distribute_parallel_for_collapse_codegen.cpp
    target_teams_distribute_parallel_for_dist_schedule_codegen.cpp
    target_teams_distribute_parallel_for_firstprivate_codegen.cpp
    target_teams_distribute_parallel_for_if_codegen.cpp
    target_teams_distribute_parallel_for_lastprivate_codegen.cpp
    target_teams_distribute_parallel_for_private_codegen.cpp
    target_teams_distribute_parallel_for_proc_bind_codegen.cpp
    target_teams_distribute_parallel_for_reduction_codegen.cpp
    target_teams_distribute_parallel_for_schedule_codegen.cpp
    target_teams_distribute_parallel_for_simd_codegen.cpp
    target_teams_distribute_parallel_for_simd_collapse_codegen.cpp
    target_teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp
    target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
    target_teams_distribute_parallel_for_simd_if_codegen.cpp
    target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp
    target_teams_distribute_parallel_for_simd_private_codegen.cpp
    target_teams_distribute_parallel_for_simd_proc_bind_codegen.cpp
    target_teams_distribute_parallel_for_simd_reduction_codegen.cpp
    target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
    target_teams_distribute_private_codegen.cpp
    target_teams_distribute_reduction_codegen.cpp
    target_teams_distribute_simd_codegen.cpp
    target_teams_distribute_simd_collapse_codegen.cpp
    target_teams_distribute_simd_dist_schedule_codegen.cpp
    target_teams_distribute_simd_firstprivate_codegen.cpp
    target_teams_distribute_simd_lastprivate_codegen.cpp
    target_teams_distribute_simd_private_codegen.cpp
    target_teams_distribute_simd_reduction_codegen.cpp
    target_teams_map_codegen.cpp
    target_teams_num_teams_codegen.cpp
    target_teams_thread_limit_codegen.cpp
    teams_codegen.cpp
    teams_distribute_codegen.cpp
    teams_distribute_collapse_codegen.cpp
    teams_distribute_dist_schedule_codegen.cpp
    teams_distribute_firstprivate_codegen.cpp
    teams_distribute_lastprivate_codegen.cpp
    teams_distribute_parallel_for_codegen.cpp
    teams_distribute_parallel_for_collapse_codegen.cpp
    teams_distribute_parallel_for_copyin_codegen.cpp
    teams_distribute_parallel_for_dist_schedule_codegen.cpp
    teams_distribute_parallel_for_firstprivate_codegen.cpp
    teams_distribute_parallel_for_if_codegen.cpp
    teams_distribute_parallel_for_lastprivate_codegen.cpp
    teams_distribute_parallel_for_num_threads_codegen.cpp
    teams_distribute_parallel_for_private_codegen.cpp
    teams_distribute_parallel_for_proc_bind_codegen.cpp
    teams_distribute_parallel_for_reduction_codegen.cpp
    teams_distribute_parallel_for_schedule_codegen.cpp
    teams_distribute_parallel_for_simd_codegen.cpp
    teams_distribute_parallel_for_simd_collapse_codegen.cpp
    teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp
    teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
    teams_distribute_parallel_for_simd_if_codegen.cpp
    teams_distribute_parallel_for_simd_lastprivate_codegen.cpp
    teams_distribute_parallel_for_simd_num_threads_codegen.cpp
    teams_distribute_parallel_for_simd_private_codegen.cpp
    teams_distribute_parallel_for_simd_proc_bind_codegen.cpp
    teams_distribute_parallel_for_simd_reduction_codegen.cpp
    teams_distribute_parallel_for_simd_schedule_codegen.cpp
    teams_distribute_private_codegen.cpp
    teams_distribute_reduction_codegen.cpp
    teams_distribute_simd_codegen.cpp
    teams_distribute_simd_collapse_codegen.cpp
    teams_distribute_simd_dist_schedule_codegen.cpp
    teams_distribute_simd_firstprivate_codegen.cpp
    teams_distribute_simd_lastprivate_codegen.cpp
    teams_distribute_simd_private_codegen.cpp
    teams_distribute_simd_reduction_codegen.cpp
    teams_firstprivate_codegen.cpp
    teams_private_codegen.cpp
llvm/
  include/llvm/Frontend/OpenMP/
    OMPIRBuilder.h
  lib/Frontend/OpenMP/
    OMPIRBuilder.cpp
Differential D145820

[Clang][OpenMP] Insert alloca for kernel args at function entry block instead of the launch point.
Closed, Public

Authored by dhruvachak on Mar 10 2023, 11:44 AM.

Details

Reviewers

jdoerfert
doru1004
carlo.bertolli
jhuber6
tianshilei1992
ABataev
jplehr
ronlieb
RaviNarayanaswamy

Commits

rG1c9ec74e3f2a: [Clang][OpenMP] Insert alloca for kernel args at function entry block instead…

Summary

If an inlined kernel is called in a loop, the launch point alloca would
lead to increasing stack usage every time the kernel is invoked. This
could make the application run out of stack space and crash. This problem
is fixed by using the alloca insertion point while creating the alloca instruction.
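A minimal sketch of the kind of source the summary describes, assuming a small helper containing a target region that is called from a loop. The names `increment_on_device` and `run_launches` are hypothetical; this is not the reproducer from the linked issue or from bug60602.cpp. If the helper is inlined, the kernel launch sits inside the caller's loop body, so an alloca emitted at the launch point would execute once per iteration, whereas an alloca placed at the alloca insertion point is emitted once for the function.

```cpp
// Hypothetical helper (not taken from the patch): a small function that
// contains an OpenMP target region. If the compiler inlines it into a
// caller's loop, the kernel launch ends up inside the loop body.
static inline void increment_on_device(int *data, int n) {
#pragma omp target map(tofrom : data[0:n])
  for (int i = 0; i < n; ++i)
    data[i] += 1;
}

// Launches the target region `launches` times and returns the sum of the
// four elements. Before the fix described above, each launch in a loop like
// this could add to the stack usage of the enclosing function.
int run_launches(int launches) {
  int data[4] = {0, 0, 0, 0};
  for (int k = 0; k < launches; ++k)
    increment_on_device(data, 4);
  int sum = 0;
  for (int v : data)
    sum += v;
  return sum;
}
```

The sketch also compiles without OpenMP support, in which case the pragma is ignored and the loop runs on the host; the stack-growth problem itself only concerns the code Clang generates for the offloaded launch.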

Fixes https://github.com/llvm/llvm-project/issues/60602

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dhruvachak created this revision. Mar 10 2023, 11:44 AM

Herald added a project: Restricted Project. Mar 10 2023, 11:44 AM

Herald added a subscriber: hiraditya.

dhruvachak requested review of this revision. Mar 10 2023, 11:44 AM

Herald added a reviewer: jdoerfert. Mar 10 2023, 11:44 AM

Herald added projects: Restricted Project, Restricted Project.

Herald added subscribers: llvm-commits, cfe-commits, sstefan1.

TODOs: Update existing LIT tests, add a new one.

dhruvachak added reviewers: doru1004, carlo.bertolli, jhuber6, tianshilei1992, ABataev, jplehr, ronlieb. Mar 10 2023, 11:48 AM

No tests updated?

In D145820#4185695, @jhuber6 wrote:

No tests updated?

That's a TODO, coming soon.

dhruvachak retitled this revision from "Insert alloca for kernel args at function entry block instead of the launch point." to "[Clang][OpenMP] Insert alloca for kernel args at function entry block instead of the launch point." Mar 10 2023, 11:54 AM

Herald added subscribers: sunshaoce, guansong, yaxunl. Mar 10 2023, 11:54 AM

dhruvachak added a reviewer: RaviNarayanaswamy. Mar 10 2023, 12:09 PM

Check the other functions nearby: they take an AllocaIP, which on the user side is not necessarily the entry block. We need to follow that scheme here too.

Harbormaster completed remote builds in B218736: Diff 504234. Mar 10 2023, 2:43 PM

Addressed comment. Using the alloca insert point for kernel args alloca.

Herald added subscribers: mattd, asavonic. Mar 12 2023, 11:59 PM

dhruvachak edited the summary of this revision. Mar 13 2023, 12:04 AM

Harbormaster completed remote builds in B218941: Diff 504515. Mar 13 2023, 12:41 AM

Fixed LIT test failure, added new clang test OpenMP/bug60602.cpp.

Harbormaster completed remote builds in B219248: Diff 504945. Mar 13 2023, 11:17 PM

LG. Make sure all tests pass; this one seems to fail according to Buildkite: OpenMP/target_map_codegen_hold.cpp

clang/test/OpenMP/target_map_codegen_hold.cpp
211	The hash is hardcoded. I recommend removing this part and putting the old check lines in again.

This revision is now accepted and ready to land. Mar 14 2023, 2:21 PM

Fixed clang test OpenMP/target_map_codegen_hold.cpp.

dhruvachak marked an inline comment as done. Mar 16 2023, 11:22 PM

dhruvachak added inline comments.

clang/test/OpenMP/target_map_codegen_hold.cpp
211	@jdoerfert Thanks for the suggestion. It was passing for me locally; I assume the locally built compiler was generating the same hash. I removed the hash lines and restored the earlier check lines, so it should pass now.