This is an archive of the discontinued LLVM Phabricator instance.

Differential D124158

[Clang][Attr] Skip adding noundef attribute to arguments when function has convergent attribute
Abandoned, Public

Authored by skc7 on Apr 21 2022, 2:54 AM.

Details

Reviewers

sameerds
cdevadas
ronlieb
jdoerfert
rjmccall
nhaehnle
arsenm

Summary

Change https://reviews.llvm.org/D105169 enables the noundef attribute by default. This causes an issue with functions tagged with the convergent attribute.

For example, the SimplifyCFG pass removes a branch leading to a basic block that has an incoming value which will always trigger undefined behavior. This modifies the CFG and merges the basic blocks, which works for CPU execution. On a GPU, however, there are intrinsics such as "__shfl_sync(unsigned mask, T var, int srcLane, int width=warpSize)", where the exchange of the variable occurs simultaneously for all active threads within the warp. In a CUDA/HIP kernel, the variable var passed to __shfl_sync may not be initialised, and LLVM IR treats it as undef. Currently all arguments are tagged with the noundef attribute, so the SimplifyCFG optimization described above is applied and the kernel execution becomes ambiguous. The proposed change is to skip adding the noundef attribute to arguments when a function has been tagged with the convergent attribute.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
60,020 ms	x64 debian > libFuzzer.libFuzzer::large.test

Event Timeline

skc7 created this revision. (Apr 21 2022, 2:54 AM)

Herald added a project: Restricted Project. (Apr 21 2022, 2:54 AM)

Herald added subscribers: mattd, asavonic, ThomasRaoux and 5 others.

skc7 requested review of this revision. (Apr 21 2022, 2:54 AM)

Herald added a reviewer: jdoerfert. (Apr 21 2022, 2:54 AM)

Herald added projects: Restricted Project, Restricted Project.

Herald added subscribers: llvm-commits, cfe-commits, sstefan1.

Harbormaster completed remote builds in B160613: Diff 424134. (Apr 21 2022, 3:52 AM)

skc7 added a reviewer: arsenm. (Apr 26 2022, 7:42 AM)

Herald added a subscriber: wdng. (Apr 26 2022, 7:42 AM)

arsenm added inline comments. (Apr 26 2022, 9:41 AM)

clang/lib/CodeGen/CGCall.cpp
2438–2440	Missing space before (. Needs comment explaining this
llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll
27	Aren't the cases with defined handling of undef lanes still defined for the result?

Applied clang-format to CGCall.cpp. Added a comment for the change.

Harbormaster completed remote builds in B161516: Diff 425393. (Apr 26 2022, 7:08 PM)

update

skc7 added inline comments. (Apr 26 2022, 7:30 PM)

clang/lib/CodeGen/CGCall.cpp
2438–2440	Updated
llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll
27	ret double %i4?

Harbormaster completed remote builds in B161517: Diff 425394. (Apr 26 2022, 8:13 PM)

fix failing tests

Harbormaster completed remote builds in B161550: Diff 425442. (Apr 27 2022, 2:57 AM)

skip adding noundef to return type

update test

skc7 marked an inline comment as not done. (Apr 27 2022, 3:19 AM)

skc7 added inline comments.

llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll
27	Updated patch to skip adding noundef attribute to return types as well

skc7 added a reviewer: rjmccall. (Apr 27 2022, 3:22 AM)

Harbormaster completed remote builds in B161566: Diff 425463. (Apr 27 2022, 3:50 AM)

Can we please have an example for this. I don't know what would be broken w/ noundef + convergent and I somewhat doubt noundef is the problem.

In D124158#3477281, @jdoerfert wrote:

Can we please have an example for this. I don't know what would be broken w/ noundef + convergent and I somewhat doubt noundef is the problem.

For the source kernel below, taken from hypre, the SimplifyCFG optimisation caused a problem with kernel execution on the GPU.
We found that enabling the noundef analysis by default triggers this optimisation.

Source kernel:
Note: variable t is initially uninitialised and is initialised only when lane is 0.
void kernel() {
  double t, measure_row;
  int lane = hypre_cuda_get_lane_id<1>();

  ...

  if (lane == 0) { t = read_only_load(measure_diag + row); }
  measure_row = __shfl_sync(HYPRE_WARP_FULL_MASK, t, 0);

  ...
}

Example LLVM IR for a similar scenario:
define void @func(i32 noundef %arg17) {
bb1:
  %i1 = icmp eq i32 %arg17, 0
  br i1 %i1, label %bb2, label %bb3

bb2: ; preds = %bb1
  %i2 = call noundef double @read_only_load()
  br label %bb3

bb3: ; preds = %bb2, %bb1
  %i3 = phi double [ %i2, %bb2 ], [ undef, %bb1 ]
  %i4 = call noundef double @__shfl_sync(double noundef %i3)
  ret void
}

declare double @read_only_load()
declare double @__shfl_sync(double noundef) convergent

IR Dump After SimplifyCFGPass on func:
define void @func(i32 noundef %arg17) {
bb1:
  %i1 = icmp eq i32 %arg17, 0
  call void @llvm.assume(i1 %i1)
  %i2 = call noundef double @read_only_load()
  %i4 = call noundef double @__shfl_sync(double noundef %i3)
  ret void
}
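
The reasoning behind this transformation can be sketched with a toy C++ model. This is not LLVM code: the names Incoming and edgesAssumedDead are hypothetical, and the real pass checks considerably more conditions. The sketch only shows that an edge carrying undef into the phi is treated as never taken when the phi feeds an operand marked noundef, and that without the noundef marking no edge is pruned.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one (value, predecessor) pair of a phi node.
struct Incoming {
  std::string pred;  // name of the predecessor block
  bool isUndef;      // true when the incoming value is undef
};

// Returns the predecessors whose edge would pass undef into an operand.
// If the operand is marked noundef, passing undef is immediate UB, so an
// optimizer may assume those edges are never taken. If the operand is not
// marked noundef, nothing can be assumed and the result is empty.
std::vector<std::string> edgesAssumedDead(const std::vector<Incoming> &phi,
                                          bool operandIsNoUndef) {
  assert(!phi.empty() && "a phi has at least one incoming value");
  std::vector<std::string> dead;
  if (!operandIsNoUndef)
    return dead;
  for (const Incoming &in : phi)
    if (in.isUndef)
      dead.push_back(in.pred);
  return dead;
}
```

For the phi in the example above, [ %i2, %bb2 ], [ undef, %bb1 ], the model reports the edge from bb1 as dead when the __shfl_sync operand is noundef. That corresponds to the branch being replaced by an llvm.assume on %i1.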

The issue you're describing sounds like it's specific to @__shfl_sync. In general, in C++, you aren't allowed to read from an uninitialized variable; see [basic.indet] in the standard. But if your testcase doesn't have undefined behavior, CUDA language rules must somehow allow this particular builtin function to take undef variables as input. (Is this documented somewhere?)

That isn't related to the "convergent" attribute; the transform you're describing doesn't break convergence rules.

In D124158#3477649, @efriedma wrote:

The issue you're describing sounds like it's specific to @__shfl_sync. In general, in C++, you aren't allowed to read from an uninitialized variable; see [basic.indet] in the standard. But if your testcase doesn't have undefined behavior, CUDA language rules must somehow allow this particular builtin function to take undef variables as input. (Is this documented somewhere?)

That isn't related to the "convergent" attribute; the transform you're describing doesn't break convergence rules.

I concur, especially on the last part. So far I have not seen why this is tied in any way to convergent. It might be a shfl oddity in which case the proper solution is to freeze all shuffle arguments in clang.
EDIT: https://godbolt.org/z/dnv63bzjn

As far as I know this is supposed to be a broadcast from lane 0 to every lane, so not sure why the control flow really matters

In D124158#3478319, @jdoerfert wrote:

I concur, especially on the last part. So far I have not seen why this is tied in any way to convergent. It might be a shfl oddity in which case the proper solution is to freeze all shuffle arguments in clang.
EDIT: https://godbolt.org/z/dnv63bzjn

I'm thinking noundef is a bit of red herring here. The real problem seems to be arising from the assume call which is inserted, which now introduces the assumption that the lane ID must be 0

skc7 added a reviewer: nhaehnle. (Apr 28 2022, 9:50 AM)

In D124158#3480384, @arsenm wrote:

I'm thinking noundef is a bit of red herring here. The real problem seems to be arising from the assume call which is inserted, which now introduces the assumption that the lane ID must be 0

The optimizer is creating the llvm.assume call based on the violation of the noundef attribute.

In D124158#3480566, @efriedma wrote:

In D124158#3480384, @arsenm wrote:

I'm thinking noundef is a bit of red herring here. The real problem seems to be arising from the assume call which is inserted, which now introduces the assumption that the lane ID must be 0

The optimizer is creating the llvm.assume call based on the violation of the noundef attribute.

I agree. As far as I can tell you have two options, both are specific to the shuffle functions:

1. Do not set noundef for calls to them, as they allow undef values in all lanes whose value we don't read.
2. Freeze the inputs unconditionally.

Convergent is unrelated to this.

In D124158#3486110, @jdoerfert wrote:

I agree. As far as I can tell you have two options, both are specific to the shuffle functions:

1. Do not set noundef for calls to them, as they allow undef values in all lanes whose value we don't read.
2. Freeze the inputs unconditionally.

Right, with some nitpicks. Option #1 is semantically more accurate: __shfl_sync, subgroupShuffle, and all similar instructions across GPU programming languages are meant to be conceptually similar to select in that they select a value from a lane. The "data" argument and return value should allow undef and poison. The incoming value is simply returned as-is, and so poison is propagated instead of causing immediate UB, just as it is for a select instruction. However, the lane argument can (and arguably should) still be noundef.

There's a separate curious issue in that apparently, reading from an uninitialized variable is not UB in CUDA/HIP/GLSL/HLSL/etc. If it was, a lot of code existing out there in the wild would be broken. But that's a matter for the relevant language standards to decide (for the subset of languages that have proper standards to begin with).
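
The select-like semantics described above can be illustrated with a small C++ model. This is a sketch under stated assumptions, not how any GPU or compiler implements the shuffle. The names Lane, Warp, kWarpSize and shuffleBroadcast are hypothetical, a warp is modelled as four lanes, and std::nullopt stands in for an undef or poison value held by a lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model: one slot per lane; std::nullopt represents an
// undef/poison value in that lane.
constexpr std::size_t kWarpSize = 4;
using Lane = std::optional<double>;
using Warp = std::array<Lane, kWarpSize>;

// Select-like broadcast: every lane receives whatever srcLane holds.
// The data held by the other lanes is never inspected, so it may be undef
// without affecting the result. If srcLane itself holds undef, the undef
// is propagated to the result rather than treated as an immediate error.
Warp shuffleBroadcast(const Warp &data, std::size_t srcLane) {
  assert(srcLane < kWarpSize && "the lane index must be well defined");
  Warp result;
  result.fill(data[srcLane]);
  return result;
}
```

With only lane 0 initialised, as in the hypre kernel, shuffleBroadcast(t, 0) gives every lane the value from lane 0. The precondition on srcLane mirrors the point that the lane argument can still be noundef even though the data argument is not.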

skc7 mentioned this in D125378: [Attribute] Introduce shuffle attribute to be used for __shfl_sync like cross-lane APIs. (May 11 2022, 5:14 AM)

This is obsolete and can be abandoned

Herald added a subscriber: kosarev. (Nov 16 2022, 3:40 PM)

skc7 abandoned this revision. (Nov 16 2022, 9:47 PM)

Large Diff

This large diff affects 106 files. Files without inline comments have been collapsed.

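Revision Contents

Path	Size
clang/lib/CodeGen/CGCall.cpp	11 lines
clang/test/CodeGen/PowerPC/ppc64le-varargs-f128.c	2 lines
clang/test/CodeGenCUDA/address-spaces.cu	2 lines
clang/test/CodeGenCUDA/builtins-amdgcn.cu	2 lines
clang/test/CodeGenCUDA/cuda-builtin-vars.cu	2 lines
clang/test/CodeGenCUDA/kernel-args-alignment.cu	2 lines
clang/test/CodeGenCUDA/kernel-args.cu	8 lines
clang/test/CodeGenCUDA/lambda.cu	4 lines
clang/test/CodeGenCUDA/redux-builtins.cu	2 lines
clang/test/CodeGenCUDA/surface.cu	4 lines
clang/test/CodeGenCUDA/unnamed-types.cu	8 lines
clang/test/CodeGenCUDA/usual-deallocators.cu	35 lines
clang/test/CodeGenCUDA/vtbl.cu	2 lines
clang/test/CodeGenCUDASPIRV/kernel-argument.cu	3 lines
clang/test/CodeGenHIP/hipspv-addr-spaces.cpp	10 lines
clang/test/CodeGenOpenCL/addr-space-struct-arg.cl	20 lines
clang/test/CodeGenOpenCL/address-spaces.cl	10 lines
clang/test/CodeGenOpenCL/amdgcn-automatic-variable.cl	8 lines
clang/test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl	48 lines
clang/test/CodeGenOpenCL/amdgpu-call-kernel.cl	2 lines
clang/test/CodeGenOpenCL/amdgpu-printf.cl	6 lines
clang/test/CodeGenOpenCL/as_type.cl	26 lines
clang/test/CodeGenOpenCL/atomic-ops-libcall.cl	54 lines
clang/test/CodeGenOpenCL/blocks.cl	12 lines
clang/test/CodeGenOpenCL/byval.cl	4 lines
clang/test/CodeGenOpenCL/const-str-array-decay.cl	2 lines
clang/test/CodeGenOpenCL/constant-addr-space-globals.cl	2 lines
clang/test/CodeGenOpenCL/convergent.cl	6 lines
clang/test/CodeGenOpenCL/fpmath.cl	4 lines
clang/test/CodeGenOpenCL/half.cl	8 lines
clang/test/CodeGenOpenCL/kernel-param-alignment.cl	12 lines
clang/test/CodeGenOpenCL/kernels-have-spir-cc-by-default.cl	8 lines
clang/test/CodeGenOpenCL/no-half.cl	4 lines
clang/test/CodeGenOpenCL/overload.cl	20 lines
clang/test/CodeGenOpenCL/size_t.cl	60 lines
clang/test/CodeGenOpenCL/spir-calling-conv.cl	10 lines
clang/test/CodeGenOpenCLCXX/address-space-deduction.clcpp	2 lines
clang/test/CodeGenOpenCLCXX/addrspace-derived-base.clcpp	4 lines
clang/test/CodeGenOpenCLCXX/addrspace-new-delete.clcpp	2 lines
clang/test/CodeGenOpenCLCXX/addrspace-of-this.clcpp	32 lines
clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp	4 lines
clang/test/CodeGenOpenCLCXX/addrspace-references.clcpp	2 lines
clang/test/CodeGenOpenCLCXX/addrspace-with-class.clcpp	22 lines
clang/test/CodeGenOpenCLCXX/template-address-spaces.clcpp	6 lines
clang/test/CodeGenSYCL/address-space-conversions.cpp	52 lines
clang/test/CodeGenSYCL/address-space-mangling.cpp	16 lines
clang/test/CodeGenSYCL/functionptr-addrspace.cpp	2 lines
clang/test/CodeGenSYCL/unique_stable_name.cpp	40 lines
clang/test/OpenMP/amdgcn-attributes.cpp	2 lines
clang/test/OpenMP/amdgcn_target_global_constructor.cpp	16 lines
clang/test/OpenMP/assumes_include_nvptx.cpp	6 lines
clang/test/OpenMP/declare_target_codegen.cpp	4 lines
clang/test/OpenMP/declare_target_codegen_globalization.cpp	12 lines
clang/test/OpenMP/declare_target_link_codegen.cpp	2 lines
clang/test/OpenMP/declare_variant_mixed_codegen.c	4 lines
clang/test/OpenMP/distribute_codegen.cpp	80 lines
clang/test/OpenMP/distribute_simd_codegen.cpp	160 lines
clang/test/OpenMP/nvptx_allocate_codegen.cpp	8 lines
clang/test/OpenMP/nvptx_data_sharing.cpp	8 lines
clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp	26 lines
clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp	4 lines
clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp	48 lines
clang/test/OpenMP/nvptx_lambda_capturing.cpp	84 lines
clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp	18 lines
clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp	72 lines
clang/test/OpenMP/nvptx_parallel_codegen.cpp	52 lines
clang/test/OpenMP/nvptx_parallel_for_codegen.cpp	6 lines
clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp	8 lines
clang/test/OpenMP/nvptx_target_parallel_codegen.cpp	48 lines
clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp	48 lines
clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp	18 lines
clang/test/OpenMP/nvptx_target_printf_codegen.c	4 lines
clang/test/OpenMP/nvptx_target_teams_codegen.cpp	48 lines
clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp	18 lines
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp	144 lines
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp	72 lines
clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp	72 lines
clang/test/OpenMP/nvptx_teams_codegen.cpp	32 lines
clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp	162 lines
clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp	4 lines
clang/test/OpenMP/openmp_offload_codegen.cpp	2 lines
clang/test/OpenMP/reduction_implicit_map.cpp	8 lines
clang/test/OpenMP/target_firstprivate_codegen.cpp	12 lines
clang/test/OpenMP/target_parallel_codegen.cpp	208 lines
clang/test/OpenMP/target_parallel_debug_codegen.cpp	24 lines
clang/test/OpenMP/target_parallel_for_codegen.cpp	224 lines
clang/test/OpenMP/target_parallel_for_debug_codegen.cpp	24 lines
clang/test/OpenMP/target_parallel_for_simd_codegen.cpp	224 lines
clang/test/OpenMP/target_parallel_if_codegen.cpp	176 lines
clang/test/OpenMP/target_parallel_num_threads_codegen.cpp	176 lines
clang/test/OpenMP/target_private_codegen.cpp	4 lines
clang/test/OpenMP/target_reduction_codegen.cpp	2 lines
clang/test/OpenMP/target_teams_codegen.cpp	352 lines
clang/test/OpenMP/target_teams_distribute_codegen.cpp	224 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp	48 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp	456 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp	262 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp	48 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp	456 lines
clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp	262 lines
clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp	224 lines
clang/test/OpenMP/target_teams_map_codegen.cpp	184 lines
clang/test/OpenMP/target_teams_num_teams_codegen.cpp	176 lines
clang/test/OpenMP/target_teams_thread_limit_codegen.cpp	176 lines
clang/test/OpenMP/teams_codegen.cpp	64 lines
llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll	31 lines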

Diff 425463

clang/lib/CodeGen/CGCall.cpp

@@ ... @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
   HasStrictReturn &= getCodeGenOpts().StrictReturn ||
                      !MayDropFunctionReturn(getContext(), RetTy) ||
                      getLangOpts().Sanitize.has(SanitizerKind::Memory) ||
                      getLangOpts().Sanitize.has(SanitizerKind::Return);

   // Determine if the return type could be partially undef
   if (CodeGenOpts.EnableNoundefAttrs && HasStrictReturn) {
     if (!RetTy->isVoidType() && RetAI.getKind() != ABIArgInfo::Indirect &&
-        DetermineNoUndef(RetTy, getTypes(), DL, RetAI))
-      RetAttrs.addAttribute(llvm::Attribute::NoUndef);
+        DetermineNoUndef(RetTy, getTypes(), DL, RetAI)) {
+      // Skip adding noundef attribute when function has convergent attribute.
+      if (!FuncAttrs.contains(llvm::Attribute::Convergent))
+        RetAttrs.addAttribute(llvm::Attribute::NoUndef);
+    }
   }

   switch (RetAI.getKind()) {
   case ABIArgInfo::Extend:
     if (RetAI.isSignExt())
       RetAttrs.addAttribute(llvm::Attribute::SExt);
     else
       RetAttrs.addAttribute(llvm::Attribute::ZExt);
     LLVM_FALLTHROUGH;
@@ ... @@ if (IRFunctionArgs.hasPaddingArg(ArgNo)) {
                 getLLVMContext(),
                 llvm::AttrBuilder(getLLVMContext()).addAttribute(llvm::Attribute::InReg));
       }
     }

     // Decide whether the argument we're handling could be partially undef
     if (CodeGenOpts.EnableNoundefAttrs &&
         DetermineNoUndef(ParamType, getTypes(), DL, AI)) {
-      Attrs.addAttribute(llvm::Attribute::NoUndef);
+      // Skip adding noundef attribute to arguments when function has convergent attribute.
+      if (!FuncAttrs.contains(llvm::Attribute::Convergent))
+        Attrs.addAttribute(llvm::Attribute::NoUndef);
     }

     // 'restrict' -> 'noalias' is done in EmitFunctionProlog when we
     // have the corresponding parameter variable. It doesn't make
     // sense to do it here because parameters are so messed up.
     switch (AI.getKind()) {
     case ABIArgInfo::Extend:
       if (AI.isSignExt())

Inline comments on the added argument check:
arsenm: Missing space before (. Needs comment explaining this
skc7: Updated

llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll

This file was added.

; RUN: opt < %s -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S | FileCheck %s

define void @func(i32 %arg17) {
; CHECK-LABEL: @func(
; CHECK-LABEL: bb1:
; CHECK-NEXT: [[I:%.*]] = icmp eq i32 [[ARG17:%.*]], 0
; CHECK-NEXT: br i1 [[I:%.*]], label [[BB2:%.*]], label [[BB3:%.*]]
; CHECK-LABEL: bb2:
; CHECK-NEXT: [[I2:%.*]] = call noundef double @one()
; CHECK-NEXT: br label [[BB3:%.*]]
; CHECK-LABEL: bb3:
; CHECK-NEXT: [[I3:%.*]] = phi double [ [[I2:%.*]], [[BB2:%.*]] ], [ undef, [[BB1:%.*]] ]
; CHECK-NEXT: [[I4:%.*]] = call double @two(double [[I3:%.*]], i1 [[I:%.*]])
; CHECK-NEXT: ret void

bb1:
  %i1 = icmp eq i32 %arg17, 0
  br i1 %i1, label %bb2, label %bb3

bb2: ; preds = %bb1
  %i2 = call noundef double @one()
  br label %bb3

bb3: ; preds = %bb2, %bb1
  %i3 = phi double [%i2, %bb2], [undef, %bb1]
  %i4 = call double @two(double %i3, i1 %i1)
  ret void
}

declare double @one()
declare double @two(double, i1) convergent
\ No newline at end of file

Inline comments on line 27 (ret void):
arsenm: Aren't the cases with defined handling of undef lanes still defined for the result?
skc7: ret double %i4?
skc7: Updated patch to skip adding noundef attribute to return types as well
