This is an archive of the discontinued LLVM Phabricator instance.

InstCombine: Fix infinite loop in copy-constant-to-alloca transform
ClosedPublic

Authored by arsenm on Oct 5 2020, 2:04 PM.

Details

Summary

This was broken by 16295d521e294b27106e51fac29957c1aac8ff89, when
instructions started being handled and not just constant
expressions. This was re-inserting an equivalent bitcast to the
original memcpy operand, which made a non-functional IR change on
every iteration.

This also fixes a secondary problem where it was inserting
addrspacecasts which may not have been legal (i.e. it changed the
source address space). Start visiting all pointer users and fail out
if we can't process them. Also start handling the relevant memory
intrinsic users. These cases can be dealt with by running
InferAddressSpaces separately.

Diff Detail

Event Timeline

arsenm created this revision.Oct 5 2020, 2:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2020, 2:04 PM
arsenm requested review of this revision.Oct 5 2020, 2:04 PM
jdoerfert resigned from this revision.Oct 10 2020, 8:39 AM

The IRBuilder stuff looks good, I don't know much about the transformation though collectUsers looks more reasonable then the old code.

rnk accepted this revision.Oct 12 2020, 10:05 AM

lgtm

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
326

I spent a while trying to figure out if this transform carried over the source location (the !dbg metadata). Setting the insertion point of the irbuilder happens to have the side effect of copying the location of of the instruction insertion point. Maybe that is common knowledge, but if not, I would consider a comment about it here.

This revision is now accepted and ready to land.Oct 12 2020, 10:05 AM
guyblank added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
268

Hi,
Encountered an issue with this in an out-of-tree target. Is this missing handling of lifetime start/end?

arsenm added inline comments.Mar 17 2021, 5:58 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
268

Yes. Do you have a testcase?

guyblank added inline comments.Mar 17 2021, 9:15 AM
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
268

adding lifetime start/end to the AMDGPU tests reproduces this.

  declare void @llvm.lifetime.start.p5i8(i64, i8 addrspace(5)* )
  declare void @llvm.lifetime.end.p5i8(i64, i8 addrspace(5)* )

; Simple memcpy to alloca from constant address space argument.
define i8 @memcpy_constant_arg_ptr_to_alloca([32 x i8] addrspace(4)* noalias readonly align 4 dereferenceable(32) %arg, i32 %idx) {
  %alloca = alloca [32 x i8], align 4, addrspace(5)
  %alloca.cast = bitcast [32 x i8] addrspace(5)* %alloca to i8 addrspace(5)*
  call void @llvm.lifetime.start.p5i8(i64 32, i8 addrspace(5)* %alloca.cast)
  %arg.cast = bitcast [32 x i8] addrspace(4)* %arg to i8 addrspace(4)*
  call void @llvm.memcpy.p5i8.p4i8.i64(i8 addrspace(5)* %alloca.cast, i8 addrspace(4)* %arg.cast, i64 32, i1 false)
  call void @llvm.lifetime.end.p5i8(i64 32, i8 addrspace(5)* %alloca.cast)
  %gep = getelementptr inbounds [32 x i8], [32 x i8] addrspace(5)* %alloca, i32 0, i32 %idx
  %load = load i8, i8 addrspace(5)* %gep
  ret i8 %load
}