This is an archive of the discontinued LLVM Phabricator instance.

[LoopRotate] Fix lifetime handling.
AbandonedPublic

Authored by fhahn on Sep 20 2018, 8:01 AM.

Details

Summary

Currently LoopRotate ends up duplicating lifetime starts/ends in the
preheader, which creates invalid lifetimes.

This patch adds support for extending the lifetimes when hoisting
instructions: lifetime.start gets hoisted to the preheader and
lifetime.end gets sunk to the exit block. This extends the lifetime of
the memory location.

Diff Detail

Event Timeline

fhahn created this revision.Sep 20 2018, 8:01 AM
fhahn added a comment.Sep 20 2018, 8:59 AM

I think this fixes a longstanding bug report PR15376.

fhahn updated this revision to Diff 166310.Sep 20 2018, 9:08 AM

Move the lifetime handling before cloning the instruction. The cases where we delete the clone do not apply to lifetime intrinsics anyways, so there is no need to clone it in that case.

Duplicating lifetime.start and lifetime.end intrinsics isn't fundamentally a problem, as long as they stay correctly paired. What is this supposed to solve?

fhahn added a comment.Sep 20 2018, 1:29 PM

Duplicating lifetime.start and lifetime.end intrinsics isn't fundamentally a problem, as long as they stay correctly paired. What is this supposed to solve?

My understanding from the LangRef was that having 2 set of lifetimes for the same memory location may lead to unexpected behaviour. For lifetime.start, the semantics are described as

This intrinsic indicates that before this point in the code, the value of the memory pointed to by ptr is dead. This means that it is known to never be used and has an undefined value. A load from the pointer that precedes this intrinsic can be replaced with 'undef'.

So if we duplicate lifetimes and say we have

  %tmp = bitcast i32* %m to i8*
  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %tmp)
  %v1 = load i32* %m
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %tmp)
  br label %bb2
.....
bb3:
  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %tmp)
  %v1 = load i32* %m
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %tmp)
  br label %exit

Now when visiting bb3, we see the lifetime.start and can replace any loads that precedes it (%v1) with undef. lifetime.end says something similar about stores after lifetime.end. Maybe I am missing something from the LangRef?

This seems to be causing a runtime failure in our internal benchmarks.

An alloca can have multiple associated lifetime start/ends. Inlining routinely generates code like that, unrolling generates code like that, etc. And stack coloring knows how to handle such code.

Granted, the way the intrinsic is specified makes everything related to it pretty confusing.

fhahn added a comment.Sep 20 2018, 3:50 PM

An alloca can have multiple associated lifetime start/ends. Inlining routinely generates code like that, unrolling generates code like that, etc. And stack coloring knows how to handle such code.

Granted, the way the intrinsic is specified makes everything related to it pretty confusing.

Right! If everything is working as intended currently, I'd be happy to clarify the LangRef and track down where things go wrong with those lifetimes later.

I guess both interpretations of the lifetime behavior make sense, what's in the LangRef currently seems to make using the info provided by the intrinsics easier to use (no need to check for other lifetime markers to the same location), having multiple distinct ranges for the same location increases precision.

mkazantsev resigned from this revision.Nov 26 2018, 11:46 PM

Sorry, I don't think I can give any useful input on that.

reames resigned from this revision.Mar 25 2020, 11:18 AM
fhahn abandoned this revision.Jul 15 2020, 9:44 AM

I won't be able to follow up on this in the near future unfortunately.