This is an archive of the discontinued LLVM Phabricator instance.

Reword lifetime description to avoid contradicting long term implementation
Abandoned · Public

Authored by reames on Mar 24 2021, 2:51 PM.


Details

Reviewers

aqjune
nlopes
bollu

Summary

This is effectively a rework of c821ef451. That change had a major problem in that it contradicted our long-standing dereferenceability implementation and turned LICM into an unsound transform. On the other hand, some of the wording clarifications were helpful. So, instead of a wholesale revert, I'm posting a potential alternative.

The core notions here are: 1) separate dereferenceability and object lifetime (in the marker intrinsic sense), and 2) unify handling for all memory object types.
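As a rough illustration of the first notion, the sketch below models allocation status and lifetime as two independent flags, following the wording proposed in this patch: reads from a dead object return ``undef``, writes to it are ignored, and only access to a deallocated object is undefined behavior. It is a toy model in Python, not LLVM code; `MemoryObject`, `UNDEF`, and all method names are made up for this example.

```python
class Undef:
    """Toy stand-in for LLVM's ``undef`` (hypothetical, not an LLVM API)."""

    def __repr__(self):
        return "undef"


UNDEF = Undef()


class MemoryObject:
    """Toy memory object with separate 'allocated' and 'alive' states."""

    def __init__(self, size):
        self.data = [UNDEF] * size  # freshly allocated memory is uninitialized
        self.allocated = True       # allocation status (dereferenceability)
        self.alive = True           # lifetime: alive by default after allocation

    def _check_allocated(self):
        # Only deallocation makes an access undefined in this model.
        if not self.allocated:
            raise RuntimeError("undefined behavior: access after deallocation")

    def lifetime_start(self):
        # Marks the object alive with an uninitialized value.
        self.alive = True
        self.data = [UNDEF] * len(self.data)

    def lifetime_end(self):
        # Marks the object dead; its contents are no longer meaningful.
        self.alive = False
        self.data = [UNDEF] * len(self.data)

    def deallocate(self):
        self.allocated = False
        self.alive = False

    def load(self, index):
        self._check_allocated()
        return self.data[index] if self.alive else UNDEF

    def store(self, index, value):
        self._check_allocated()
        if self.alive:
            self.data[index] = value
        # A store to a dead (but still allocated) object is silently ignored.
```

In this model a load from a dead object never raises; only a load after `deallocate()` does. That is the property a pass relying on dereferenceability would depend on.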

If you read the existing StackColoring.cpp implementation, the degenerate slot section is describing something close to the proposed split here.

I added parties I know to be interested in this area as reviewers. I would appreciate it if one of you would pick up this patch and drive it. I don't actually care about the lifetime markers at all. I'm simply posting this to try to drive the discussion in a productive direction.

Diff Detail

Event Timeline

reames created this revision. Mar 24 2021, 2:51 PM

Herald added a reviewer: bollu. Mar 24 2021, 2:51 PM

Herald added subscribers: jdoerfert, dantrushin, mcrosier.

reames requested review of this revision. Mar 24 2021, 2:51 PM

Herald added a project: Restricted Project. Mar 24 2021, 2:51 PM

Herald added a subscriber: llvm-commits.

Hi,

The core notions here are: 1) separate dereferenceability and object lifetime (in the marker intrinsic sense), and 2) unify handling for all memory object types.

I think it is good to separate the two concerns: the second issue already led to a long discussion on llvm-dev (this January), and it did not converge either.
The first issue seems more pressing at the moment, so let's have this patch concentrate on that. This will help the discussion converge quickly.
It seems enough to update the "Object Lifetime" section to address the dereferenceability issue. What do you think?

D99135 and the llvm-commits mails record many valuable points about the dereferenceability of a dead stack-allocated object, so I'd prefer to mention them briefly here so that they are not lost.
In particular, a store to a dead stack-allocated object may overwrite another stack object's value; this seems pretty important to me.
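The overwrite concern can be sketched with a toy slot allocator. This is only an illustration in Python, not how StackColoring is implemented: `assign_slots` is a hypothetical greedy interval assignment, and the lifetimes are invented. It shows that once two objects with disjoint lifetimes share a slot, a store that is physically executed through the dead object lands in the live one.

```python
def assign_slots(lifetimes):
    """Greedily map objects to slots; objects with disjoint lifetimes may share.

    `lifetimes` maps a name to a half-open (start, end) interval.
    """
    slot_free_at = []   # slot_free_at[i] = time at which slot i becomes free
    assignment = {}
    by_start = sorted(lifetimes.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in by_start:
        for slot, free_at in enumerate(slot_free_at):
            if free_at <= start:          # previous occupant is already dead
                slot_free_at[slot] = end
                assignment[name] = slot
                break
        else:
            slot_free_at.append(end)      # no reusable slot, open a new one
            assignment[name] = len(slot_free_at) - 1
    return assignment


def stray_store_demo():
    # 'a' is live on [0, 2) and 'b' on [2, 4), so they can share a slot.
    slots = assign_slots({"a": (0, 2), "b": (2, 4)})
    memory = [0] * (max(slots.values()) + 1)
    memory[slots["b"]] = 42   # time 2: 'b' is live and holds 42
    memory[slots["a"]] = 7    # time 3: a store through the dead object 'a'
    return slots, memory[slots["b"]]
```

Calling `stray_store_demo()` places both objects in slot 0, and the value read back for `b` is the one stored through `a`.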

llvm/docs/LangRef.rst, line 2591
I think this dereferenceability issue relates more to stack-allocated objects than to heap/global objects. I couldn't write a program or find an existing optimization that has the dereferenceability issue with heap/global objects. Do you have one? If the issue is indeed specific to stack objects, what do you think about changing this paragraph, instead of the paragraph above, to say everything that should be addressed? This paragraph is a good place to talk about stack thingy.

Harbormaster completed remote builds in B95591: Diff 333136. Mar 24 2021, 10:52 PM

Macro point: please consider everything in this response as advisory, and non-blocking. As mentioned in the related discussion in the commit thread for the original patch, I am planning on removing myself from this discussion going forward, and nothing said here should be considered blocking. If it's useful, great. If not, feel free to ignore.

In D99303#2649619, @aqjune wrote:

Hi,

The core notions here are: 1) separate dereferenceability and object lifetime (in the marker intrinsic sense), and 2) unify handling for all memory object types.

I think it is good to separate the two concerns: the second issue already led to a long discussion on llvm-dev (this January), and it did not converge either.

I'm not sure this is really possible. The example that concerns me: what if we have a lifetime end buried in a callee? As the current semantics are written, the behavior of such a call depends on the underlying allocation type. This makes it really hard to either a) reason about the semantics of the function in isolation, or b) perform optimization if we're forced to assume the conservative union of the handling.
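A toy reading of the pre-patch `llvm.lifetime.end` wording (the left-hand side of the diff on this page) makes the callee problem concrete. The sketch is Python, not LLVM code; `Obj`, `POISON`, `callee`, and `summarize` are hypothetical names, and leaving the bytes of a dead stack object untouched is an assumption of the model.

```python
POISON = "poison"   # toy marker, not an LLVM API


class Obj:
    def __init__(self, kind, data):
        self.kind = kind          # "stack", "heap", or "global"
        self.data = list(data)
        self.alive = True


def lifetime_end(obj, offset=0):
    """Toy version of the pre-patch wording for llvm.lifetime.end."""
    if obj.kind == "stack" and offset == 0:
        obj.alive = False                       # stack object becomes dead
    else:
        obj.data = [POISON] * len(obj.data)     # otherwise bytes become poison


def callee(obj):
    """Ends a lifetime without knowing how the object was allocated."""
    lifetime_end(obj)


def summarize(kind):
    """Return (alive, data) after calling `callee` on an object of `kind`."""
    obj = Obj(kind, [1, 2])
    callee(obj)
    return obj.alive, obj.data
```

Here `summarize("stack")` and `summarize("heap")` give different results for the same callee body, so the effect of `callee` cannot be stated without knowing the allocation kind of its argument.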

The first issue seems more pressing at the moment, so let's have this patch concentrate on that. This will help the discussion converge quickly.
It seems enough to update the "Object Lifetime" section to address the dereferenceability issue. What do you think?

I am in general very hesitant about this.

My primary concern is not whether the semantics are self-consistent. My concern is that the semantics do not seem to match the existing implementation in the optimizer. (Specifically, see the hoisting example I gave elsewhere.)
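The mismatch can be sketched with a toy interpreter. This is not the hoisting example referred to above, which is not reproduced on this page; it is a hypothetical program in which a load is moved above a lifetime start marker. The interpreter, the op names, and the assumption that the stack object starts out dead are all part of the model, which compares two readings: "access to a dead object is undefined behavior" versus "a read from a dead object returns ``undef``".

```python
class UndefinedBehavior(Exception):
    """Raised when the toy interpreter hits undefined behavior."""


UNDEF = "undef"   # toy marker, not an LLVM API


def run(ops, dead_access_is_ub):
    """Interpret ops on one stack object that is assumed to start out dead."""
    alive = False
    value = UNDEF
    loads = []
    for op in ops:
        if op == "lifetime.start":
            alive, value = True, UNDEF
        elif op == "lifetime.end":
            alive = False
        elif op == "load":
            if alive:
                loads.append(value)
            elif dead_access_is_ub:
                raise UndefinedBehavior("load from a dead object")
            else:
                loads.append(UNDEF)   # dead read yields undef
        else:
            raise ValueError(f"unknown op: {op}")
    return loads


# Hypothetical program before and after moving the load above the marker.
ORIGINAL = ["lifetime.start", "load", "lifetime.end"]
HOISTED = ["load", "lifetime.start", "lifetime.end"]
```

Under the second reading both programs load ``undef``, so the motion changes nothing observable. Under the first reading `run(HOISTED, True)` raises while `run(ORIGINAL, True)` does not, which is the sense in which such a motion would become unsound.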

I can see three possible paths to resolution:

1. Change the LangRef in a manner that matches the implementation.
2. Change the implementation to match the new LangRef.
3. Make an argument that the case described is impossible to construct (and thus the implementation does actually match the new LangRef wording).

I don't have any preference as to which path is taken. I'd tend to default to option 1, but that's just what I'd do, not the only reasonable path forward.

D99135 and the llvm-commits mails record many valuable points about the dereferenceability of a dead stack-allocated object, so I'd prefer to mention them briefly here so that they are not lost.
In particular, a store to a dead stack-allocated object may overwrite another stack object's value; this seems pretty important to me.

Fair, I don't really have a good suggestion as to how to resolve this.

Abandoning change. As noted in the original commit thread, I'm removing myself from further discussion on this topic. Feel free to use anything in this change set which is useful.

Revision Contents

Path                     Size
llvm/docs/LangRef.rst    55 lines

Diff 333136

llvm/docs/LangRef.rst


[... 2,570 lines not shown ...]
 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
 allocation calls, and global variable definitions.
 Once it is allocated, the bytes stored in the region can only be read or written
 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
 value.
 If a pointer that is not based on the object tries to read or write to the
 object, it is undefined behavior.

-A lifetime of a memory object is a property that decides its accessibility.
-Unless stated otherwise, a memory object is alive since its allocation, and
-dead after its deallocation.
-It is undefined behavior to access a memory object that isn't alive, but
-operations that don't dereference it such as
+It is undefined behavior to access a memory object that has been deallocated,
+but operations that don't dereference it such as
 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
-:ref:`icmp <i_icmp>` return a valid result.
-This explains code motion of these instructions across operations that
-impact the object's lifetime.
+:ref:`icmp <i_icmp>` return a valid result. This allows code motion
+of such instructions across potential deallocation points.
+Distinct from allocation status, we also have the notion of memory object
+lifetime - an object can be either dead or live.
+All reads from a dead object return ``undef`` and all writes are ignored.
+Unless stated otherwise, a memory object is alive after its allocation, and
+dead after its deallocation.
 A stack object's lifetime can be explicitly specified using
 :ref:`llvm.lifetime.start <int_lifestart>` and
 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.

Inline comment (aqjune): I think this dereferenceability issue relates more to stack-allocated objects than to heap/global objects. I couldn't write a program or find an existing optimization that has the dereferenceability issue with heap/global objects. Do you have one? If the issue is indeed specific to stack objects, what do you think about changing this paragraph, instead of the paragraph above, to say everything that should be addressed? This paragraph is a good place to talk about stack thingy.
+Lifetime is essentially a hint to the optimizer that moving allocation or
+deallocation to the liveness transitions may be profitable.
+The optimizer is still responsible for establishing legality - e.g. a
+store to dead object can't fault, unless the object has been deallocated

 .. _pointeraliasing:

 Pointer Aliasing Rules
 ----------------------

 Any memory access must be done through a pointer value associated with
 an address range of the memory access, otherwise the behavior is
 undefined. Pointer values are associated with address ranges according
[... 15,690 lines not shown ...]

 The first argument is a constant integer representing the size of the
 object, or -1 if it is variable sized. The second argument is a pointer
 to the object.

 Semantics:
 """"""""""

-If ``ptr`` is a stack-allocated object and it points to the first byte of
-the object, the object is initially marked as dead.
-``ptr`` is conservatively considered as a non-stack-allocated object if
-the stack coloring algorithm that is used in the optimization pipeline cannot
-conclude that ``ptr`` is a stack-allocated object.

-After '``llvm.lifetime.start``', the stack object that ``ptr`` points is marked
-as alive and has an uninitialized value.
-The stack object is marked as dead when either
-:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
-function returns.
+After '``llvm.lifetime.start``', the object that ``ptr`` points is marked
+as alive and has an uninitialized value (e.g. ``undef``).

 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
-'``llvm.lifetime.start``' on the stack object can be called again.
+'``llvm.lifetime.start``' on the object can be called again.
 The second '``llvm.lifetime.start``' call marks the object as alive, but it
 does not change the address of the object.

-If ``ptr`` is a non-stack-allocated object, it does not point to the first
-byte of the object or it is a stack object that is already alive, it simply
-fills all bytes of the object with ``poison``.


 .. _int_lifeend:

 '``llvm.lifetime.end``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Syntax:
 """""""

[... 12 lines not shown ...]

 The first argument is a constant integer representing the size of the
 object, or -1 if it is variable sized. The second argument is a pointer
 to the object.

 Semantics:
 """"""""""

-If ``ptr`` is a stack-allocated object and it points to the first byte of the
-object, the object is dead.
-``ptr`` is conservatively considered as a non-stack-allocated object if
-the stack coloring algorithm that is used in the optimization pipeline cannot
-conclude that ``ptr`` is a stack-allocated object.
+After '``llvm.lifetime.end``', the object that ``ptr`` points is marked
+as dead and has an uninitialized value (e.g. ``undef``).

 Calling ``llvm.lifetime.end`` on an already dead alloca is no-op.

-If ``ptr`` is a non-stack-allocated object or it does not point to the first
-byte of the object, it is equivalent to simply filling all bytes of the object
-with ``poison``.


 '``llvm.invariant.start``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Syntax:
 """""""
 This is an overloaded intrinsic. The memory object can belong to any address space.

 ::
[... 3,530 lines not shown ...]