This is an archive of the discontinued LLVM Phabricator instance.


Differential D49572

[docs] Clarify role of DIExpressions within debug intrinsics
ClosedPublic

Authored by vsk on Jul 19 2018, 3:24 PM.


Details

Reviewers

rnk
aprantl
bjope

Commits

rG8a05b01d133e: [docs] Clarify role of DIExpressions within debug intrinsics
rL338182: [docs] Clarify role of DIExpressions within debug intrinsics

Summary

This is an attempt to make the semantics of DIExpressions within
llvm.dbg.{addr, declare, value} easier to understand.

Event Timeline

vsk created this revision. Jul 19 2018, 3:24 PM

rnk added inline comments. Jul 19 2018, 4:14 PM

docs/LangRef.rst
4613	What makes value operands to dbg.value implicit or concrete in LLVM IR? Are SSA values from local instructions concrete, and constants implicit? We could describe that here.

vsk added inline comments. Jul 19 2018, 4:26 PM

docs/LangRef.rst
4613	Sure. The way I read it, it depends on the DIType of the described variable. The value operand is concrete iff it is a pointer to an instance of that DIType. So, the value operand in dbg.value(const-ptr-null, "int *p") is implicit, but concrete in dbg.value(const-ptr-null, "int"). At least, that's the only consistent explanation I've thought of. I don't know how the backend actually determines this. IIUC D49454/D49520 is an example of the backend getting this wrong: it treats a pointer to a std::deque as the implicit location of the std::deque.

bjope added inline comments. Jul 20 2018, 6:55 AM

docs/LangRef.rst
4605	Is it true that a debugger must be able to modify the variable for an `llvm.dbg.addr`? Any specific reason, or are we just trying to put limitations on the DIExpression in a `llvm.dbg.addr` intrinsic?
4613	My interpretation (with very little experience of `llvm.dbg.addr`) has been that `llvm.dbg.addr` is the IR version of an indirect DBG_VALUE. And `llvm.dbg.value` is the IR version of a non-indirect DBG_VALUE. At least that seems to be the difference in SelectionDAG.

Afaict the first argument in a `dbg.value`, together with the DIExpression, describes the value of the variable. The first argument in `dbg.addr`, together with the DIExpression, describes the address of the variable. And I think the first argument in `dbg.value` should be treated as a value, and the first argument in `dbg.addr` should be treated as an indirect pointer.

A DIExpression might be used in dbg.declare, dbg.addr, dbg.value, direct DBG_VALUE and indirect DBG_VALUE, and it could be both tricky and confusing how to interpret the DIExpression. Depending on which intrinsic is used, or on whether the DBG_VALUE is direct/indirect, the DIExpression could have an implied DW_OP_stack_value, DW_OP_deref, at the end (or even at the front?). As it might be hard to understand this, improving the documentation is a really nice initiative!

One question is if we need to be able to indicate that there is an indirect value operand in a `dbg.value`. Or isn't it enough that if you for example want to describe a variable !Y's value as (X[0] + 5), then you need to include a DW_OP_deref such as

dbg.value(X, !Y, DIExpression(DW_OP_deref, DW_OP_constu 5, DW_OP_add))

The above will become a direct DBG_VALUE since `dbg.value` is used. The DW_OP_deref is needed since by default the first argument in `dbg.value` is treated as a value and not a pointer. The variable will be described using an "implicit location" (DWARF terminology). Are you even saying that depending on !Y it might be wrong to have the DW_OP_deref here?

Btw, I think it is confusing to use "concrete" as terminology for the value operand. Isn't the question if the value operand is direct or indirect (if it is a value or a pointer)?
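
The (X[0] + 5) example above can be sketched as a toy stack machine in Python. This is only an illustration, not LLVM's implementation: the `evaluate` function, the `memory` dictionary, and the address and contents chosen for X are all hypothetical. It also uses DW_OP_plus, the spelling that appears in the LangRef examples in this diff, where the comment writes DW_OP_add.

```python
# Toy model of a DWARF-style expression stack machine. Everything here is an
# assumption made for illustration; it is not how LLVM evaluates DIExpressions.

def evaluate(operand, ops, memory):
    """Evaluate a flat list of ops, starting with `operand` on the stack."""
    stack = [operand]
    i = 0
    while i < len(ops):
        op = ops[i]
        if op == "DW_OP_deref":
            # Treat the top of the stack as an address and load from it.
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_constu":
            # Push the unsigned constant that follows the opcode.
            i += 1
            stack.append(ops[i])
        elif op == "DW_OP_plus":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
        i += 1
    return stack[-1]


# Hypothetical state: X holds the address 0x1000, and X[0] == 37.
memory = {0x1000: 37}
X = 0x1000

# With DW_OP_deref the operand is loaded first, giving X[0] + 5.
with_deref = evaluate(X, ["DW_OP_deref", "DW_OP_constu", 5, "DW_OP_plus"], memory)

# Without it the operand is used as a plain value, giving X + 5.
without_deref = evaluate(X, ["DW_OP_constu", 5, "DW_OP_plus"], memory)

print(with_deref, without_deref)
```

In this model, dropping the DW_OP_deref changes the described value from the pointee plus 5 to the pointer plus 5, which is why the comment says the deref is needed when the first argument is treated as a value.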

vsk updated this revision to Diff 156914. Jul 23 2018, 4:19 PM

vsk marked an inline comment as done.

vsk edited the summary of this revision.

vsk added inline comments.

docs/LangRef.rst
4605	No, I'll walk this back. It's valid to describe a read-only memory location. After thinking about it some more, I don't think there's really an issue with DW_OP_stack_value inside of a llvm.dbg.addr either.
4613	My first response to @rnk here was incorrect: implicit vs. concrete is not the same distinction as direct vs. indirect. The latter is the relevant distinction and it has nothing to do with DIType. I consider @bjope's description here to be the "common sense" one we all thought was correct: interpreting a dbg.value should give a direct value, and interpreting a dbg.{addr,declare} should give an indirect value. I'll update this patch to make those definitions precise. Basically, there should be exactly one way to interpret a DIExpression, without any implicit DW_OP_stack_value or DW_OP_deref added based on the context of which intrinsic / what type of location you have. Once we land the fix in D49454 I think we'll either actually have that model or be really close. Right now there is some special magic with non-empty DIExpressions, but I hope to eliminate that.
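
The direct vs. indirect model described in the comment above can be sketched in Python. This is an illustration under stated assumptions, not LLVM code: `Location`, `interpret`, and `read_variable` are hypothetical names, the memory contents are made up, and the sketch takes the already-evaluated result of operand plus DIExpression as input rather than evaluating an expression itself.

```python
from dataclasses import dataclass

# Toy model (an assumption for illustration): the intrinsic alone decides
# whether the evaluated result is the variable's value or its address.


@dataclass(frozen=True)
class Location:
    kind: str     # "value" (direct) or "address" (indirect)
    payload: int  # result of evaluating the operand with its DIExpression


def interpret(intrinsic: str, result: int) -> Location:
    if intrinsic == "llvm.dbg.value":
        return Location("value", result)
    if intrinsic in ("llvm.dbg.addr", "llvm.dbg.declare"):
        return Location("address", result)
    raise ValueError(f"not a debug intrinsic: {intrinsic}")


def read_variable(loc: Location, memory: dict) -> int:
    # A direct location already holds the value; an indirect one is loaded.
    return loc.payload if loc.kind == "value" else memory[loc.payload]


# Hypothetical state: an int variable with value 7 stored at address 0x2000.
memory = {0x2000: 7}

direct = interpret("llvm.dbg.value", 7)
indirect = interpret("llvm.dbg.declare", 0x2000)

print(read_variable(direct, memory), read_variable(indirect, memory))
```

Both descriptions yield the same variable value here. The point of the model is that no DW_OP_stack_value or DW_OP_deref is added implicitly: the interpretation depends only on which intrinsic is used.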

Minor wordsmithing.

lgtm

This revision is now accepted and ready to land. Jul 27 2018, 4:15 PM

Closed by commit rL338182: [docs] Clarify role of DIExpressions within debug intrinsics (authored by vedantk). Jul 27 2018, 5:34 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                             Size
docs/LangRef.rst                 40 lines
docs/SourceLevelDebugging.rst    3 lines

Diff 156375

docs/LangRef.rst


 - ``DW_OP_swap`` swaps top two stack entries.
 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
   of the stack is treated as an address. The second stack entry is treated as an
   address space identifier.
 - ``DW_OP_stack_value`` marks a constant value.

 DWARF specifies three kinds of simple location descriptions: Register, memory,
 and implicit location descriptions. Register and memory location descriptions
-describe the location of a source variable (in the sense that a debugger might
-modify its value), whereas implicit locations describe merely the value of a
-source variable. DIExpressions also follow this model: A DIExpression that
-doesn't have a trailing ``DW_OP_stack_value`` will describe an address when
-combined with a concrete location.
+describe the concrete location of a source variable (in the sense that a
+debugger might modify its value), whereas implicit locations describe merely
+the actual value of a source variable which might not exist in registers or
+in memory. Note that a location description is defined for certain ranges of a
+program, i.e the location of a variable may change over the course of the
+program.
+
+A DIExpression attached to a ``llvm.dbg.addr`` or ``llvm.dbg.declare``
+intrinsic describes the concrete location of a source variable. A debugger must
+be able to modify the variable via this location. Consequently, this
+DIExpression must not contain ``DW_OP_stack_value``.
+
+The value operand of a ``llvm.dbg.value`` intrinsic may either be a concrete
+location or an implicit one. The following rules apply to a DIExpression
+attached to a ``llvm.dbg.value`` intrinsic:
+
+- If the value operand of the intrinsic is an implicit location, the
+  DIExpression is interpreted as if it contained ``DW_OP_stack_value``,
+  regardless of what the DIExpression contains. The intrinsic describes an
+  implicit location.
+
+- If the value operand of the intrinsic is a concrete location and the
+  DIExpression does not contain ``DW_OP_stack_value``, the intrinsic describes
+  a concrete location.
+
+- If the value operand of the intrinsic is a concrete location and the
+  DIExpression contains ``DW_OP_stack_value``, the DIExpression is interpreted
+  as if it started with a ``DW_OP_deref``, regardless of what else the
+  DIExpression contains. The intrinsic describes an implicit location.
+
+.. note::
+
+   In the future, ``llvm.dbg.value`` may only be allowed to have an implicit
+   value operand. This should simplify DIExpression handling by eliminating two
+   of the special cases described above.

 .. code-block:: text

     !0 = !DIExpression(DW_OP_deref)
     !1 = !DIExpression(DW_OP_plus_uconst, 3)
     !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
     !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
     !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)

docs/SourceLevelDebugging.rst

 .. code-block:: llvm

     void @llvm.dbg.value(metadata, metadata, metadata)

 This intrinsic provides information when a user source variable is set to a new
 value. The first argument is the new value (wrapped as metadata). The second
 argument is a `local variable <LangRef.html#dilocalvariable>`_ containing a
 description of the variable. The third argument is a `complex expression
-<LangRef.html#diexpression>`_.
+<LangRef.html#diexpression>`_, with semantics which depend on whether the value
+operand is an implicit or a concrete location.

 Object lifetimes and scoping
 ============================

 In many languages, the local variables in functions can have their lifetimes or
 scopes limited to a subset of a function. In the C family of languages, for
 example, variables are only live (readable and writable) within the source
 block that they are defined in. In functional languages, values are only