This is an archive of the discontinued LLVM Phabricator instance.


Differential D47747

[LangRef] Clarify "undefined" for various instructions.
Abandoned, Public

Authored by efriedma on Jun 4 2018, 4:11 PM.

Details

Reviewers

nlopes
dberlin
hfinkel

Summary

Some places in LangRef say something like "the result is undefined"; instead, state what happens more explicitly. Some places don't say at all what happens when an invariant is violated; readers should assume the behavior is undefined, but clarify that in a few places related to memory accesses, where it might be confusing since some loads return undef or poison.

Not sure I've chosen the right resolution to all these cases; we might want to change some of these to return poison instead.

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma created this revision. Jun 4 2018, 4:11 PM

fhahn mentioned this in D47475: [Local] Make DoesKMove required for combineMetadata. Jun 5 2018, 4:50 AM

nlopes added a subscriber: aqjune. Jun 5 2018, 7:35 AM

Eli, thanks a lot for kicking off the discussion. I think this patch is a bit too big since there are a few things that are not trivial.
For example, I would rather not introduce more functions returning undef, but rather return poison instead. If there's no good motivation for undef, poison should be used by default from now on, since it's much easier to handle than undef.
This patch also introduces a lot of UB with metadata tags, which is a departure from how we handle things like nsw/nuw which make the instructions yield poison instead of UB. Why is it more important to preserve nsw when hoisting an add than preserving !nonnull when hoisting a load? I really don't know; hence I'm asking.
I think it would help to split this patch a bit.

nlopes added inline comments. Jun 5 2018, 8:02 AM

docs/LangRef.rst
1051	I'm fine with all these UBs in function attributes, since it seems that's how they work today. However, we need to make sure this is respected when hoisting function calls, which I don't think is done today. This may not be feasible if the attribute is on the function decl rather than on the call instruction.
2365	I'm not sure this sentence is clear. Does it mean that for NaNs the result has to be the same before and after optimization? Or does it mean that the result after the optimization can be whatever, but fixed (no poison or undef allowed)?
3297	Can we have poison here instead? (and for the following ones as well)
4956	UB vs poison here: UB: after a load with 'range' we know the memory has a value within the range. Poison: after branching on the loaded value we know the memory has a value within the range. UB has more immediate effects than poison, of course. For GVN and friends, UB is a bit easier, but when hoisting such an instruction we need to drop the range metadata. Is that done today? For range analysis, I guess both semantics are fine, though the UB semantics may potentially allow a better analysis, since after the load the analysis can assume the range right away instead of waiting for a branch. The UB semantics also helps with shrinking the bitwidth of arithmetic, which otherwise isn't easy to do (since you would need to check the users of the expression tree). Bottom line: this really depends on what kind of transformations LLVM does today (and in the future) that care about this range metadata.
7314	can it be poison instead?
7575	Is undef required by the C/C++ standards or can it be poison?
8163	GEP is difficult. I suggest we leave GEP inbounds/inrange discussion for a separate patch.

efriedma added inline comments. Jun 5 2018, 1:24 PM

docs/LangRef.rst
2365	Should be "whatever, but no poison/undef" (there's very little benefit and a lot of risk involved in opening up the possibility of undefined behavior from fast-math). But yes, should be called out explicitly.
3297	Making fptoui produce poison is probably fine. For uitofp, we should probably consider defining the overflow cases to match the IEEE754 convertFromInt (produce +-Inf). But that's not what LLVM currently does, so it would involve some code changes.
7314	Probably, yes; it's not expected, in the same way as an out-of-range shift.
7575	Actually, thinking about this a bit more, we might need a defined, non-null value. C and C++ both prohibit zero-size types (arrays can't be zero length, and structs have at least one member in C, and implicit padding in C++). But there's a GNU extension to allow zero-length arrays. That extension is usually only used as a replacement for flexible array members in pre-C99 code, but we allow such arrays anywhere. And given zero-length array variables are allowed, the natural way to lower it is a zero-byte alloca. This is what clang currently emits. Given such variables are allowed, we probably need to ensure `&x == &x`, and `&x != 0`, which implies a value that is not poison or undef.
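
As an illustration of the 3297 exchange above, here is a minimal Python sketch of what "the value won't fit in the integer type" means for ``fptoui``. It is only a model under stated assumptions, not LLVM code: ``fptoui_model`` and the ``OUT_OF_RANGE`` sentinel are hypothetical names, and the sentinel stands in for whichever result (``undef`` or poison) the wording ends up choosing.

```python
import math

# Hypothetical sentinel standing in for the undef/poison result under discussion.
OUT_OF_RANGE = object()


def fptoui_model(x, bits):
    """Model fptoui: truncate toward zero, then check the unsigned range."""
    if math.isnan(x) or math.isinf(x):
        return OUT_OF_RANGE          # no integer value to convert to
    truncated = math.trunc(x)        # round toward zero
    if 0 <= truncated < (1 << bits):
        return truncated
    return OUT_OF_RANGE              # value does not fit in `bits` bits
```

In this model ``fptoui_model(255.9, 8)`` fits (it truncates to 255), while ``fptoui_model(256.0, 8)`` and ``fptoui_model(-1.0, 8)`` hit the out-of-range case that the comments above are debating.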

a.elovikov added a subscriber: a.elovikov. Jun 5 2018, 1:38 PM

hfinkel added subscribers: rsmith, chandlerc. Jun 6 2018, 2:55 AM

hfinkel added inline comments.

docs/LangRef.rst
1131	Chatting offline with @chandlerc and @rsmith, we currently have a problem with the way we use dereferenceable in Clang. We add this attribute for references, but it's possible to free the underlying storage during the execution of the function. I'd prefer to fix this by saying that dereferenceable only guarantees its properties at function entry, as this lets us preserve some of the optimization benefit (where the existing semantics can be obtained, in large part, by combining dereferenceable with noalias or noalias metadata). We should discuss what we'd like to do here.

efriedma added inline comments. Jun 6 2018, 1:14 PM

docs/LangRef.rst
1131	"Duration of the function" reflects what's actually implemented at the moment. And it's generally useful even if we can't use it for C++ references. C structs which are passed/returned indirectly work this way; Rust references also work like this. If "dereferenceable on entry to the function" would be useful for C++, I think it should be a separate attribute. Since this is apparently controversial, I'll split "dereferenceable" into its own patch, so we can continue discussing it.

See https://reviews.llvm.org/D47851, https://reviews.llvm.org/D47854, https://reviews.llvm.org/D47807, https://reviews.llvm.org/D47859 for patches split off from this. Still need to split off patches for GEP, alloca, fast-math flags, the dereferenceable function attribute, and the other function attributes.

fhahn added inline comments. Jun 7 2018, 1:58 AM

docs/LangRef.rst
4956	AFAIK GVN and friends use the combineMetadata helper to set the metadata of an instruction K replacing an instruction J. Currently this function only sets metadata like !range and !nonnull if both instructions have them (for range it selects the most general range). With using UB here (and for nonnull) we should be able to preserve such metadata kinds in K if it dominates J, as in D47339 and D47475.
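
The merging rule described in the comment above can be sketched in Python. This is a simplified model, not the actual ``combineMetadata`` implementation: ``combine_range`` is a hypothetical name, ranges are assumed to be non-empty lists of non-wrapping half-open ``(lo, hi)`` pairs, and ``None`` stands for "no ``!range`` attached".

```python
def combine_range(k_ranges, j_ranges):
    """Return the most general range list covering both K and J.

    Each argument is a non-empty list of half-open (lo, hi) pairs with
    lo < hi, or None when the instruction carries no !range metadata.
    """
    if k_ranges is None or j_ranges is None:
        # One side makes no promise, so the combined instruction cannot either.
        return None
    intervals = sorted(k_ranges + j_ranges)
    merged = [intervals[0]]
    for lo, hi in intervals[1:]:
        last_lo, last_hi = merged[-1]
        if lo <= last_hi:
            # Overlapping or adjacent half-open intervals collapse into one.
            merged[-1] = (last_lo, max(last_hi, hi))
        else:
            merged.append((lo, hi))
    return merged
```

For example, ``combine_range([(0, 2)], [(1, 5)])`` gives ``[(0, 5)]``, and if either side is ``None`` the result carries no range at all, which is the conservative behavior the comment describes.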

fhahn mentioned this in D47339: [GVN,NewGVN] Keep nonnull if K does not move. Jun 7 2018, 2:03 AM

Fast math changes: https://reviews.llvm.org/D47963
Dereferenceable: https://reviews.llvm.org/D48239

function attributes: https://reviews.llvm.org/D49041
alloca: https://reviews.llvm.org/D49042

That should be everything (except the getelementptr change, which I don't really want to dive into), so closing.

Revision Contents

Path: docs/LangRef.rst
Size: 109 lines

Diff 149866

docs/LangRef.rst


[1,042 lines not shown; context: ``inalloca``]
used in conjunction with other attributes that affect argument		used in conjunction with other attributes that affect argument
storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The		storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
``inalloca`` attribute also disables LLVM's implicit lowering of		``inalloca`` attribute also disables LLVM's implicit lowering of
large aggregate return values, which means that frontend authors		large aggregate return values, which means that frontend authors
must lower them with ``sret`` pointers.		must lower them with ``sret`` pointers.

When the call site is reached, the argument allocation must have		When the call site is reached, the argument allocation must have
been the most recent stack allocation that is still live, or the		been the most recent stack allocation that is still live, or the
results are undefined. It is possible to allocate additional stack		behavior is undefined. It is possible to allocate additional stack
space after an argument allocation and before its call site, but it		space after an argument allocation and before its call site, but it
must be cleared off with :ref:`llvm.stackrestore		must be cleared off with :ref:`llvm.stackrestore
<int_stackrestore>`.		<int_stackrestore>`.

See :doc:`InAlloca` for more information on how to use this		See :doc:`InAlloca` for more information on how to use this
attribute.		attribute.

``sret``		``sret``
[57 lines not shown; context: ``returned``]
checked or enforced when generating the callee. The parameter and the		checked or enforced when generating the callee. The parameter and the
function return type must be valid operands for the		function return type must be valid operands for the
:ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for		:ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
return values and can only be applied to one parameter.		return values and can only be applied to one parameter.

``nonnull``		``nonnull``
This indicates that the parameter or return pointer is not null. This		This indicates that the parameter or return pointer is not null. This
attribute may only be applied to pointer typed parameters. This is not		attribute may only be applied to pointer typed parameters. This is not
checked or enforced by LLVM, the caller must ensure that the pointer		checked or enforced by LLVM; if the parameter or return pointer is null,
passed in is non-null, or the callee must ensure that the returned pointer		the behavior is undefined.
is non-null.

``dereferenceable(<n>)``		``dereferenceable(<n>)``
This indicates that the parameter or return pointer is dereferenceable. This		This indicates that the parameter or return pointer is dereferenceable for
		the duration of the function. If the pointer cannot be dereferenced at any
		point in the function, the behavior is undefined. This
attribute may only be applied to pointer typed parameters. A pointer that		attribute may only be applied to pointer typed parameters. A pointer that
is dereferenceable can be loaded from speculatively without a risk of		is dereferenceable can be loaded from speculatively without a risk of
trapping. The number of bytes known to be dereferenceable must be provided		trapping. The number of bytes known to be dereferenceable must be provided
in parentheses. It is legal for the number of bytes to be less than the		in parentheses. It is legal for the number of bytes to be less than the
size of the pointee type. The ``nonnull`` attribute does not imply		size of the pointee type. The ``nonnull`` attribute does not imply
dereferenceability (consider a pointer to one element past the end of an		dereferenceability (consider a pointer to one element past the end of an
array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in		array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
``addrspace(0)`` (which is the default address space).		``addrspace(0)`` (which is the default address space).
[243 lines not shown; context: ``convergent``]

The optimizer may remove the ``convergent`` attribute on functions when it		The optimizer may remove the ``convergent`` attribute on functions when it
can prove that the function does not execute any convergent operations.		can prove that the function does not execute any convergent operations.
Similarly, the optimizer may remove ``convergent`` on calls/invokes when it		Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
can prove that the call/invoke cannot call a convergent function.		can prove that the call/invoke cannot call a convergent function.
``inaccessiblememonly``		``inaccessiblememonly``
This attribute indicates that the function may only access memory that		This attribute indicates that the function may only access memory that
is not accessible by the module being compiled. This is a weaker form		is not accessible by the module being compiled. This is a weaker form
of ``readnone``.		of ``readnone``. If the function reads or writes other memory, the
		behavior is undefined.
``inaccessiblemem_or_argmemonly``		``inaccessiblemem_or_argmemonly``
This attribute indicates that the function may only access memory that is		This attribute indicates that the function may only access memory that is
either not accessible by the module being compiled, or is pointed to		either not accessible by the module being compiled, or is pointed to
by its pointer arguments. This is a weaker form of ``argmemonly``		by its pointer arguments. This is a weaker form of ``argmemonly``. If the
		function reads or writes other memory, the behavior is undefined.
``inlinehint``		``inlinehint``
This attribute indicates that the source code contained a hint that		This attribute indicates that the source code contained a hint that
inlining this function is desirable (such as the "inline" keyword in		inlining this function is desirable (such as the "inline" keyword in
C/C++). It is just a hint; it imposes no requirements on the		C/C++). It is just a hint; it imposes no requirements on the
inliner.		inliner.
``jumptable``		``jumptable``
This attribute indicates that the function should be added to a		This attribute indicates that the function should be added to a
jump-instruction table at code-generation time, and that all address-taken		jump-instruction table at code-generation time, and that all address-taken
[126 lines not shown; context: ``readnone``]
to callers. This means while it cannot unwind exceptions by calling		to callers. This means while it cannot unwind exceptions by calling
the ``C++`` exception throwing methods (since they write to memory), there may		the ``C++`` exception throwing methods (since they write to memory), there may
be non-``C++`` mechanisms that throw exceptions without writing to LLVM		be non-``C++`` mechanisms that throw exceptions without writing to LLVM
visible memory.		visible memory.

On an argument, this attribute indicates that the function does not		On an argument, this attribute indicates that the function does not
dereference that pointer argument, even though it may read or write the		dereference that pointer argument, even though it may read or write the
memory that the pointer points to if accessed through other pointers.		memory that the pointer points to if accessed through other pointers.

		If a readnone function reads or writes memory visible to the program, or
		has other side-effects, the behavior is undefined. If a function reads from
		or writes to a readnone pointer argument, the behavior is undefined.
``readonly``		``readonly``
On a function, this attribute indicates that the function does not write		On a function, this attribute indicates that the function does not write
through any pointer arguments (including ``byval`` arguments) or otherwise		through any pointer arguments (including ``byval`` arguments) or otherwise
modify any state (e.g. memory, control registers, etc) visible to		modify any state (e.g. memory, control registers, etc) visible to
caller functions. It may dereference pointer arguments and read		caller functions. It may dereference pointer arguments and read
state that may be set in the caller. A readonly function always		state that may be set in the caller. A readonly function always
returns the same value (or unwinds an exception identically) when		returns the same value (or unwinds an exception identically) when
called with the same set of arguments and global state. This means while it		called with the same set of arguments and global state. This means while it
cannot unwind exceptions by calling the ``C++`` exception throwing methods		cannot unwind exceptions by calling the ``C++`` exception throwing methods
(since they write to memory), there may be non-``C++`` mechanisms that throw		(since they write to memory), there may be non-``C++`` mechanisms that throw
exceptions without writing to LLVM visible memory.		exceptions without writing to LLVM visible memory.

On an argument, this attribute indicates that the function does not write		On an argument, this attribute indicates that the function does not write
through this pointer argument, even though it may write to the memory that		through this pointer argument, even though it may write to the memory that
the pointer points to.		the pointer points to.

		If a readonly function writes memory visible to the program, or
		has other side-effects, the behavior is undefined. If a function writes to
		a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``		``"stack-probe-size"``
This attribute controls the behavior of stack probes: either		This attribute controls the behavior of stack probes: either
the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.		the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
It defines the size of the guard region. It ensures that if the function		It defines the size of the guard region. It ensures that if the function
may use more stack space than the size of the guard region, stack probing		may use more stack space than the size of the guard region, stack probing
sequence will be emitted. It takes one required integer value, which		sequence will be emitted. It takes one required integer value, which
is 4096 by default.		is 4096 by default.

If a function that has a ``"stack-probe-size"`` attribute is inlined into		If a function that has a ``"stack-probe-size"`` attribute is inlined into
a function with another ``"stack-probe-size"`` attribute, the resulting		a function with another ``"stack-probe-size"`` attribute, the resulting
function has the ``"stack-probe-size"`` attribute that has the lower		function has the ``"stack-probe-size"`` attribute that has the lower
numeric value. If a function that has a ``"stack-probe-size"`` attribute is		numeric value. If a function that has a ``"stack-probe-size"`` attribute is
inlined into a function that has no ``"stack-probe-size"`` attribute		inlined into a function that has no ``"stack-probe-size"`` attribute
at all, the resulting function has the ``"stack-probe-size"`` attribute		at all, the resulting function has the ``"stack-probe-size"`` attribute
of the callee.		of the callee.
``"no-stack-arg-probe"``		``"no-stack-arg-probe"``
This attribute disables ABI-required stack probes, if any.		This attribute disables ABI-required stack probes, if any.
``writeonly``		``writeonly``
On a function, this attribute indicates that the function may write to but		On a function, this attribute indicates that the function may write to but
does not read from memory.		does not read from memory.

On an argument, this attribute indicates that the function may write to but		On an argument, this attribute indicates that the function may write to but
does not read through this pointer argument (even though it may read from		does not read through this pointer argument (even though it may read from
the memory that the pointer points to).		the memory that the pointer points to).

		If a writeonly function reads memory visible to the program, or
		has other side-effects, the behavior is undefined. If a function uses a
		value read from a writeonly pointer argument, the behavior is undefined.
``argmemonly``		``argmemonly``
This attribute indicates that the only memory accesses inside function are		This attribute indicates that the only memory accesses inside function are
loads and stores from objects pointed to by its pointer-typed arguments,		loads and stores from objects pointed to by its pointer-typed arguments,
with arbitrary offsets. Or in other words, all memory operations in the		with arbitrary offsets. Or in other words, all memory operations in the
function can refer to memory only using pointers based on its function		function can refer to memory only using pointers based on its function
arguments.		arguments.

Note that ``argmemonly`` can be used together with ``readonly`` attribute		Note that ``argmemonly`` can be used together with ``readonly`` attribute
in order to specify that function reads only from its arguments.		in order to specify that function reads only from its arguments.

		If an argmemonly function reads or writes memory other than the pointer
		arguments, or has other side-effects, the behavior is undefined.
``returns_twice``		``returns_twice``
This attribute indicates that this function can return twice. The C		This attribute indicates that this function can return twice. The C
``setjmp`` is an example of such a function. The compiler disables		``setjmp`` is an example of such a function. The compiler disables
some optimizations (like tail calls) in the caller of these		some optimizations (like tail calls) in the caller of these
functions.		functions.
``safestack``		``safestack``
This attribute indicates that		This attribute indicates that
`SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_		`SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
[746 lines not shown]
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,		:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`		:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe		may use the following flags to enable otherwise unsafe
floating-point transformations.		floating-point transformations.

``nnan``		``nnan``
No NaNs - Allow optimizations to assume the arguments and result are not		No NaNs - Allow optimizations to assume the arguments and result are not
NaN. Such optimizations are required to retain defined behavior over		NaN. Such optimizations are required to retain defined behavior over
NaNs, but the value of the result is undefined.		NaNs, but the value of the result is unspecified.

``ninf``		``ninf``
No Infs - Allow optimizations to assume the arguments and result are not		No Infs - Allow optimizations to assume the arguments and result are not
+/-Inf. Such optimizations are required to retain defined behavior over		+/-Inf. Such optimizations are required to retain defined behavior over
+/-Inf, but the value of the result is undefined.		+/-Inf, but the value of the result is unspecified.

``nsz``		``nsz``
No Signed Zeros - Allow optimizations to treat the sign of a zero		No Signed Zeros - Allow optimizations to treat the sign of a zero
argument or result as insignificant.		argument or result as insignificant.

``arcp``		``arcp``
Allow Reciprocal - Allow optimizations to use the reciprocal of an		Allow Reciprocal - Allow optimizations to use the reciprocal of an
argument rather than perform division.		argument rather than perform division.
[910 lines not shown; context: ``fpext (CST to TYPE)``]
Floating-point extend a constant to another type. The size of CST		Floating-point extend a constant to another type. The size of CST
must be smaller or equal to the size of TYPE. Both types must be		must be smaller or equal to the size of TYPE. Both types must be
floating-point.		floating-point.
``fptoui (CST to TYPE)``		``fptoui (CST to TYPE)``
Convert a floating-point constant to the corresponding unsigned		Convert a floating-point constant to the corresponding unsigned
integer constant. TYPE must be a scalar or vector integer type. CST		integer constant. TYPE must be a scalar or vector integer type. CST
must be of scalar or vector floating-point type. Both CST and TYPE		must be of scalar or vector floating-point type. Both CST and TYPE
must be scalars, or vectors of the same number of elements. If the		must be scalars, or vectors of the same number of elements. If the
value won't fit in the integer type, the results are undefined.		value won't fit in the integer type, the result is ``undef``.
``fptosi (CST to TYPE)``		``fptosi (CST to TYPE)``
Convert a floating-point constant to the corresponding signed		Convert a floating-point constant to the corresponding signed
integer constant. TYPE must be a scalar or vector integer type. CST		integer constant. TYPE must be a scalar or vector integer type. CST
must be of scalar or vector floating-point type. Both CST and TYPE		must be of scalar or vector floating-point type. Both CST and TYPE
must be scalars, or vectors of the same number of elements. If the		must be scalars, or vectors of the same number of elements. If the
value won't fit in the integer type, the results are undefined.		value won't fit in the integer type, the result is ``undef``.
``uitofp (CST to TYPE)``		``uitofp (CST to TYPE)``
Convert an unsigned integer constant to the corresponding		Convert an unsigned integer constant to the corresponding
floating-point constant. TYPE must be a scalar or vector floating-point		floating-point constant. TYPE must be a scalar or vector floating-point
type. CST must be of scalar or vector integer type. Both CST and TYPE must		type. CST must be of scalar or vector integer type. Both CST and TYPE must
be scalars, or vectors of the same number of elements. If the value		be scalars, or vectors of the same number of elements. If the value
won't fit in the floating-point type, the results are undefined.		won't fit in the floating-point type, the result is ``undef``.
``sitofp (CST to TYPE)``		``sitofp (CST to TYPE)``
Convert a signed integer constant to the corresponding floating-point		Convert a signed integer constant to the corresponding floating-point
constant. TYPE must be a scalar or vector floating-point type.		constant. TYPE must be a scalar or vector floating-point type.
CST must be of scalar or vector integer type. Both CST and TYPE must		CST must be of scalar or vector integer type. Both CST and TYPE must
be scalars, or vectors of the same number of elements. If the value		be scalars, or vectors of the same number of elements. If the value
won't fit in the floating-point type, the results are undefined.		won't fit in the floating-point type, the result is ``undef``.
``ptrtoint (CST to TYPE)``		``ptrtoint (CST to TYPE)``
Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.		Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``		``inttoptr (CST to TYPE)``
Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.		Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
This one is really dangerous!		This one is really dangerous!
``bitcast (CST to TYPE)``		``bitcast (CST to TYPE)``
Convert a constant, CST, to another TYPE.		Convert a constant, CST, to another TYPE.
The constraints of the operands are the same as those for the		The constraints of the operands are the same as those for the
[1,623 lines not shown]

.. _range-metadata:		.. _range-metadata:

'``range``' Metadata		'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of		``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value		integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. The ranges are		returned by the called function at this call site is in. If the loaded or
represented with a flattened list of integers. The loaded value or the value		returned value is not in the specified range, the behavior is undefined. The
returned is known to be in the union of the ranges defined by each consecutive		ranges are represented with a flattened list of integers. The loaded value or
pair. Each pair has the following properties:		the value returned is known to be in the union of the ranges defined by each
		consecutive pair. Each pair has the following properties:

- The type must match the type loaded by the instruction.		- The type must match the type loaded by the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.		- The pair ``a,b`` represents the range ``[a,b)``.
- Both ``a`` and ``b`` are constants.		- Both ``a`` and ``b`` are constants.
- The range is allowed to wrap.		- The range is allowed to wrap.
- The range should not represent the full or empty set. That is,		- The range should not represent the full or empty set. That is,
``a!=b``.		``a!=b``.
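
For illustration only (this is not part of the patch under review), a minimal sketch of ``range`` metadata attached to a load. The function name ``@load_flag`` and the chosen range are hypothetical, and the typed-pointer syntax is the one in use at the time of this review. Under the wording proposed on the right-hand side, executing this load when the byte in memory is outside ``[0,2)`` would be undefined behavior; under the poison alternative raised in the inline comments, the load would yield poison instead.

.. code-block:: llvm

    define i8 @load_flag(i8* %p) {
      ; The metadata asserts that %v is 0 or 1.
      %v = load i8, i8* %p, align 1, !range !0
      ret i8 %v
    }

    ; A single pair "a, b" describing the half-open range [0, 2).
    !0 = !{i8 0, i8 2}
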

[2,338 lines not shown]
the position from which to extract the element. The index may be a		the position from which to extract the element. The index may be a
variable of any integer type.		variable of any integer type.

Semantics:		Semantics:
""""""""""		""""""""""

The result is a scalar of the same type as the element type of ``val``.		The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``		Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val``, the results are undefined.		exceeds the length of ``val``, the result is ``undef``.
		nlopes (inline comment): Can it be poison instead?
		efriedma (author, inline comment): Probably, yes; it's not expected, in the same way as an out-of-range shift.
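
For illustration only (not part of the patch), a minimal sketch of the case being discussed: an ``extractelement`` whose constant index lies past the end of the vector. The function name ``@extract_oob`` is hypothetical. The instruction is well-formed IR; the open question in this thread is only whether ``%r`` should be specified as ``undef`` (as proposed in the diff) or as poison (as suggested by the reviewer).

.. code-block:: llvm

    define i32 @extract_oob(<4 x i32> %vec) {
      ; Index 7 exceeds the length of the 4-element vector.
      %r = extractelement <4 x i32> %vec, i32 7
      ret i32 %r
    }
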

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = extractelement <4 x i32> %vec, i32 0 ; yields i32		<result> = extractelement <4 x i32> %vec, i32 0 ; yields i32

[24 lines not shown]
is an index indicating the position at which to insert the value. The		is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.		index may be a variable of any integer type.

Semantics:		Semantics:
""""""""""		""""""""""

The result is a vector of the same type as ``val``. Its element values		The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value		are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val``, the results are		``elt``. If ``idx`` exceeds the length of ``val``, the result is
undefined.		``undef``.

Example:		Example:
""""""""		""""""""

.. code-block:: text		.. code-block:: text

<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>		<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>

[193 lines not shown]
zero, the target can choose to align the allocation on any convenient		zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.		boundary compatible with the type.

'``type``' may be any sized type.		'``type``' may be any sized type.

Semantics:		Semantics:
""""""""""		""""""""""

Memory is allocated; a pointer is returned. The operation is undefined		Memory is allocated; a pointer is returned. The behavior is undefined
if there is insufficient stack space for the allocation. '``alloca``'d		if there is insufficient stack space for the allocation. (Some targets
		provide stronger guarantees for what happens on a stack overflow; see
		also the ``"probe-stack"`` attribute.) '``alloca``'d
memory is automatically released when the function returns. The		memory is automatically released when the function returns. The
'``alloca``' instruction is commonly used to represent automatic		'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. When the function returns		variables that must have an address available. When the function returns
(either with the ``ret`` or ``resume`` instructions), the memory is		(either with the ``ret`` or ``resume`` instructions), the memory is
reclaimed. Allocating zero bytes is legal, but the result is undefined.		reclaimed. Allocating zero bytes is legal, but the returned pointer is
The order in which memory is allocated (ie., which way the stack grows)		is ``undef``. The order in which memory is allocated (ie., which way the
		nlopes (inline comment): Is undef required by the C/C++ standards or can it be poison?
		efriedma (author, inline comment): Actually, thinking about this a bit more, we might need a defined, non-null value. C and C++ both prohibit zero-size types (arrays can't be zero length, and structs have at least one member in C, and implicit padding in C++). But there's a GNU extension to allow zero-length arrays. That extension is usually only used as a replacement for flexible array members in pre-C99 code, but we allow such arrays anywhere. And given zero-length array variables are allowed, the natural way to lower it is a zero-byte alloca. This is what clang currently emits. Given such variables are allowed, we probably need to ensure `&x == &x`, and `&x != 0`, which implies a value that is not poison or undef.
is not specified.		stack grows) is not specified.
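
For illustration only (not part of the patch), a minimal sketch of the zero-byte ``alloca`` that the inline comment above describes as the natural lowering of a GNU zero-length array variable. The function name ``@zero_length_array_is_nonnull`` is hypothetical, and the typed-pointer syntax is the one in use at the time of this review. If the returned pointer were ``undef`` or poison, the comparison below could not be relied upon to be true, which is the concern raised for ``&x != 0``.

.. code-block:: llvm

    define i1 @zero_length_array_is_nonnull() {
      ; Allocates zero bytes: the array type has no elements.
      %x = alloca [0 x i32]
      %p = bitcast [0 x i32]* %x to i8*
      ; Models the source-level check "&x != 0".
      %nonnull = icmp ne i8* %p, null
      ret i1 %nonnull
    }
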

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: llvm

%ptr = alloca i32 ; yields i32*:ptr		%ptr = alloca i32 ; yields i32*:ptr
%ptr = alloca i32, i32 4 ; yields i32*:ptr		%ptr = alloca i32, i32 4 ; yields i32*:ptr
[64 lines not shown]
generator may select special instructions to save cache bandwidth, such		generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.		as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single		The optional ``!invariant.load`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no		metadata name ``<index>`` corresponding to a metadata node with no
entries. If a load instruction tagged with the ``!invariant.load``		entries. If a load instruction tagged with the ``!invariant.load``
metadata is executed, the optimizer may assume the memory location		metadata is executed, the optimizer may assume the memory location
referenced by the load contains the same value at all points in the		referenced by the load contains the same value at all points in the
program where the memory location is known to be dereferenceable.		program where the memory location is known to be dereferenceable;
		otherwise, the behavior is undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name		The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node with no entries.		``<index>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata.		See ``invariant.group`` metadata.

The optional ``!nonnull`` metadata must reference a single		The optional ``!nonnull`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no		metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the		entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to		instruction tells the optimizer that the value loaded is known to
never be null. This is analogous to the ``nonnull`` attribute		never be null. If the value is null, the behavior is undefined. This is
on parameters and return values. This metadata can only be applied		analogous to the ``nonnull`` attribute on parameters and return values.
to loads of a pointer type.		This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata		The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``		name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry. The existence of the ``!dereferenceable`` metadata on the instruction		entry. The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.		tells the optimizer that the value loaded is known to be dereferenceable at
		any later point in the program.
The number of bytes known to be dereferenceable is specified by the integer		The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ''dereferenceable''		value in the metadata node. This is analogous to the ''dereferenceable''
attribute on parameters and return values. This metadata can only be applied		attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.		to loads of a pointer type.

The optional ``!dereferenceable_or_null`` metadata must reference a single		The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one		metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the		``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either		instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.		dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer		The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ''dereferenceable_or_null''		value in the metadata node. This is analogous to the ''dereferenceable_or_null''
attribute on parameters and return values. This metadata can only be applied		attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.		to loads of a pointer type.

The optional ``!align`` metadata must reference a single metadata name		The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.		``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the		The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified		optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.		by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ''align'' attribute on parameters and return values.		This is analogous to the ''align'' attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type.		This metadata can only be applied to loads of a pointer type. If the returned
		value is not appropriately aligned, the behavior is undefined.
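
For illustration only (not part of the patch), a minimal sketch of a pointer load carrying several of the metadata kinds described above. The function name ``@load_ptr`` and the byte counts are hypothetical, and the typed-pointer syntax is the one in use at the time of this review. Under the wording proposed on the right-hand side, loading a null pointer here, or one that is not 8-byte aligned, would be undefined behavior.

.. code-block:: llvm

    define i8* @load_ptr(i8** %pp) {
      %p = load i8*, i8** %pp, align 8, !nonnull !0, !dereferenceable !1, !align !2
      ret i8* %p
    }

    !0 = !{}        ; !nonnull takes a node with no entries
    !1 = !{i64 16}  ; the loaded pointer is dereferenceable for 16 bytes
    !2 = !{i64 8}   ; the loaded pointer is aligned to 8 bytes (a power of 2)
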

Semantics:		Semantics:
""""""""""		""""""""""

The location of memory pointed to is loaded. If the value being loaded		The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the		is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For		minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a		example, loading an ``i24`` reads at most three bytes. When loading a
[447 lines not shown]
information.		information.

If the ``inrange`` keyword is present before any index, loading from or		If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined		storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of		behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a		the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations		pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with		involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons		the ``inrange`` keyword is ``undef``, with the exception of comparisons
		nlopes (inline comment): GEP is difficult. I suggest we leave the GEP inbounds/inrange discussion for a separate patch.
in the case where both operands are in the range of the element selected		in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of		by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed		that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.		in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight		The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.		into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

[308 lines not shown]
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer		``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``		type with the same number of elements as ``ty``

Semantics:		Semantics:
""""""""""		""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point		The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)		<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the results		unsigned integer value. If the value cannot fit in ``ty2``, the result
are undefined.		is ``undef``.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: llvm

%X = fptoui double 123.0 to i32 ; yields i32:123		%X = fptoui double 123.0 to i32 ; yields i32:123
%Y = fptoui float 1.0E+300 to i1 ; yields undefined:1		%Y = fptoui float 1.0E+300 to i1 ; yields undefined:1
[24 lines not shown]
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer		``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``		type with the same number of elements as ``ty``

Semantics:		Semantics:
""""""""""		""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point		The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)		<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the results		signed integer value. If the value cannot fit in ``ty2``, the result
are undefined.		is ``undef``.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: llvm

%X = fptosi double -123.0 to i32 ; yields i32:-123		%X = fptosi double -123.0 to i32 ; yields i32:-123
%Y = fptosi float 1.0E-247 to i1 ; yields undefined:1		%Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
[24 lines not shown]
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point		``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``		type with the same number of elements as ``ty``

Semantics:		Semantics:
""""""""""		""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned		The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point		integer quantity and converts it to the corresponding floating-point
value. If the value cannot fit in the floating-point value, the results		value. The conversion uses round-to-nearest rounding if the input cannot
are undefined.		be represented exactly. If the input cannot fit in the floating-point
		value, the result is ``undef``.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: llvm

%X = uitofp i32 257 to float ; yields float:257.0		%X = uitofp i32 257 to float ; yields float:257.0
%Y = uitofp i8 -1 to double ; yields double:255.0		%Y = uitofp i8 -1 to double ; yields double:255.0
[22 lines not shown]
``ty2``, which must be an :ref:`floating-point <t_floating>` type. If		``ty2``, which must be an :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point		``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``		type with the same number of elements as ``ty``

Semantics:		Semantics:
""""""""""		""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer		The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If		quantity and converts it to the corresponding floating-point value. The
the value cannot fit in the floating-point value, the results are		conversion uses round-to-nearest rounding if the input cannot be represented
undefined.		exactly. If the input cannot fit in the floating-point value, the result is
		``undef``.

Example:		Example:
""""""""		""""""""

.. code-block:: llvm		.. code-block:: llvm

%X = sitofp i32 257 to float ; yields float:257.0		%X = sitofp i32 257 to float ; yields float:257.0
%Y = sitofp i8 -1 to double ; yields double:-1.0		%Y = sitofp i8 -1 to double ; yields double:-1.0
[6,039 lines not shown]