
Details

Reviewers

efriedma
lebedev.ri
samparker

Commits

rG880c35a55495: [HardwareLoops] LangRef Intrinsic descriptions

Summary

As pointed out by @lebedev.ri in D79100, intrinsic descriptions of the HardwareLoop intrinsics were missing in LangRef. This adds these descriptions.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

SjoerdMeijer created this revision. May 20 2020, 11:20 AM

Herald added a project: Restricted Project. May 20 2020, 11:20 AM

Herald added subscribers: jdoerfert, hiraditya.

simoll added a subscriber: simoll. May 20 2020, 11:31 AM

Changes outside LangRef LGTM; please commit separately.

llvm/docs/LangRef.rst
14900	I think each of these intrinsics needs a "semantics" section that specifies the bare semantics of each intrinsic, without mentioning anything about loops/control flow, assuming that's possible.
14919	Signature here is wrong?

samparker added inline comments. May 20 2020, 11:17 PM

llvm/docs/LangRef.rst
14899	Why the mention of vector?
14905	Maybe use 'trip count' instead to match the internal terminology that we use?
14925	vector again
14927	I think these ones are always placed in the preheader's (unique?) predecessor.
14957	Probably worth stating that it doesn't wrap.

And thanks for doing this!

SjoerdMeijer mentioned this in rGb0614509a0f1: [HardwareLoops] llvm.loop.decrement.reg definition. May 21 2020, 3:13 AM

Thanks for looking at this and your comments.

Changes outside LangRef have been committed in rGb0614509a0f1.

I will pick this up soon and then address the comments.

I think this addresses the comments.

With the one change, LGTM. Thanks again.

llvm/docs/LangRef.rst
14928	Missed this bit, these should return an i1.

This revision is now accepted and ready to land. May 21 2020, 6:34 AM

efriedma added inline comments. May 21 2020, 12:13 PM

llvm/docs/LangRef.rst
14990	"but are not allowed to duplicate these intrinsic calls"? Run-on sentence, and I'm not sure what isn't allowed to duplicate the intrinsic calls.
15022	Where does "llvm.loop.decrement" retrieve the loop iteration counter from? I think we need to describe this a bit more carefully. For the others, there's sort of a trivial lowering back to plain LLVM IR, but here I'm not sure what that looks like.

Thanks for catching that unclear part of the llvm.loop.decrement description, hopefully that's clearer now.

I have left the remarks about NoDuplicate in (for now) and fixed that sentence, because I think the idea is that we don't want, for example, a block to be cloned/copied, which would (partly) destroy the hardware loop structure.

The descriptions here related to llvm.loop.decrement still aren't really sufficient. The description of llvm.set.loop.iterations says "It's a hint to the backend". But llvm.loop.decrement is apparently depending on llvm.set.loop.iterations to have some sort of side-effect.

Really, I suspect the problem here is that llvm.loop.decrement is actually just unsound, and we avoid any issues with it simply by running it really late in the pass pipeline. But if that's the case, I'd prefer to state that explicitly.
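
To make the concern concrete, here is a minimal sketch of the non-PHI form, assuming the signatures given in the diff. It is not taken from the patch; the function and value names are hypothetical. Note that no SSA value connects the call that sets the trip count to the call that decrements it, so the counter exists only as implicit state between the two intrinsics:

```llvm
declare void @llvm.set.loop.iterations.i32(i32)
declare i1 @llvm.loop.decrement.i32(i32)

; Hypothetical example: returns 3 * %n using a hardware-loop style counter.
define i32 @triple_implicit(i32 %n) {
entry:
  %nonzero = icmp ne i32 %n, 0
  br i1 %nonzero, label %preheader, label %exit

preheader:
  ; Hands the trip count to the backend; produces no value.
  call void @llvm.set.loop.iterations.i32(i32 %n)
  br label %loop

loop:
  %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
  %acc.next = add i32 %acc, 3
  ; Only the step (1) is an operand; the counter itself is implicit.
  ; Per the description in the diff, the result is true while the loop continues.
  %continue = call i1 @llvm.loop.decrement.i32(i32 1)
  br i1 %continue, label %loop, label %exit

exit:
  %res = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  ret i32 %res
}
```

Because the dependency is invisible in the IR, a transformation that moved, removed, or duplicated either call would have no data-flow edge telling it that the pair belongs together.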

llvm/docs/LangRef.rst
15005	"llvm.loop.decrement.i64", not "llvm.loop.decrement.reg.i64"

In D80316#2051371, @efriedma wrote:

The descriptions here related to llvm.loop.decrement still aren't really sufficient. The description of llvm.set.loop.iterations says "It's a hint to the backend". But llvm.loop.decrement is apparently depending on llvm.set.loop.iterations to have some sort of side-effect.

Really, I suspect the problem here is that llvm.loop.decrement is actually just unsound, and we avoid any issues with it simply by running it really late in the pass pipeline. But if that's the case, I'd prefer to state that explicitly.

Yep, agreed. I was unfamiliar with llvm.loop.decrement, looked into it today, but was struggling with it. I will have a look again to see if I haven't missed anything. And llvm.set.loop.iterations is probably very dubious too, because it just sits there, floating around. As you said, we get away with this because this is run really late. Probably slightly better is to let llvm.set.loop.iterations produce a value, so that it probably kind of models a move of the iteration count to the iteration count register, which can be picked up. But I don't have the bandwidth to do this right now: first I would like to document the current state of the art here, then finish D79100 and D79783, and after that I would like to return to this. So I will add a statement about unsoundness of this.

Probably slightly better is to let llvm.set.loop.iterations produce a value, so that it probably kind of models a move of the iteration count to the iteration count register, which can be picked up.

If you were actually looking at changing this, probably simplest would be to kill off llvm.loop.decrement completely, and convert the PPC hardware loops to work more like the ARM hardwareloops. Without llvm.loop.decrement, each intrinsic has an obvious conversion: llvm.set.loop.iterations is a no-op, llvm.test.set.loop.iterations is an icmp, llvm.loop.decrement.reg is a sub.
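
The conversion described above can be sketched as follows for the PHI-based form, assuming the signatures from the diff with `llvm.test.set.loop.iterations` returning an `i1` as requested in the review. The function and value names are hypothetical, and the comments give the plain-IR instruction each call would correspond to under this reading:

```llvm
declare i1 @llvm.test.set.loop.iterations.i32(i32)
declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

; Hypothetical example: returns 3 * %n, with the counter carried in a PHI.
define i32 @triple_phi(i32 %n) {
entry:
  ; Plain IR equivalent: %enter = icmp ne i32 %n, 0
  %enter = call i1 @llvm.test.set.loop.iterations.i32(i32 %n)
  br i1 %enter, label %preheader, label %exit

preheader:
  br label %loop

loop:
  %count = phi i32 [ %n, %preheader ], [ %count.next, %loop ]
  %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
  %acc.next = add i32 %acc, 3
  ; Plain IR equivalent: %count.next = sub i32 %count, 1
  ; (the diff additionally states that this subtraction must not wrap)
  %count.next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
  %continue = icmp ne i32 %count.next, 0
  br i1 %continue, label %loop, label %exit

exit:
  %res = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  ret i32 %res
}
```

Here every intrinsic result feeds an ordinary SSA use, which is why each call has a direct replacement, unlike `llvm.loop.decrement`.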

llvm/docs/LangRef.rst
14880	Thinking about it a bit more, I'm not sure we really want to promise these are stable. They're sort of an internal implementation detail, and frontend and mid-level optimizations don't really have any business generating them. Maybe explicitly state these may be modified in the future, and are not intended to be used outside the backend.

SjoerdMeijer added inline comments. May 27 2020, 12:36 AM

llvm/docs/LangRef.rst
14880	Agreed, and thanks, will add this. For exactly these reasons I could justify to myself that we hadn't documented/exposed them before, but that is not entirely correct...

Added a paragraph that these intrinsics should not be used outside the backend.

LGTM

Closed by commit rG880c35a55495: [HardwareLoops] LangRef Intrinsic descriptions (authored by SjoerdMeijer). May 28 2020, 1:02 AM

This revision was automatically updated to reflect the committed changes.

Diff 266764

llvm/docs/LangRef.rst

	Examples:
	"""""""""

	.. code-block:: llvm

	%r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


				Hardware-Loop Intrinsics
				------------------------

				LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
				hints to the backend, which is required to lower these intrinsics further to
				target-specific instructions, or revert the hardware-loop to a normal loop if
				target-specific restrictions are not met and a hardware-loop can't be generated.

				These intrinsics may be modified in the future and are not intended to be used
				outside the backend. Thus, front-end and mid-level optimizations should not be
				generating these intrinsics.


				'``llvm.set.loop.iterations.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic.

				::

				declare void @llvm.set.loop.iterations.i32(i32)
				declare void @llvm.set.loop.iterations.i64(i64)

				Overview:
				"""""""""

				The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
				hardware-loop trip count. They are placed in the loop preheader basic block and
				are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
				instructions.

				Arguments:
				""""""""""

				The integer operand is the loop trip count of the hardware-loop, and thus
				not e.g. the loop back-edge taken count.

				Semantics:
				""""""""""

				The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
				on their operand. They are a hint to the backend, which can use them to set up
				the hardware-loop count with a target-specific instruction, usually a move of
				this value to a special register or a hardware-loop instruction.

				'``llvm.test.set.loop.iterations.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic.

				::

				declare i1 @llvm.test.set.loop.iterations.i32(i32)
				declare i1 @llvm.test.set.loop.iterations.i64(i64)

				Overview:
				"""""""""

				The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
				loop trip count, and also test that the given count is not zero, allowing
				it to control entry to a while-loop. They are placed in the loop preheader's
				predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
				optimizers duplicating these instructions.

				Arguments:
				""""""""""

				The integer operand is the loop trip count of the hardware-loop, and thus
				not e.g. the loop back-edge taken count.

				Semantics:
				""""""""""

				The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
				arithmetic on their operand. They are a hint to the backend, which can use them
				to set up the hardware-loop count with a target-specific instruction, usually a
				move of this value to a special register or a hardware-loop instruction.

				'``llvm.loop.decrement.reg.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic.

				::

				declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
				declare i64 @llvm.loop.decrement.reg.i64(i64, i64)

				Overview:
				"""""""""

				The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
				iteration counter and return an updated value that will be used in the next
				loop test check.

				Arguments:
				""""""""""

				Both arguments must have identical integer types. The first operand is the
				loop iteration counter. The second operand is the maximum number of elements
				processed in an iteration.

				Semantics:
				""""""""""

				The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
				two operands, which is not allowed to wrap. They return the remaining number of
				iterations still to be executed, and can be used together with a ``PHI``,
				``ICMP`` and ``BR`` to control the number of loop iterations executed.
				Optimisations are allowed to treat these intrinsics as a ``SUB``, and they are
				supported by SCEV, so it's the backend's responsibility to handle cases where
				they may be optimised. These intrinsics are marked as ``IntrNoDuplicate`` to
				avoid optimizers duplicating these instructions.


				'``llvm.loop.decrement.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic.

				::

				declare i1 @llvm.loop.decrement.i32(i32)
				declare i1 @llvm.loop.decrement.i64(i64)

				Overview:
				"""""""""

				The HardwareLoops pass allows the loop decrement value to be specified with an
				option. It defaults to a loop decrement value of 1, but it can be an unsigned
				integer value provided by this option. The '``llvm.loop.decrement.*``'
				intrinsics decrement the loop iteration counter with this value, and return a
				false predicate if the loop should exit, and true otherwise.
				This is emitted if the loop counter is not updated via a ``PHI`` node, which
				can also be controlled with an option.

				Arguments:
				""""""""""

				The integer argument is the loop decrement value used to decrement the loop
				iteration counter.

				Semantics:
				""""""""""

				The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
				counter with the given loop decrement value, and return false if the loop
				should exit. This ``SUB`` is not allowed to wrap. The result is a condition
				that is used by the conditional branch controlling the loop.


	Experimental Vector Reduction Intrinsics
	----------------------------------------

	Horizontal reductions of vectors can be expressed using the following
	intrinsics. Each one takes a vector operand as an input and applies its
	respective operation across all elements of the vector, returning a single
	scalar result of the same element type.


This is an archive of the discontinued LLVM Phabricator instance.

[HardwareLoops] Intrinsic LangRef descriptions
ClosedPublic
