This is an archive of the discontinued LLVM Phabricator instance.

Differential D88962

[SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute
ClosedPublic

Authored by david-arm on Oct 7 2020, 6:13 AM.

Details

Reviewers

sdesmalen
ctetreau
paulwalker-arm
kmclaughlin
efriedma
vkmr
fhahn
jdoerfert
SjoerdMeijer

Commits

rG71bd59f0cb6d: [SVE] Add support for scalable vectors with vectorize.scalable.enable loop…

Summary

In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:

br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
...
!2 = !{!2, !3, !4}
!3 = !{!"llvm.loop.vectorize.width", i32 8}
!4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}

Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.
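
As a rough illustration of how the two hints relate, the sketch below folds a width and a scalable flag into one value. `SketchElementCount` and `combineHints` are hypothetical names invented for this example; they are not LLVM's `ElementCount` or any function from the patch.

```cpp
// Hypothetical stand-in for an element count: a minimum lane count plus a
// flag saying whether that count is scaled by a runtime factor.
struct SketchElementCount {
  unsigned MinLanes; // models llvm.loop.vectorize.width
  bool Scalable;     // models llvm.loop.vectorize.scalable.enable
};

// Combine the two loop hints into one value. Passing Scalable = false
// models both an explicit `i1 false` and the fixed-width default.
inline SketchElementCount combineHints(unsigned Width, bool Scalable) {
  SketchElementCount EC;
  EC.MinLanes = Width;
  EC.Scalable = Scalable;
  return EC;
}
```

Under these assumptions, `combineHints(8, true)` models the metadata shown in the summary: a minimum of 8 lanes, scalable.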

Diff Detail

Event Timeline

david-arm created this revision. Oct 7 2020, 6:13 AM

Herald added a reviewer: efriedma. Oct 7 2020, 6:13 AM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, psnobl, hiraditya, tschuett.

david-arm requested review of this revision. Oct 7 2020, 6:13 AM

Harbormaster completed remote builds in B74264: Diff 296653. Oct 7 2020, 6:26 AM

david-arm added a reviewer: vkmr. Oct 7 2020, 6:32 AM

Could you also update the documentation for llvm.loop.vectorize.width? https://llvm.org/docs/LangRef.html#llvm-loop-vectorize-width-metadata

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
54	might be good to add comments indicating what the fields are used for.
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
67	are you anticipating additional hints with ElementCount metadata? If not, it might be simpler to just deal with HK_WIDTH up front and validate Val[0] and Val[1] here and leave the code handling the other cases mostly unchanged? Or maybe use named variables instead of an array, to make things a bit clearer?

This revision now requires changes to proceed. Oct 7 2020, 7:23 AM

david-arm added inline comments. Oct 8 2020, 12:52 AM

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
67	So the reason I structured it like this is because we are still maintaining backwards compatibility with the old style and allowing single integer constants instead of the node. I could deal with the HK_WIDTH up front, but I'd still need all this code. Would you be ok with using named variables instead?

david-arm updated this revision to Diff 296895. Oct 8 2020, 1:51 AM

Herald added a reviewer: jdoerfert. Oct 8 2020, 1:51 AM

david-arm marked 2 inline comments as done. Oct 8 2020, 1:52 AM

I've updated the documentation for the new attribute format.
Added comments to the union members.
Renamed the variables in validateAndSet.

david-arm added a child revision: D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors. Oct 8 2020, 3:00 AM

sdesmalen added inline comments. Oct 16 2020, 1:29 AM

llvm/docs/LangRef.rst

5921

Rather than talking about two forms, is it sufficient to say:

The vector width is an ElementCount tuple, represented in Metadata as:

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.width", !1}
   !1 = !{i32 4, i32 1}

where ``i32 4`` specifies the vector width and ``i32 1`` indicates if the vectorization factor is scalable, meaning that the loop-vectorizer should use vector-length agnostic vectorization.

For fixed-width vectorization-factors, a short-hand `i32` operand for llvm.loop.vectorize.width is also supported.

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.width", i32 4}

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

nit: For the cases below, is it worth using something like:

auto MaySetIntValue = [this](int IntVal, bool Condition) { if (Condition) this->Value.U32 = IntVal; return Condition; };
auto MaySetECValue = [this](ElementCount EC, bool Condition) { if (Condition) this->Value.EC = EC; return Condition; };

switch (Kind) {
case HK_WIDTH:
  return MaySetECValue(ElementCount::get(IntVal, IsScalable), isPowerOf2_32(IntVal) && IntVal <= VectorizerParams::MaxVectorWidth);
case HK_UNROLL:
  return MaySetIntValue(IntVal, isPowerOf2_32(IntVal) && IntVal <= MaxInterleaveFactor);
[...]
}
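
The conditional-setter idea in this suggestion can be tried outside LLVM with a small self-contained sketch. Everything here is hypothetical and written only for illustration: `SketchHint`, `SketchHintKind`, the two limit constants, and the local `isPowerOf2` helper stand in for the real `Hint` struct, `HintKind`, the vectorizer limits, and `isPowerOf2_32`.

```cpp
// Hypothetical limits, chosen for this sketch only.
static const unsigned SketchMaxVectorWidth = 64;
static const unsigned SketchMaxInterleave = 16;

enum SketchHintKind { SK_WIDTH, SK_UNROLL };

struct SketchHint {
  SketchHintKind Kind;
  unsigned Value;
  bool Scalable;

  explicit SketchHint(SketchHintKind K) : Kind(K), Value(0), Scalable(false) {}

  // True for 1, 2, 4, 8, ...; false for 0 and for non-powers of two.
  static bool isPowerOf2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

  // Store the value only if it passes the kind-specific check, and report
  // whether it was accepted. A rejected value leaves the hint untouched.
  bool validateAndSet(unsigned IntVal, bool IsScalable) {
    auto SetIf = [this](unsigned V, bool S, bool Condition) -> bool {
      if (Condition) {
        Value = V;
        Scalable = S;
      }
      return Condition;
    };
    switch (Kind) {
    case SK_WIDTH:
      return SetIf(IntVal, IsScalable,
                   isPowerOf2(IntVal) && IntVal <= SketchMaxVectorWidth);
    case SK_UNROLL:
      return SetIf(IntVal, false,
                   isPowerOf2(IntVal) && IntVal <= SketchMaxInterleave);
    }
    return false;
  }
};
```

Keeping the check and the store in one helper means each `case` reads as a single validation rule, which is the readability gain the nit is after.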

Updated documentation for vectorize_width loop attribute.
Added lambda functions to be used when validating and setting loop hint attributes.

LGTM with nits addressed. @fhahn are you happy with the changes?

llvm/docs/LangRef.rst
5909	nit: s/i32 1/i1 true/
5911	nit: s/vector width/minimum known vector width/ s/and `i32 1`/and the non-zero value `i1 true`/
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
61	nit: `s/unsigned IsScalable = 0;/bool IsScalable = false;/`
79	nit: `s/Maybe/Conditionally/`
85	nit: this is only used once, so better to inline.
llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	nit: `s/i32 0/i1 false/`

david-arm added inline comments. Oct 27 2020, 2:56 AM

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
61	Given that it is possible to specify this as "i32" type, even if the docs say "i1", are you saying that it's fine to specify the vector width as "i32 4, i32 345", for example? In this example with your changes "i32 345" will be treated as scalable = true. This is the reason I had this value as i32, in order to capture odd cases.
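
One way such odd cases might be caught is to keep the raw integer and accept only 0 or 1, as in the sketch below. `parseScalableFlag` is a hypothetical helper invented for this example and is not part of the patch.

```cpp
// Returns true and writes IsScalable only when RawFlag is exactly 0 or 1.
// Any other integer (such as 345) is rejected instead of being read as true,
// and IsScalable is left unchanged.
inline bool parseScalableFlag(unsigned RawFlag, bool &IsScalable) {
  if (RawFlag > 1)
    return false;
  IsScalable = (RawFlag == 1);
  return true;
}
```

Converting straight to `bool` would lose the information needed for this check, which is the concern raised in the comment.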

In D88962#2354863, @sdesmalen wrote:

LGTM with nits addressed. @fhahn are you happy with the changes?

LG in general, thanks for the updates.

llvm/docs/LangRef.rst
5911	nit: I'd start with saying that the first value of the tuple is the vector width and the second value indicates whether the vectorization factor is scalable and then follow with the example saying this means min vector width of 4 & scalable vectorization

david-arm added inline comments. Oct 27 2020, 7:43 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	If I make this change the test fails. We will still always print out "i32 0". Is there a way to control the format generated?

david-arm added inline comments. Oct 27 2020, 7:44 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	Sorry I just realised this is an input to the test, rather than testing the output! Please ignore my comment.

sdesmalen added a child revision: D90342: [POC][LoopVectorizer] Propagate ElementCount to interfaces in preparation for scalable auto-vec. Oct 28 2020, 2:12 PM

Updated documentation.
Addressed review comments.

david-arm marked 9 inline comments as done. Oct 29 2020, 9:39 AM

Hi @fhahn are you happy with the changes now and happy to accept them?

This comment has been deleted.

llvm/docs/LangRef.rst
5904	ElementCount is an internal name of the LLVM codebase I guess and nothing defines it in LangRef? I think it would be better to drop it, because here it is just a regular metadata tuple and you already outline what the fields hold.
llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
63	nit: using `{0}` seems inconsistent with the other braces used here?
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
66	might be good to add a comment that we are looking for a (width, isScalable) tuple here.
267	`C` is only used in the condition, maybe just check `mdconst::dyn_extract<ConstantInt>(Arg)` in the if? But now that we pass the metadata node to validateAndSet, do we actually need the check here or can we just do all the validation in the `validateAndSet`? Seems like the time-savings by exiting a bit earlier would be very minor.
llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	also needs a test which actually sets `isScalable` to `true`?

david-arm marked an inline comment as done. Nov 5 2020, 4:38 AM

david-arm added inline comments.

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
63	Good spot. Since I'm initialising Value explicitly below I can simply remove "Value{0}, " entirely.
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
267	Sure, sounds sensible!
llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	Unfortunately I can't do that at the moment because the vectorizer crashes when trying to perform scalable vectorisation. This is why I only tested the 'false' case. In theory I could create a negative test that tests the vectoriser crashes, which we could remove later once we support vectorisation. Any thoughts @fhahn or @sdesmalen ?

david-arm updated this revision to Diff 303134. Nov 5 2020, 8:50 AM

david-arm marked an inline comment as done.

david-arm marked 4 inline comments as done.

sdesmalen added inline comments. Nov 5 2020, 8:59 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	The proof of concept for scalable loop vectorization (D90343) depends on this patch and shows the `isScalable = true` case does something sensible with this information. Hopefully that is sufficient for now while we work on actually implementing scalable auto-vec support in the LoopVectorizer.

[this time adding the comment to the right patch]

Hi @david-arm I just found that two uses of llvm.loop.vectorize.width are not yet updated.

WarnMissedTransforms.cpp in warnAboutLeftoverTransformations.
LoopUtils.cpp in llvm::hasVectorizeTransformation.

The cases seem quite trivial to fix up, can you include those changes in this patch?

Fixed up some remaining vectorize.width attribute cases.

fhahn added inline comments. Nov 9 2020, 12:53 PM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	Hm, I am a bit reluctant to add something to LangRef which causes passes to crash on valid IR. Would it be possible to either convert the scalable VF into a fixed one (replacing n with 1 should be valid?), or just bail out on a scalable VF, without too much effort?

sdesmalen added inline comments. Nov 9 2020, 1:40 PM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	> Hm, I am a bit reluctant to add something to LangRef which causes passes to crash on valid IR. Would it be possible to either convert the scalable VF into a fixed one (replacing n with 1 should be valid?), or just bail out on a scalable VF, without too much effort?

That would lead to a bit of a chicken/egg problem, because we're adding this feature so we can incrementally make the loop vectorizer cope with scalable vectors. There are currently too many code-paths in there to fix up in one go. D90343 uses this metadata to enable vectorization of an individual loop with scalable vectors and gives us a route to enable and test individual loops in unit tests or in existing code (using the Clang attribute in D89031). The fact that the vectorizer currently crashes on most inputs is a bug, it just happens to be a bug we know about. It is also a temporary broken state that we will want to fix as soon as possible. Is it not possible to add a line to the LangRef saying that this is experimental until the loop vectorizer supports scalable vectors? (and we remove that line when the vectorizer is stable enough).
llvm/test/Transforms/LoopVectorize/no_array_bounds2.ll
3 ↗	(On Diff #303482)	Can you add a description of what this file is testing? For the file name itself, is there a better name than `no_array_bounds2.ll` ?

fhahn added inline comments. Nov 10 2020, 3:20 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	> That would lead to a bit of a chicken/egg problem, because we're adding this feature so we can incrementally make the loop vectorizer cope with scalable vectors. There are currently too many code-paths in there to fix up in one go. D90343 uses this metadata to enable vectorization of an individual loop with scalable vectors and gives us a route to enable and test individual loops in unit tests or in existing code (using the Clang attribute in D89031). The fact that the vectorizer currently crashes on most inputs is a bug, it just happens to be a bug we know about. It is also a temporary broken state that we will want to fix as soon as possible.

That makes sense, my suggestion was to just convert the scalable user VF to a fixed one after getting the hint instead of crashing. As in

    // Get user vectorization factor and interleave count.
    ElementCount UserVF = Hints.getWidth();
  + if (UserVF.isScalable()) {
  +   // TODO: Use scalable user VF, once LV is ready. For now, just assume n == 1.
  +   UserVF = ElementCount::getFixed(UserVF.getKnownMinValue());
  + }
    unsigned UserIC = Hints.getInterleave();

Only `processLoopInVPlanNativePath` and `processLoop` would need to be updated I think. As we are free to drop the hints, I think that would be preferable to crashing and should be legal?

sdesmalen added inline comments. Nov 10 2020, 4:12 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	Okay, so you mean assuming Fixed for now and then re-enable this in e.g. D91077 (which adds scalable vector support for some simple loop, but which doesn't yet guarantee that it won't crash on all inputs). Yes, I think that makes sense.

fhahn added inline comments. Nov 10 2020, 4:17 AM

llvm/test/Transforms/LoopVectorize/metadata-width.ll
55	+1

sdesmalen added a child revision: D91077: [LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF. Nov 10 2020, 6:30 AM

In LoopVectorize.cpp I've forced the UserVF to fall back on fixed width vectorisation for now.
Added comments to the new test file and renamed it.

david-arm marked 6 inline comments as done. Nov 11 2020, 4:33 AM

Can you add a test-case to llvm/test/Transforms/LoopVectorize/metadata-width.ll where it also tries to compile a loop with scalable VF? (and ensure it falls back to the fixed-width case until we add support for it)

llvm/docs/LangRef.rst
5904–5905	nit: @c-rhodes and I noticed yesterday that the term `minimum vector width` can give confusion, because it is not the width of the vector that the metadata describes (in terms of bytes), but rather the vectorization factor, i.e. the (minimum) number of vector lanes used to vectorize a loop. Can you clarify that in the description?
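
To make the distinction concrete, here is a small sketch, assuming a hypothetical runtime multiple supplied by the hardware (the `n` mentioned earlier in this thread). `lanesPerIteration` and `vectorBits` are names invented for this example.

```cpp
// Lanes handled per vector iteration. For a fixed VF this is just MinLanes;
// for a scalable VF it is MinLanes times a runtime multiple that is only
// known when the program runs.
inline unsigned lanesPerIteration(unsigned MinLanes, bool Scalable,
                                  unsigned RuntimeMultiple) {
  return Scalable ? MinLanes * RuntimeMultiple : MinLanes;
}

// Physical vector width in bits. It depends on the element size as well as
// the lane count, so it is not what the metadata describes.
inline unsigned vectorBits(unsigned Lanes, unsigned ElementBits) {
  return Lanes * ElementBits;
}
```

With a hint of 4 and a runtime multiple of 2, a scalable loop would process 8 lanes per iteration; with 32-bit elements that is a 256-bit vector, while the hint itself still says 4.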

sdesmalen mentioned this in D91077: [LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF. Nov 16 2020, 7:38 AM

Reverted changes to vectorize.width loop hint attribute.
Added new vectorize.scalable.enable loop hint attribute to control scalable vectorisation.

After discussions on the child patch (D89031) we've decided to do things differently: instead of adapting vectorize.width to accept a tuple, we're now adding a new vectorize.scalable.enable hint that can be used with vectorize.width to create an ElementCount. For the clang patch I also intend to update it so that we use a new vectorize_style #pragma instead of using vectorize_width to force scalable vectorisation.

Hi @david-arm, can you document the reasons for the change?

Personally, I don't care much about the details of the metadata layout so if this switched layout represents value then great. However, I'm less happy with the proposed pragma change because there's been a lot of effort spent (with plenty more to come) converting existing fixed length only representations of vectors into new types that can represent both fixed and scalable vectors. To then force this information to be split into its constituent parts at the user level (i.e. pragmas and command line options) seems like a backward step.

Hi @paulwalker-arm, so the reason for this change is related to the child patch where we changed the vectorize_width #pragma to accept an additional optional argument "scalable" or "fixed". @SjoerdMeijer felt this was the wrong approach and that the scalable property should be specified only as an additional pragma, i.e.

#pragma clang loop vectorize_width(4) vectorize_style(scalable)

The reason for this is that we would have to add vectorize_style(scalable) anyway to support vectorisation where the width isn't specified, i.e.

#pragma clang loop some_other_pragma vectorize_style(scalable)

Also, if we support both a vectorize_style(scalable|fixed) pragma in addition to specifying the scalable property in vectorize_width it also means extra work managing potential conflicts. As a result of this change in approach it made sense to update the LLVM loop hint attribute too to reflect this.

Thanks for the info @david-arm. I just figured we'd support vector_width(2), vector_width(2, fixed), vector_width(2, scalable), vector_width(fixed), vector_width(scalable) so I still say splitting the width property across multiple pragmas is against our goal of moving away from fixed length only representations. That said, if this is the consensus then so be it.

In D88962#2402292, @paulwalker-arm wrote:

Thanks for the info @david-arm. I just figured we'd support vector_width(2), vector_width(2, fixed), vector_width(2, scalable), vector_width(fixed), vector_width(scalable) so I still say splitting the width property across multiple pragmas is against our goal of moving away from fixed length only representations. That said, if this is the consensus then so be it.

I won't block D89031 and would be happy to be convinced otherwise, but I don't see any advantages of D89031, in fact I see only disadvantages.

The approach here looks good to me; have inlined a question.

llvm/docs/LangRef.rst
5901	I am wondering if we need to describe if and how this interacts with `llvm.loop.vectorize.enable`?
llvm/test/Transforms/LoopVectorize/no_array_bounds_scalable.ll
65 ↗	(On Diff #306006)	nit: we probably don't need all of this.

Updated the documentation to mention that the new attribute only has any effect if vectorisation is enabled for the loop.
Cleaned up one of the test files.

Hi @fhahn @SjoerdMeijer @paulwalker-arm, are you happy with the patch now? The implementation on the LLVM IR side of things is independent of the clang pragma patch. For the clang patch I intend to write a proposal to the mailing list for changing the vectorize_width #pragma. @fhahn would you be happy with removing the "Request changes" cross?

llvm/docs/LangRef.rst
5901	I adapted this documentation from the vectorize.predicate.enable case above, which didn't discuss the interaction with vectorize.enable so I just thought I didn't need to here either. I've made it clear that this flag only has any effect if vectorisation is already enabled (although this could be through simply building with -O2).

The updated metadata looks fine and should be flexible enough to cover the scenarios outlined by @paulwalker-arm, whatever the clang support looks like. A few small remaining comments.

llvm/include/llvm/Transforms/Utils/LoopUtils.h
217 ↗	(On Diff #307029)	If this is exposed here, this should probably have a comment describing what this does?
llvm/lib/Transforms/Utils/LoopUtils.cpp
306 ↗	(On Diff #307029)	Here we get both the integer values of `llvm.loop.vectorize.width` and `llvm.loop.vectorize.scalable.enable`, right? Could this be simplified by using the existing `getOptionalIntLoopAttribute`?
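
The simplification being asked for amounts to two optional integer lookups combined into one value. The sketch below models that shape outside LLVM; `SketchLoopAttrs`, `AttrElementCount`, `getIntAttr`, and `getElementCountAttr` are hypothetical names, and a `std::map` stands in for the loop's metadata.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a loop's integer-valued metadata attributes.
typedef std::map<std::string, int> SketchLoopAttrs;

struct AttrElementCount {
  unsigned MinLanes;
  bool Scalable;
};

// Look up one integer attribute; returns false when it is absent.
inline bool getIntAttr(const SketchLoopAttrs &Attrs, const std::string &Name,
                       int &Out) {
  SketchLoopAttrs::const_iterator It = Attrs.find(Name);
  if (It == Attrs.end())
    return false;
  Out = It->second;
  return true;
}

// Build an element count from the width and scalable hints. Returns false
// when there is no width hint; a missing scalable hint means fixed width.
inline bool getElementCountAttr(const SketchLoopAttrs &Attrs,
                                AttrElementCount &Out) {
  int Width = 0;
  if (!getIntAttr(Attrs, "llvm.loop.vectorize.width", Width))
    return false;
  int Scalable = 0;
  getIntAttr(Attrs, "llvm.loop.vectorize.scalable.enable", Scalable);
  Out.MinLanes = static_cast<unsigned>(Width);
  Out.Scalable = (Scalable != 0);
  return true;
}
```

Reusing one generic lookup for both attributes keeps the combining function short, which appears to be the point of the suggestion.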
llvm/test/Transforms/LoopVectorize/no_array_bounds_scalable.ll
65 ↗	(On Diff #306006)	looks like the TBAA metadata still can be removed?

Added documentation to getOptionalElementCountLoopAttribute and simplified the implementation in the function.
Removed some tbaa metadata from the tests.

david-arm marked 3 inline comments as done. Nov 25 2020, 1:22 AM

LGTM, thanks. Given that there has been quite some discussion, it would be good to wait with landing this for a few days, in case there are any more comments.

llvm/include/llvm/Transforms/Utils/LoopUtils.h
220 ↗	(On Diff #307537)	nit: drop `llvm::` (there are some inconsistencies about that in the file, but it should not be required and it is also not used for other things in the `llvm` namespace, like `Loop`).
llvm/lib/Transforms/Utils/LoopUtils.cpp
475 ↗	(On Diff #307537)	nit: the extra `()` are not needed?

This revision is now accepted and ready to land. Nov 30 2020, 1:01 AM

I am also happy with this (and the new proposal in D89031 and the mail to cfe dev list).

What do we do when vectorize.scalable is not supported by the target? We want to issue a diagnostic/remark? Do we need to say something about this in the LangRef part?

Hi @SjoerdMeijer, what I have in the clang pragma patch at the moment is a warning emitted when the target doesn't support scalable vectors, and falling back on fixed width. However, this is something for discussion as I believe there are others who think that the frontend should simply accept the scalable hint regardless and let the vectoriser make that decision based upon cost analysis.

sdesmalen mentioned this in D91718: [LV] Legalize scalable VF hints. Nov 30 2020, 6:57 AM

In D88962#2422190, @SjoerdMeijer wrote:

What do we do when vectorize.scalable is not supported by the target? We want to issue a diagnostic/remark? Do we need to say something about this in the LangRef part?

Yes, although I think that's worth addressing separately. I made a similar comment on D91718, where such functionality is needed. If that's addressed separately, are you happy for @david-arm to land this patch, @SjoerdMeijer?

(it is worth noting that this patch still ignores the 'scalable' VF in the vectorizer. This is only enabled again in D91077, which has a dependence on this patch)

Yep, LGTM too

Closed by commit rG71bd59f0cb6d: [SVE] Add support for scalable vectors with vectorize.scalable.enable loop… (authored by david-arm). Dec 2 2020, 5:24 AM

This revision was automatically updated to reflect the committed changes.

david-arm added a commit: rG71bd59f0cb6d: [SVE] Add support for scalable vectors with vectorize.scalable.enable loop….

Revision Contents

Path	Size
llvm/docs/LangRef.rst	20 lines
llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h	33 lines
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp	89 lines
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp	10 lines
llvm/test/Transforms/LoopVectorize/metadata-width.ll	25 lines

Diff 298980

llvm/docs/LangRef.rst

	Show First 20 Lines • Show All 5,892 Lines • ▼ Show 20 Lines
	the bit operand value is 1 vectorization is enabled. A value of 0 disables			the bit operand value is 1 vectorization is enabled. A value of 0 disables
	vectorization:			vectorization:

	.. code-block:: llvm			.. code-block:: llvm

	!0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}			!0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
	!1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}			!1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}

	'``llvm.loop.vectorize.width``' Metadata			'``llvm.loop.vectorize.width``' Metadata
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	This metadata sets the target width of the vectorizer. The first			The vector width is an ElementCount tuple, represented in Metadata as:
	operand is the string ``llvm.loop.vectorize.width`` and the second
	operand is an integer specifying the width. For example:

	.. code-block:: llvm			.. code-block:: llvm

	!0 = !{!"llvm.loop.vectorize.width", i32 4}			!0 = !{!"llvm.loop.vectorize.width", !1}
				!1 = !{i32 4, i32 1}

				where ``i32 4`` specifies the vector width and ``i32 1`` indicates if the
				vectorization factor is scalable, meaning that the loop-vectorizer should use
				vector-length agnostic vectorization.

				For fixed-width vectorization-factors, a short-hand `i32` operand for
				llvm.loop.vectorize.width is also supported:

	Note that setting ``llvm.loop.vectorize.width`` to 1 disables			.. code-block:: llvm
	vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to			!0 = !{!"llvm.loop.vectorize.width", i32 4}
	0 or if the loop does not have this metadata the width will be
	determined automatically.

	'``llvm.loop.vectorize.followup_vectorized``' Metadata			'``llvm.loop.vectorize.followup_vectorized``' Metadata
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	This metadata defines which loop attributes the vectorized loop will			This metadata defines which loop attributes the vectorized loop will
	have. See :ref:`transformation-metadata` for details.			have. See :ref:`transformation-metadata` for details.

	'``llvm.loop.vectorize.followup_epilogue``' Metadata			'``llvm.loop.vectorize.followup_epilogue``' Metadata
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	▲ Show 20 Lines • Show All 14,928 Lines • Show Last 20 Lines

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h

Show All 23 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZATIONLEGALITY_H		#ifndef LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZATIONLEGALITY_H
#define LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZATIONLEGALITY_H		#define LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZATIONLEGALITY_H

#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/Analysis/LoopAccessAnalysis.h"		#include "llvm/Analysis/LoopAccessAnalysis.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"		#include "llvm/Analysis/OptimizationRemarkEmitter.h"
		#include "llvm/Support/TypeSize.h"
#include "llvm/Transforms/Utils/LoopUtils.h"		#include "llvm/Transforms/Utils/LoopUtils.h"

namespace llvm {		namespace llvm {

/// Utility class for getting and setting loop vectorizer hints in the form		/// Utility class for getting and setting loop vectorizer hints in the form
/// of loop metadata.		/// of loop metadata.
/// This class keeps a number of loop annotations locally (as member variables)		/// This class keeps a number of loop annotations locally (as member variables)
/// and can, upon request, write them back as metadata on the loop. It will		/// and can, upon request, write them back as metadata on the loop. It will
/// initially scan the loop for existing metadata, and will update the local		/// initially scan the loop for existing metadata, and will update the local
/// values based on information in the loop.		/// values based on information in the loop.
/// We cannot write all values to metadata, as the mere presence of some info,		/// We cannot write all values to metadata, as the mere presence of some info,
/// for example 'force', means a decision has been made. So, we need to be		/// for example 'force', means a decision has been made. So, we need to be
/// careful NOT to add them if the user hasn't specifically asked so.		/// careful NOT to add them if the user hasn't specifically asked so.
class LoopVectorizeHints {		class LoopVectorizeHints {
enum HintKind { HK_WIDTH, HK_UNROLL, HK_FORCE, HK_ISVECTORIZED,		enum HintKind { HK_WIDTH, HK_UNROLL, HK_FORCE, HK_ISVECTORIZED,
HK_PREDICATE };		HK_PREDICATE };

/// Hint - associates name and validation with the hint value.		/// Hint - associates name and validation with the hint value.
struct Hint {		struct Hint {
const char *Name;		const char *Name;
unsigned Value; // This may have to change for non-numeric values.		union {
		unsigned U32; // Used for boolean and integer hint values.
		ElementCount EC; // Used for the vectorization width.
		} Value;
HintKind Kind;		HintKind Kind;

Hint(const char *Name, unsigned Value, HintKind Kind)		Hint(const char *Name, unsigned Value, HintKind Kind)
: Name(Name), Value(Value), Kind(Kind) {}		: Name(Name), Value({Value}), Kind(Kind) {}

bool validate(unsigned Val);		Hint(const char *Name, ElementCount EC)
		: Name(Name), Value{0}, Kind(HK_WIDTH) {
		Value.EC = EC;
		}

		bool validateAndSet(const Metadata *Arg);
};		};

/// Vectorization width.		/// Vectorization width.
Hint Width;		Hint Width;

/// Vectorization interleave factor.		/// Vectorization interleave factor.
Hint Interleave;		Hint Interleave;

[26 lines not shown]	public:
void setAlreadyVectorized();		void setAlreadyVectorized();

bool allowVectorization(Function *F, Loop *L,		bool allowVectorization(Function *F, Loop *L,
bool VectorizeOnlyWhenForced) const;		bool VectorizeOnlyWhenForced) const;

/// Dumps all the hint information.		/// Dumps all the hint information.
void emitRemarkWithHints() const;		void emitRemarkWithHints() const;

unsigned getWidth() const { return Width.Value; }		ElementCount getWidth() const { return Width.Value.EC; }
unsigned getInterleave() const { return Interleave.Value; }		unsigned getInterleave() const { return Interleave.Value.U32; }
unsigned getIsVectorized() const { return IsVectorized.Value; }		unsigned getIsVectorized() const { return IsVectorized.Value.U32; }
unsigned getPredicate() const { return Predicate.Value; }		unsigned getPredicate() const { return Predicate.Value.U32; }
enum ForceKind getForce() const {		enum ForceKind getForce() const {
if ((ForceKind)Force.Value == FK_Undefined &&		if ((ForceKind)Force.Value.U32 == FK_Undefined &&
hasDisableAllTransformsHint(TheLoop))		hasDisableAllTransformsHint(TheLoop))
return FK_Disabled;		return FK_Disabled;
return (ForceKind)Force.Value;		return (ForceKind)Force.Value.U32;
}		}

/// If hints are provided that force vectorization, use the AlwaysPrint		/// If hints are provided that force vectorization, use the AlwaysPrint
/// pass name to force the frontend to print the diagnostic.		/// pass name to force the frontend to print the diagnostic.
const char *vectorizeAnalysisPassName() const;		const char *vectorizeAnalysisPassName() const;

bool allowReordering() const {		bool allowReordering() const {
// When enabling loop hints are provided we allow the vectorizer to change		// When enabling loop hints are provided we allow the vectorizer to change
// the order of operations that is given by the scalar loop. This is not		// the order of operations that is given by the scalar loop. This is not
// enabled by default because can be unsafe or inefficient. For example,		// enabled by default because can be unsafe or inefficient. For example,
// reordering floating-point operations will change the way round-off		// reordering floating-point operations will change the way round-off
// error accumulates in the loop.		// error accumulates in the loop.
return getForce() == LoopVectorizeHints::FK_Enabled || getWidth() > 1;		ElementCount EC = getWidth();
		return getForce() == LoopVectorizeHints::FK_Enabled ||
		EC.getKnownMinValue() > 1;
}		}

bool isPotentiallyUnsafe() const {		bool isPotentiallyUnsafe() const {
// Avoid FP vectorization if the target is unsure about proper support.		// Avoid FP vectorization if the target is unsure about proper support.
// This may be related to the SIMD unit in the target not handling		// This may be related to the SIMD unit in the target not handling
// IEEE 754 FP ops properly, or bad single-to-double promotions.		// IEEE 754 FP ops properly, or bad single-to-double promotions.
// Otherwise, a sequence of vectorized loops, even without reduction,		// Otherwise, a sequence of vectorized loops, even without reduction,
// could lead to different end results on the destination vectors.		// could lead to different end results on the destination vectors.
return getForce() != LoopVectorizeHints::FK_Enabled && PotentiallyUnsafe;		return getForce() != LoopVectorizeHints::FK_Enabled && PotentiallyUnsafe;
}		}

void setPotentiallyUnsafe() { PotentiallyUnsafe = true; }		void setPotentiallyUnsafe() { PotentiallyUnsafe = true; }

private:		private:
/// Find hints specified in the loop metadata and update local values.		/// Find hints specified in the loop metadata and update local values.
void getHintsFromMetadata();		void getHintsFromMetadata();

/// Checks string hint with one operand and set value if valid.		/// Checks string hint with one operand and set value if valid.
void setHint(StringRef Name, Metadata *Arg);		void setHint(StringRef Name, const Metadata *Arg);

/// The loop these hints belong to.		/// The loop these hints belong to.
const Loop *TheLoop;		const Loop *TheLoop;

/// Interface to emit optimization remarks.		/// Interface to emit optimization remarks.
OptimizationRemarkEmitter &ORE;		OptimizationRemarkEmitter &ORE;
};		};

[348 lines not shown]
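
For readers following the header change, here is a minimal standalone sketch of the tagged-union layout that `Hint` adopts: `Kind` decides which member of `Value` is meaningful. `ElementCountSketch` and `HintSketch` are hypothetical stand-ins so the example compiles without LLVM headers, and the asserting accessors are an illustration only; they are not part of the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::ElementCount, reduced to the two fields
// this example needs. The real class has a richer interface.
struct ElementCountSketch {
  unsigned MinVal; // Known minimum number of elements.
  bool Scalable;   // True if the count is a multiple of vscale.
};

enum HintKind { HK_WIDTH, HK_UNROLL, HK_FORCE, HK_ISVECTORIZED, HK_PREDICATE };

// Hypothetical mirror of LoopVectorizeHints::Hint after the patch.
struct HintSketch {
  const char *Name;
  union {
    unsigned U32;          // Used for boolean and integer hint values.
    ElementCountSketch EC; // Used for the vectorization width.
  } Value;
  HintKind Kind;

  // Integer or boolean hint.
  HintSketch(const char *Name, unsigned V, HintKind Kind)
      : Name(Name), Kind(Kind) {
    Value.U32 = V;
  }

  // Width hint: the kind is implied by the argument type.
  HintSketch(const char *Name, ElementCountSketch EC)
      : Name(Name), Kind(HK_WIDTH) {
    Value.EC = EC;
  }

  // Illustrative accessors that check the discriminator before reading.
  unsigned getU32() const {
    assert(Kind != HK_WIDTH && "width hints store an element count");
    return Value.U32;
  }

  ElementCountSketch getEC() const {
    assert(Kind == HK_WIDTH && "only width hints store an element count");
    return Value.EC;
  }
};
```

The patch reads the union members directly (`Width.Value.EC`, `Interleave.Value.U32`); the checked accessors here only make the "read what you last wrote" rule explicit.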

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

[50 lines not shown]	static cl::opt<unsigned> PragmaVectorizeSCEVCheckThreshold(
cl::desc("The maximum number of SCEV checks allowed with a "		cl::desc("The maximum number of SCEV checks allowed with a "
"vectorize(enable) pragma"));		"vectorize(enable) pragma"));

/// Maximum vectorization interleave count.		/// Maximum vectorization interleave count.
static const unsigned MaxInterleaveFactor = 16;		static const unsigned MaxInterleaveFactor = 16;

namespace llvm {		namespace llvm {

bool LoopVectorizeHints::Hint::validate(unsigned Val) {		bool LoopVectorizeHints::Hint::validateAndSet(const Metadata *Arg) {
		unsigned IntVal;
		unsigned IsScalable = 0;
		sdesmalen: nit: `s/unsigned IsScalable = 0;/bool IsScalable = false;/`
		david-arm: Given that it is possible to specify this as "i32" type, even if the docs say "i1", are you saying that it's fine to specify the vector width as "i32 4, i32 345", for example? In this example, with your changes, "i32 345" will be treated as scalable = true. This is the reason I had this value as i32, in order to capture odd cases.

		if (const ConstantInt *C = mdconst::dyn_extract<ConstantInt>(Arg))
		IntVal = C->getZExtValue();
		else if (const MDNode *MD = dyn_cast<MDNode>(Arg)) {
		if (Kind != HK_WIDTH || MD->getNumOperands() != 2)
		fhahn: might be good to add a comment that we are looking for a (width, isScalable) tuple here.
		return false;
		fhahn: are you anticipating additional hints with ElementCount metadata? If not, it might be simpler to just deal with HK_WIDTH up front and validate Val[0] and Val[1] here and leave the code handling the other cases mostly unchanged? Or maybe use named variables instead of an array, to make things a bit clearer?
		david-arm: So the reason I structured it like this is because we are still maintaining backwards compatibility with the old style and allowing single integer constants instead of the node. I could deal with the HK_WIDTH up front, but I'd still need all this code. Would you be ok with using named variables instead?
		sdesmalen: nit: For the cases below, is it worth using something like:
		  auto MaySetIntValue = [this](int IntVal, bool Condition) {
		    if (Condition)
		      this->Value.U32 = IntVal;
		    return Condition;
		  };
		  auto MaySetECValue = [this](ElementCount EC, bool Condition) {
		    if (Condition)
		      this->Value.EC = EC;
		    return Condition;
		  };
		  switch (Kind) {
		  case HK_WIDTH:
		    return MaySetECValue(ElementCount::get(IntVal, IsScalable),
		                         isPowerOf2_32(IntVal) &&
		                         IntVal <= VectorizerParams::MaxVectorWidth);
		  case HK_UNROLL:
		    return MaySetIntValue(IntVal,
		                          isPowerOf2_32(Val) && Val <= MaxInterleaveFactor);
		  [...]
		  }
		const ConstantInt *C0 =
		mdconst::dyn_extract<ConstantInt>(MD->getOperand(0));
		const ConstantInt *C1 =
		mdconst::dyn_extract<ConstantInt>(MD->getOperand(1));
		if (!C0 || !C1)
		return false;
		IntVal = C0->getZExtValue();
		IsScalable = C1->getZExtValue();
		} else
		return false;

		auto MaybeSetIntValue = [this](unsigned Val, bool Cond) {
		sdesmalen: nit: `s/Maybe/Conditionally/`
		if (Cond)
		this->Value.U32 = Val;
		return Cond;
		};

		auto MaybeSetECValue = [this](unsigned Val, unsigned IsScalable, bool Cond) {
		sdesmalen: nit: this is only used once, so better to inline.
		if (Cond)
		this->Value.EC = ElementCount::get(Val, IsScalable);
		return Cond;
		};

switch (Kind) {		switch (Kind) {
case HK_WIDTH:		case HK_WIDTH:
return isPowerOf2_32(Val) && Val <= VectorizerParams::MaxVectorWidth;		return MaybeSetECValue(IntVal, IsScalable,
		isPowerOf2_32(IntVal) &&
		IntVal <= VectorizerParams::MaxVectorWidth &&
		IsScalable <= 1);
case HK_UNROLL:		case HK_UNROLL:
return isPowerOf2_32(Val) && Val <= MaxInterleaveFactor;		return MaybeSetIntValue(IntVal, isPowerOf2_32(IntVal) &&
		IntVal <= MaxInterleaveFactor);
case HK_FORCE:		case HK_FORCE:
return (Val <= 1);		return MaybeSetIntValue(IntVal, IntVal <= 1);
case HK_ISVECTORIZED:		case HK_ISVECTORIZED:
case HK_PREDICATE:		case HK_PREDICATE:
return (Val == 0 || Val == 1);		return MaybeSetIntValue(IntVal, IntVal == 0 || IntVal == 1);
}		}
return false;		return false;
}		}

LoopVectorizeHints::LoopVectorizeHints(const Loop *L,		LoopVectorizeHints::LoopVectorizeHints(const Loop *L,
bool InterleaveOnlyWhenForced,		bool InterleaveOnlyWhenForced,
OptimizationRemarkEmitter &ORE)		OptimizationRemarkEmitter &ORE)
: Width("vectorize.width", VectorizerParams::VectorizationFactor, HK_WIDTH),		: Width("vectorize.width",
		ElementCount::getFixed(VectorizerParams::VectorizationFactor)),
Interleave("interleave.count", InterleaveOnlyWhenForced, HK_UNROLL),		Interleave("interleave.count", InterleaveOnlyWhenForced, HK_UNROLL),
Force("vectorize.enable", FK_Undefined, HK_FORCE),		Force("vectorize.enable", FK_Undefined, HK_FORCE),
IsVectorized("isvectorized", 0, HK_ISVECTORIZED),		IsVectorized("isvectorized", 0, HK_ISVECTORIZED),
Predicate("vectorize.predicate.enable", FK_Undefined, HK_PREDICATE), TheLoop(L),		Predicate("vectorize.predicate.enable", FK_Undefined, HK_PREDICATE),
ORE(ORE) {		TheLoop(L), ORE(ORE) {
// Populate values with existing loop metadata.		// Populate values with existing loop metadata.
getHintsFromMetadata();		getHintsFromMetadata();

// force-vector-interleave overrides DisableInterleaving.		// force-vector-interleave overrides DisableInterleaving.
if (VectorizerParams::isInterleaveForced())		if (VectorizerParams::isInterleaveForced())
Interleave.Value = VectorizerParams::VectorizationInterleave;		Interleave.Value.U32 = VectorizerParams::VectorizationInterleave;

if (IsVectorized.Value != 1)		if (IsVectorized.Value.U32 != 1)
// If the vectorization width and interleaving count are both 1 then		// If the vectorization width and interleaving count are both 1 then
// consider the loop to have been already vectorized because there's		// consider the loop to have been already vectorized because there's
// nothing more that we can do.		// nothing more that we can do.
IsVectorized.Value = Width.Value == 1 && Interleave.Value == 1;		IsVectorized.Value.U32 = Width.Value.EC == ElementCount::getFixed(1) &&
LLVM_DEBUG(if (InterleaveOnlyWhenForced && Interleave.Value == 1) dbgs()		Interleave.Value.U32 == 1;
		LLVM_DEBUG(if (InterleaveOnlyWhenForced && Interleave.Value.U32 == 1) dbgs()
<< "LV: Interleaving disabled by the pass manager\n");		<< "LV: Interleaving disabled by the pass manager\n");
}		}

void LoopVectorizeHints::setAlreadyVectorized() {		void LoopVectorizeHints::setAlreadyVectorized() {
LLVMContext &Context = TheLoop->getHeader()->getContext();		LLVMContext &Context = TheLoop->getHeader()->getContext();

MDNode *IsVectorizedMD = MDNode::get(		MDNode *IsVectorizedMD = MDNode::get(
Context,		Context,
{MDString::get(Context, "llvm.loop.isvectorized"),		{MDString::get(Context, "llvm.loop.isvectorized"),
ConstantAsMetadata::get(ConstantInt::get(Context, APInt(32, 1)))});		ConstantAsMetadata::get(ConstantInt::get(Context, APInt(32, 1)))});
MDNode *LoopID = TheLoop->getLoopID();		MDNode *LoopID = TheLoop->getLoopID();
MDNode *NewLoopID =		MDNode *NewLoopID =
makePostTransformationMetadata(Context, LoopID,		makePostTransformationMetadata(Context, LoopID,
{Twine(Prefix(), "vectorize.").str(),		{Twine(Prefix(), "vectorize.").str(),
Twine(Prefix(), "interleave.").str()},		Twine(Prefix(), "interleave.").str()},
{IsVectorizedMD});		{IsVectorizedMD});
TheLoop->setLoopID(NewLoopID);		TheLoop->setLoopID(NewLoopID);

// Update internal cache.		// Update internal cache.
IsVectorized.Value = 1;		IsVectorized.Value.U32 = 1;
}		}

bool LoopVectorizeHints::allowVectorization(		bool LoopVectorizeHints::allowVectorization(
Function *F, Loop *L, bool VectorizeOnlyWhenForced) const {		Function *F, Loop *L, bool VectorizeOnlyWhenForced) const {
if (getForce() == LoopVectorizeHints::FK_Disabled) {		if (getForce() == LoopVectorizeHints::FK_Disabled) {
LLVM_DEBUG(dbgs() << "LV: Not vectorizing: #pragma vectorize disable.\n");		LLVM_DEBUG(dbgs() << "LV: Not vectorizing: #pragma vectorize disable.\n");
emitRemarkWithHints();		emitRemarkWithHints();
return false;		return false;
[23 lines not shown]	bool LoopVectorizeHints::allowVectorization(

return true;		return true;
}		}

void LoopVectorizeHints::emitRemarkWithHints() const {		void LoopVectorizeHints::emitRemarkWithHints() const {
using namespace ore;		using namespace ore;

ORE.emit([&]() {		ORE.emit([&]() {
if (Force.Value == LoopVectorizeHints::FK_Disabled)		if (Force.Value.U32 == LoopVectorizeHints::FK_Disabled)
return OptimizationRemarkMissed(LV_NAME, "MissedExplicitlyDisabled",		return OptimizationRemarkMissed(LV_NAME, "MissedExplicitlyDisabled",
TheLoop->getStartLoc(),		TheLoop->getStartLoc(),
TheLoop->getHeader())		TheLoop->getHeader())
<< "loop not vectorized: vectorization is explicitly disabled";		<< "loop not vectorized: vectorization is explicitly disabled";
else {		else {
OptimizationRemarkMissed R(LV_NAME, "MissedDetails",		OptimizationRemarkMissed R(LV_NAME, "MissedDetails",
TheLoop->getStartLoc(), TheLoop->getHeader());		TheLoop->getStartLoc(), TheLoop->getHeader());
R << "loop not vectorized";		R << "loop not vectorized";
if (Force.Value == LoopVectorizeHints::FK_Enabled) {		if (Force.Value.U32 == LoopVectorizeHints::FK_Enabled) {
R << " (Force=" << NV("Force", true);		R << " (Force=" << NV("Force", true);
if (Width.Value != 0)		if (Width.Value.EC.isNonZero())
R << ", Vector Width=" << NV("VectorWidth", Width.Value);		R << ", Vector Width=" << NV("VectorWidth", Width.Value.EC);
if (Interleave.Value != 0)		if (Interleave.Value.U32 != 0)
R << ", Interleave Count=" << NV("InterleaveCount", Interleave.Value);		R << ", Interleave Count="
		<< NV("InterleaveCount", Interleave.Value.U32);
R << ")";		R << ")";
}		}
return R;		return R;
}		}
});		});
}		}

const char *LoopVectorizeHints::vectorizeAnalysisPassName() const {		const char *LoopVectorizeHints::vectorizeAnalysisPassName() const {
if (getWidth() == 1)		if (getWidth() == ElementCount::getFixed(1))
return LV_NAME;		return LV_NAME;
if (getForce() == LoopVectorizeHints::FK_Disabled)		if (getForce() == LoopVectorizeHints::FK_Disabled)
return LV_NAME;		return LV_NAME;
if (getForce() == LoopVectorizeHints::FK_Undefined && getWidth() == 0)		if (getForce() == LoopVectorizeHints::FK_Undefined && getWidth().isZero())
return LV_NAME;		return LV_NAME;
return OptimizationRemarkAnalysis::AlwaysPrint;		return OptimizationRemarkAnalysis::AlwaysPrint;
}		}

void LoopVectorizeHints::getHintsFromMetadata() {		void LoopVectorizeHints::getHintsFromMetadata() {
MDNode *LoopID = TheLoop->getLoopID();		MDNode *LoopID = TheLoop->getLoopID();
if (!LoopID)		if (!LoopID)
return;		return;
[24 lines not shown]	for (unsigned i = 1, ie = LoopID->getNumOperands(); i < ie; ++i) {

// Check if the hint starts with the loop metadata prefix.		// Check if the hint starts with the loop metadata prefix.
StringRef Name = S->getString();		StringRef Name = S->getString();
if (Args.size() == 1)		if (Args.size() == 1)
setHint(Name, Args[0]);		setHint(Name, Args[0]);
}		}
}		}

void LoopVectorizeHints::setHint(StringRef Name, Metadata *Arg) {		void LoopVectorizeHints::setHint(StringRef Name, const Metadata *Arg) {
if (!Name.startswith(Prefix()))		if (!Name.startswith(Prefix()))
return;		return;
Name = Name.substr(Prefix().size(), StringRef::npos);		Name = Name.substr(Prefix().size(), StringRef::npos);

const ConstantInt *C = mdconst::dyn_extract<ConstantInt>(Arg);		const ConstantInt *C = mdconst::dyn_extract<ConstantInt>(Arg);
if (!C)		if (!C && Name != Width.Name)
		fhahn: `C` is only used in the condition, maybe just check `mdconst::dyn_extract<ConstantInt>(Arg)` in the if? But now that we pass the metadata node to validateAndSet, do we actually need the check here or can we just do all the validation in `validateAndSet`? Seems like the time savings from exiting a bit earlier would be very minor.
		david-arm: Sure, sounds sensible!
return;		return;
unsigned Val = C->getZExtValue();

Hint *Hints[] = {&Width, &Interleave, &Force, &IsVectorized, &Predicate};		Hint *Hints[] = {&Width, &Interleave, &Force, &IsVectorized, &Predicate};
for (auto H : Hints) {		for (auto H : Hints) {
if (Name == H->Name) {		if (Name == H->Name) {
if (H->validate(Val))		if (!H->validateAndSet(Arg))
H->Value = Val;
else
LLVM_DEBUG(dbgs() << "LV: ignoring invalid hint '" << Name << "'\n");		LLVM_DEBUG(dbgs() << "LV: ignoring invalid hint '" << Name << "'\n");
break;		break;
}		}
}		}
}		}

bool LoopVectorizationRequirements::doesNotMeet(		bool LoopVectorizationRequirements::doesNotMeet(
Function *F, Loop *L, const LoopVectorizeHints &Hints) {		Function *F, Loop *L, const LoopVectorizeHints &Hints) {
[1,048 lines not shown]
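
The width check that `validateAndSet` performs for `HK_WIDTH` can be sketched outside LLVM as follows. The function name `parseWidthHint`, the `ElementCountSketch` type, and the limit of 64 are assumptions made for illustration; the patch reads the limit from `VectorizerParams::MaxVectorWidth` and extracts both operands from metadata. The legacy single-integer form of the hint corresponds to passing 0 as the flag.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::ElementCount.
struct ElementCountSketch {
  unsigned MinVal;
  bool Scalable;
};

// Assumed limit for this sketch only; the patch uses
// VectorizerParams::MaxVectorWidth.
constexpr unsigned MaxVectorWidthSketch = 64;

// Same rule as llvm::isPowerOf2_32: zero is not a power of two.
constexpr bool isPowerOf2Sketch(unsigned V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// Validates a (width, isScalable) pair and writes Out only on success,
// mirroring the "set only if valid" behaviour of the lambdas in the diff.
// IsScalable stays unsigned so that a value such as 345 is rejected
// instead of being silently converted to true.
bool parseWidthHint(unsigned Width, unsigned IsScalable,
                    ElementCountSketch &Out) {
  bool Valid = isPowerOf2Sketch(Width) && Width <= MaxVectorWidthSketch &&
               IsScalable <= 1;
  if (Valid)
    Out = ElementCountSketch{Width, IsScalable == 1};
  return Valid;
}
```

Keeping the flag as an integer until after validation is the point david-arm raises in the review: a pair such as `i32 4, i32 345` is then treated as an invalid hint and ignored.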

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

[8,254 lines not shown]	static bool processLoopInVPlanNativePath(
LoopVectorizationCostModel CM(SEL, L, PSE, LI, LVL, *TTI, TLI, DB, AC, ORE, F,		LoopVectorizationCostModel CM(SEL, L, PSE, LI, LVL, *TTI, TLI, DB, AC, ORE, F,
&Hints, IAI);		&Hints, IAI);
// Use the planner for outer loop vectorization.		// Use the planner for outer loop vectorization.
// TODO: CM is not used at this point inside the planner. Turn CM into an		// TODO: CM is not used at this point inside the planner. Turn CM into an
// optional argument if we don't need it in the future.		// optional argument if we don't need it in the future.
LoopVectorizationPlanner LVP(L, LI, TLI, TTI, LVL, CM, IAI, PSE);		LoopVectorizationPlanner LVP(L, LI, TLI, TTI, LVL, CM, IAI, PSE);

// Get user vectorization factor.		// Get user vectorization factor.
const unsigned UserVF = Hints.getWidth();		const ElementCount UserVF = Hints.getWidth();

// Plan how to best vectorize, return the best VF and its cost.		// Plan how to best vectorize, return the best VF and its cost.
const VectorizationFactor VF =		const VectorizationFactor VF = LVP.planInVPlanNativePath(UserVF);
LVP.planInVPlanNativePath(ElementCount::getFixed(UserVF));

// If we are stress testing VPlan builds, do not attempt to generate vector		// If we are stress testing VPlan builds, do not attempt to generate vector
// code. Masked vector code generation support will follow soon.		// code. Masked vector code generation support will follow soon.
// Also, do not attempt to vectorize if no vector code will be produced.		// Also, do not attempt to vectorize if no vector code will be produced.
if (VPlanBuildStressTest || EnableVPlanPredication ||		if (VPlanBuildStressTest || EnableVPlanPredication ||
VectorizationFactor::Disabled() == VF)		VectorizationFactor::Disabled() == VF)
return false;		return false;

[145 lines not shown]	#endif /* NDEBUG */
LoopVectorizationCostModel CM(SEL, L, PSE, LI, &LVL, *TTI, TLI, DB, AC, ORE,		LoopVectorizationCostModel CM(SEL, L, PSE, LI, &LVL, *TTI, TLI, DB, AC, ORE,
F, &Hints, IAI);		F, &Hints, IAI);
CM.collectValuesToIgnore();		CM.collectValuesToIgnore();

// Use the planner for vectorization.		// Use the planner for vectorization.
LoopVectorizationPlanner LVP(L, LI, TLI, TTI, &LVL, CM, IAI, PSE);		LoopVectorizationPlanner LVP(L, LI, TLI, TTI, &LVL, CM, IAI, PSE);

// Get user vectorization factor and interleave count.		// Get user vectorization factor and interleave count.
unsigned UserVF = Hints.getWidth();		ElementCount UserVF = Hints.getWidth();
unsigned UserIC = Hints.getInterleave();		unsigned UserIC = Hints.getInterleave();

// Plan how to best vectorize, return the best VF and its cost.		// Plan how to best vectorize, return the best VF and its cost.
Optional<VectorizationFactor> MaybeVF =		Optional<VectorizationFactor> MaybeVF = LVP.plan(UserVF, UserIC);
LVP.plan(ElementCount::getFixed(UserVF), UserIC);

VectorizationFactor VF = VectorizationFactor::Disabled();		VectorizationFactor VF = VectorizationFactor::Disabled();
unsigned IC = 1;		unsigned IC = 1;

if (MaybeVF) {		if (MaybeVF) {
VF = *MaybeVF;		VF = *MaybeVF;
// Select the interleave count.		// Select the interleave count.
IC = CM.selectInterleaveCount(VF.Width, VF.Cost);		IC = CM.selectInterleaveCount(VF.Width, VF.Cost);
[263 lines not shown]

llvm/test/Transforms/LoopVectorize/metadata-width.ll

[18 lines not shown]	for.body: ; preds = %entry, %for.body
%lftr.wideiv = trunc i64 %indvars.iv.next to i32		%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, %n		%exitcond = icmp eq i32 %lftr.wideiv, %n
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0		br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

for.end: ; preds = %for.body, %entry		for.end: ; preds = %for.body, %entry
ret void		ret void
}		}

		; CHECK-LABEL: @test2(
		; CHECK: store <8 x i32>
		; CHECK: ret void
		define void @test2(i32* nocapture %a, i32 %n) #0 {
		entry:
		%cmp4 = icmp sgt i32 %n, 0
		br i1 %cmp4, label %for.body, label %for.end

		for.body: ; preds = %entry, %for.body
		%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
		%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
		%0 = trunc i64 %indvars.iv to i32
		store i32 %0, i32* %arrayidx, align 4
		%indvars.iv.next = add i64 %indvars.iv, 1
		%lftr.wideiv = trunc i64 %indvars.iv.next to i32
		%exitcond = icmp eq i32 %lftr.wideiv, %n
		br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2

		for.end: ; preds = %for.body, %entry
		ret void
		}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }		attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }

!0 = !{!0, !1}		!0 = !{!0, !1}
!1 = !{!"llvm.loop.vectorize.width", i32 8}		!1 = !{!"llvm.loop.vectorize.width", i32 8}
		!2 = !{!2, !3}
		!3 = !{!"llvm.loop.vectorize.width", !4}
		!4 = !{i32 8, i32 0}
		sdesmalen: nit: `s/i32 0/i1 false/`
		david-arm: If I make this change the test fails. We will still always print out "i32 0". Is there a way to control the format generated?
		david-arm: Sorry, I just realised this is an input to the test, rather than testing the output! Please ignore my comment.
		fhahn: also needs a test which actually sets `isScalable` to `true`?
		david-arm: Unfortunately I can't do that at the moment because the vectorizer crashes when trying to perform scalable vectorisation. This is why I only tested the 'false' case. In theory I could create a negative test that tests the vectoriser crashes, which we could remove later once we support vectorisation. Any thoughts @fhahn or @sdesmalen?
		sdesmalen: The proof of concept for scalable loop vectorization (D90343) depends on this patch and shows the `isScalable = true` case does something sensible with this information. Hopefully that is sufficient for now while we work on actually implementing scalable auto-vec support in the LoopVectorizer.
		fhahn: Hm, I am a bit reluctant to add something to LangRef which causes passes to crash on valid IR. Would it be possible to either convert the scalable VF into a fixed one (replacing n with 1 should be valid?), or just bail out on a scalable VF, without too much effort?
		sdesmalen:
		> Hm, I am a bit reluctant to add something to LangRef which causes passes to crash on valid IR. Would it be possible to either convert the scalable VF into a fixed one (replacing n with 1 should be valid?), or just bail out on a scalable VF, without too much effort?
		That would lead to a bit of a chicken/egg problem, because we're adding this feature so we can incrementally make the loop vectorizer cope with scalable vectors. There are currently too many code-paths in there to fix up in one go. D90343 uses this metadata to enable vectorization of an individual loop with scalable vectors and gives us a route to enable and test individual loops in unit tests or in existing code (using the Clang attribute in D89031). The fact that the vectorizer currently crashes on most inputs is a bug; it just happens to be a bug we know about. It is also a temporary broken state that we will want to fix as soon as possible. Is it not possible to add a line to the LangRef saying that this is experimental until the loop vectorizer supports scalable vectors? (And we remove that line when the vectorizer is stable enough.)
		fhahn:
		> That would lead to a bit of a chicken/egg problem, because we're adding this feature so we can incrementally make the loop vectorizer cope with scalable vectors. There are currently too many code-paths in there to fix up in one go. D90343 uses this metadata to enable vectorization of an individual loop with scalable vectors and gives us a route to enable and test individual loops in unit tests or in existing code (using the Clang attribute in D89031). The fact that the vectorizer currently crashes on most inputs, is a bug, it just happens to be a bug we know about. It is also a temporary broken state that we will want fix as soon as possible.
		That makes sense. My suggestion was to just convert the scalable user VF to a fixed one after getting the hint instead of crashing. As in:
		    // Get user vectorization factor and interleave count.
		    ElementCount UserVF = Hints.getWidth();
		  + if (UserVF.isScalable()) {
		  +   // TODO: Use scalable user VF, once LV is ready. For now, just assume n == 1.
		  +   UserVF = ElementCount::getFixed(UserVF.getKnownMinValue());
		  + }
		    unsigned UserIC = Hints.getInterleave();
		Only `processLoopInVPlanNativePath` and `processLoop` would need to be updated, I think. As we are free to drop the hints, I think that would be preferable to crashing and should be legal?
		sdesmalen: Okay, so you mean assuming Fixed for now and then re-enabling this in e.g. D91077 (which adds scalable vector support for some simple loop, but which doesn't yet guarantee that it won't crash on all inputs). Yes, I think that makes sense.
		fhahn: +1
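
The fallback agreed at the end of this thread, treating a scalable user VF as fixed until the vectorizer is ready, can be sketched standalone. `ElementCountSketch` and `assumeFixedUserVF` are hypothetical names introduced so the example compiles without LLVM; in the patch the equivalent expression would be `ElementCount::getFixed(UserVF.getKnownMinValue())`, as fhahn writes above.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::ElementCount.
struct ElementCountSketch {
  unsigned MinVal; // Known minimum element count.
  bool Scalable;   // True if the real count is MinVal * vscale.
};

// Drops the scalable flag, i.e. assumes vscale == 1, and leaves fixed
// counts untouched. The hint is advisory, so narrowing it is legal.
ElementCountSketch assumeFixedUserVF(ElementCountSketch UserVF) {
  if (UserVF.Scalable)
    UserVF.Scalable = false;
  return UserVF;
}
```

Because the known minimum is preserved, a hint of `vscale x 8` degrades to a fixed width of 8 rather than being discarded entirely.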
