Page MenuHomePhabricator

[SVE] Add support to vectorize_width loop pragma for scalable vectors
Needs ReviewPublic

Authored by david-arm on Oct 8 2020, 3:00 AM.

Details

Summary

This patch adds support for an optional second parameter passed to
the vectorize_width pragma, which indicates if the user wishes
to use fixed width or scalable vectorization. For example the user
can now write something like:

#pragma clang loop vectorize_width(4, fixed)

or

#pragma clang loop vectorize_width(4, scalable)

I have added a new 'scalable_numeric' state to the LoopHintAttr class
to indicate whether the numeric vectorization width is scalable or
not. When generating IR we make use of the new format for the
llvm.loop.vectorize.width attribute that allows us to effectively pass
an ElementCount that contains the vectorization factor and a scalable
flag. If the user tries to enable scalable vectorization for targets
that do not support it a warning is emitted and the scalable flag is
ignored.

Tests were added to

clang/test/CodeGenCXX/pragma-loop.cpp
clang/test/CodeGenCXX/pragma-scalable-loop.cpp

for both the 'fixed' and 'scalable' optional parameter.

Diff Detail

Event Timeline

david-arm created this revision.Oct 8 2020, 3:00 AM
Herald added a project: Restricted Project. · View Herald Transcript
fhahn added inline comments.Oct 15 2020, 1:56 PM
clang/lib/Parse/ParsePragma.cpp
1098

nit: unrelated change?

clang/lib/Sema/SemaStmtAttr.cpp
145

Is there a way to only accept fixed_width/scalable for targets that support it? Not sure if we have enough information here, but we might be able to reject it eg per target basis or something

david-arm marked an inline comment as done.Oct 19 2020, 5:16 AM
david-arm added inline comments.
clang/lib/Sema/SemaStmtAttr.cpp
145

Hi @fhahn, I think if possible we'd prefer not to reject scalable vectors at this point. Theoretically there is no reason why we can't perform scalable vectorisation for targets that don't have hardware support for scalable vectors. In this case it simply means that vscale is 1. If you want we could add some kind of opt-remark in the vectoriser that says something like "target does not support scalable vectors, vectorising for vscale=1"?

sdesmalen accepted this revision.Oct 30 2020, 8:42 AM

LGTM

clang/lib/Sema/SemaStmtAttr.cpp
145

I agree with @david-arm that we shouldn't prevent this in the front-end. Even if the architecture may not support scalable vectors natively, there may still be reasons to want to create scalable vectors in IR, for example to have more portable IR.

clang/test/CodeGenCXX/pragma-loop.cpp
167

out of curiosity, is there a particular reason you're testing it with a do-while loop instead of a shorter for-loop like the tests in for_template_constant_expression_test ?

This revision is now accepted and ready to land.Oct 30 2020, 8:42 AM
fhahn added inline comments.Oct 30 2020, 8:56 AM
clang/lib/Sema/SemaStmtAttr.cpp
145

Hm, I am just a bit worried that it might be a bit confusing to users that do not know what scalable vectors are (it is obvious when knowing all about SVE, but I would assume most people don't necessarily know what this means). I guess that is not the biggest deal, as long vectorize_width(X, scalable) works for every target.

Even if the architecture may not support scalable vectors natively, there may still be reasons to want to create scalable vectors in IR, for example to have more portable IR.

Sure, but there are so many other target-specific things encoded that make the IR really un-portable between targets. Granted, it is not impossible to convert IR between some architectures (as in arm64_32)

sdesmalen requested changes to this revision.Tue, Nov 3, 1:18 AM
sdesmalen added inline comments.
clang/docs/LanguageExtensions.rst
3049

Can you add a comment saying that the use of "scalable" is still experimental and is currently only intended to work for targets that support scalable vectors?

clang/lib/Sema/SemaStmtAttr.cpp
145

Sorry, forgot to reply to this.

Hm, I am just a bit worried that it might be a bit confusing to users that do not know what scalable vectors are (it is obvious when knowing all about SVE, but I would assume most people don't necessarily know what this means). I guess that is not the biggest deal, as long vectorize_width(X, scalable) works for every target.

At the moment this feature is still experimental, so I don't think any target would be able to return true to the question if this is supported :) That said, I agree that the compiler shouldn't crash for other targets after support in the loop-vectorizer stops being experimental. So I'm changing my mind here, and am happy to go with your suggestion to ignore the flag for other targets. When some default mechanism is added to lower scalable vectors to fixed-width vectors (for targets that don't natively support them), this check can probably be removed. @david-arm can you add some target hook to ignore the hint?

Sure, but there are so many other target-specific things encoded that make the IR really un-portable between targets. Granted, it is not impossible to convert IR between some architectures (as in arm64_32)

I didn't mean portable between targets, but more as keeping the length of the vector agnostic in IR and leaving it until code-generation to pick a suitable/available vector extension, so that the same IR could be used to generate code for Neon or 256bit SVE for example. This is more a hypothetical use-case at the moment though.

This revision now requires changes to proceed.Tue, Nov 3, 1:18 AM
david-arm updated this revision to Diff 302773.Wed, Nov 4, 1:20 AM
david-arm marked an inline comment as done.
david-arm edited the summary of this revision. (Show Details)
sdesmalen added inline comments.Wed, Nov 4, 7:44 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
939

From what I can see, the vectorize_width flag is not ignored, only the scalable property is. That means this should be:

'scalable' not supported by the target so assuming 'fixed' instead.
clang/lib/Sema/SemaStmtAttr.cpp
148

If the target does not support scalable vectors, it currently assumes "fixed". If we want to stick with that approach, the diagnostic message should be changed (see my other comment). The alternative is dropping the hint entirely by returning nullptr and changing the diagnostic message to say the hint is ignored. I could live with both options. @fhahn do you have a preference here?

nit: to reduce nesting, can you hoist this out one level, e.g.

if (StateLoc && StateLoc->Ident & ...)
  State = LoopHintAttr::ScalableNumeric;
else
  State = LoopHintAttr::Numeric;

if (State == LoopHintAttr::ScalableNumeric &&
    !S.Context.getTargetInfo().supportsScalableVectors()) {
  S.Diag(....);
  State = LoopHintAttr::Numeric;
}

I'll hold off on any more changes for now to give @fhahn a chance to reply to your comment @sdesmalen about the fallback behaviour when scalable vectorisation is unsupported.

clang/include/clang/Basic/DiagnosticSemaKinds.td
939

OK. I guess it's just when the warning comes out it appears at the start of the line so I wanted to emphasise that this relates to the scalable property passed to the vectorize_width attribute (rather than other attributes) as there could potentially be several pragmas on one line. I think it would be good to mention the vectorize_width pragma/attribute somewhere in the warning message to make it clear. I'll see if I can reword it.

This comment was removed by sdesmalen.
david-arm updated this revision to Diff 304488.Wed, Nov 11, 5:16 AM
david-arm marked an inline comment as done.
c-rhodes added inline comments.Wed, Nov 11, 5:57 AM
clang/docs/LanguageExtensions.rst
3032

nit: s/__value__/_value_/

I am very sorry that I am late to this... but I do have some concerns.

The concern that I have is that we extend vecorize_width with a scalable/fixed boolean, but there are more vectorisation pragma that set vectorisation options which imply enabling vectorisation:

Pragmas setting transformation options imply the transformation is enabled, as if it was enabled via the corresponding transformation pragma (e.g. vectorize(enable))

Thus, unless I miss something, I don't think this should be an option to vectorize_width, but to me it looks like we need a separate one, e.g.:

vectorize_scalable(enable|disable)

what do you think?

Hi @SjoerdMeijer I think that given we now support scalable vectors we thought it made sense to be able to specify whether the user wants 'fixed' or 'scalable' vectorisation along with the vector width, although without specifying the additional property the default continues to remain 'fixed'. However, what you said about having a vectorize_scalable pragma is correct and we are intending to also add a pragma like this in a future patch.

Hi @SjoerdMeijer I think that given we now support scalable vectors we thought it made sense to be able to specify whether the user wants 'fixed' or 'scalable' vectorisation along with the vector width, although without specifying the additional property the default continues to remain 'fixed'. However, what you said about having a vectorize_scalable pragma is correct and we are intending to also add a pragma like this in a future patch.

Okay, I haven't looked at the implementation to be honest, but am just trying to understand the different use cases of this first.
I just seem to be missing or not understanding why fixed/scalable is an option to only vectorize_width, why not to vectorize(enable) or just a separate one like vectorize_scalable? By making scalable/fixed and option to vectorize_width, you can't toggle this for other pragmas like interleave(enable) that enable vectorisation, which would be inconsistent? It also seems to be more work to me to do this first for vectorize_width, and then fix up other pragmas later. But I might be missing something (obivous) here.

Hi @SjoerdMeijer I think that given we now support scalable vectors we thought it made sense to be able to specify whether the user wants 'fixed' or 'scalable' vectorisation along with the vector width, although without specifying the additional property the default continues to remain 'fixed'. However, what you said about having a vectorize_scalable pragma is correct and we are intending to also add a pragma like this in a future patch.

Okay, I haven't looked at the implementation to be honest, but am just trying to understand the different use cases of this first.
I just seem to be missing or not understanding why fixed/scalable is an option to only vectorize_width, why not to vectorize(enable) or just a separate one like vectorize_scalable? By making scalable/fixed and option to vectorize_width, you can't toggle this for other pragmas like interleave(enable) that enable vectorisation, which would be inconsistent? It also seems to be more work to me to do this first for vectorize_width, and then fix up other pragmas later. But I might be missing something (obivous) here.

Hi @SjoerdMeijer, all valid and good questions. We think it makes sense to allow specifying explicitly what the meaning of '4' is when specifying the width. So that vectorize_width(4, fixed) means vectorizing with <4 x eltty> and vectorize_width(4, scalable) means vectorizing with <vscale x 4 x eltty>. Like @david-arm said, we also plan to add something like vectorize_style(fixed|scalable). This approach should be fully complementary to vectorize_with so that it would be possible to have:

// Use scalable vectors, but leave it to the cost-model to choose the most efficient N in <vscale x N x eltty>.
// If the pragma is not specified, it defaults to vectorize_style(fixed).
#pragma clang loop vectorize_style(scalable)

// Use <4 x eltty>
#pragma clang loop vectorize_width(4, fixed)

// Use <vscale x 4 x eltty>
#pragma clang loop vectorize_width(4, scalable)

// If vectorize_style(scalable) is specified, then use <vscale x 4 x eltty>, otherwise <4 x eltty>
#pragma clang loop vectorize_width(4)                           // uses <4 x eltty>
#pragma clang loop vectorize_width(4) vectorize_style(scalable) // uses <vscale x 4 x eltty>

// Conflicting options, clang should print diagnostic and error or ignore the hint.
#pragma clang loop vectorize_width(4, fixed) vectorize_style(scalable)

I hope that gives a bit more context.

This approach should be fully complementary to vectorize_with so that it would be possible to have:

// Use scalable vectors, but leave it to the cost-model to choose the most efficient N in <vscale x N x eltty>.
// If the pragma is not specified, it defaults to vectorize_style(fixed).
#pragma clang loop vectorize_style(scalable)

// Use <4 x eltty>
#pragma clang loop vectorize_width(4, fixed)

// Use <vscale x 4 x eltty>
#pragma clang loop vectorize_width(4, scalable)

// If vectorize_style(scalable) is specified, then use <vscale x 4 x eltty>, otherwise <4 x eltty>
#pragma clang loop vectorize_width(4)                           // uses <4 x eltty>
#pragma clang loop vectorize_width(4) vectorize_style(scalable) // uses <vscale x 4 x eltty>

// Conflicting options, clang should print diagnostic and error or ignore the hint.
#pragma clang loop vectorize_width(4, fixed) vectorize_style(scalable)

I hope that gives a bit more context.

Ok, thanks for clarifying that!

If:

// Use <vscale x 4 x eltty>
#pragma clang loop vectorize_width(4, scalable)

is equivalent to:

// uses <vscale x 4 x eltty>
#pragma clang loop vectorize_width(4) vectorize_style(scalable)

then I think that illustrates that I don't see the point of extending vectorize_width because we still can't express scalable vectorisation for:

// <VF x eltty>
#pragma clang loop vectorize_predicate(enable)

and also for interleave_count(4)?

Again, when the idea is to have vectorize_style anyway, wouldn't it be easier not to bother extending vectorize_width and just go for vectorize_style? It allows you to specify fixed/scalable vectorisation in one way, and avoids having conflicting options.

The other thing I thought about: this is extending an existing user-facing pragma, and notifying the list would probably be best thing to do.

As I see it there are a bunch of pragmas that all enable vectorisation, with each pragma providing a unit of information. One component of this information is the vectorisation factor hint provided by vectorize_width.

With the introduction of scalable vectors this hint is using the wrong datatype and thus needs to be updated to allow vectorize_width(#num,[fixed|scalable]) and vectorize_width([fixed|scalable]) along side the existing vectorize_width(#num) representation that effectively becomes an alias to vectorize_width(#num, fixed).

Doing this means all existing usages work as expected and there's now extra power to better guide the chosen vectorisation factor.

SjoerdMeijer added a comment.EditedWed, Nov 18, 8:14 AM

Because I was not understanding, we have discussed this further offline.

I think the conclusion was: pragma vectorize_width controls the vectorisation vector VF in <vscale x VF x elty>. where vscale is not just a separate thing but it defines a VectorType. That's why it would make sense to attach scalable|fixed to vectorize_width. I agree with this and seems reasonable.

I still don't see how the proposed extension here allows you to specify fixed width vectorisation for:

#pragma clang loop interleave_count(4)

targeting SVE and would appreciate if someone can comment on this example, but I won't be holding up this work anymore as this might be addressed later.

Thanks @david-arm for posting this proposal to the cfe list.
My confusion has been cleared up. The (new) proposal is to have:

  1. vectorize_width(X) where X is an integer.
  2. vectorize_width(X, fixed|scalable)
  3. vectorize_width(fixed|scalable)

And with that 3rd option I agree that this allows us to express everything we want, and this patch needs adapting to this new proposal (just stating the obvious for clarity/completeness)

If scalable is used on a target that doesn't support this, a warning and falling back to fixed seems like the right thing to do.

I did have concerns about this, similarly like @fhahn:

Hm, I am just a bit worried that it might be a bit confusing to users that do not know what scalable vectors are (it is obvious when knowing all about SVE, but I would assume most people don't necessarily know what this means). I guess that is not the biggest deal, as long vectorize_width(X, scalable) works for every target.

But since the new scalable option is opt-in, people don't need know about this if they don't want/need to, this should (hopefully) not be the case.