This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Allow "vector_size" applied to Booleans
AbandonedPublic

Authored by simoll on Jun 3 2020, 4:29 AM.

Details

Summary

This is the vector_size alternative to D88905.

This patch extends Clang to allow 'bool' as a valid vector element type (attribute vector_size).

This is intended as the canonical type for SIMD masks and facilitates clean vector intrinsic declarations.
Vectors of i1 are supported on IR level and below down to many SIMD ISAs, such as AVX512, ARM SVE (fixed vector length) and the VE target (NEC SX-Aurora TSUBASA).

The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html

Diff Detail

Event Timeline

simoll created this revision.Jun 3 2020, 4:29 AM
simoll updated this revision to Diff 268140.Jun 3 2020, 4:36 AM

NFC. Undid stray change.

simoll marked 2 inline comments as done.Jun 4 2020, 2:56 AM
simoll added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
4707

Accidental change. Will undo.

clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
590 ↗(On Diff #268140)

This is an old comment that is no longer helpful. Will remove.

lenary added a subscriber: lenary.Jun 4 2020, 7:04 AM
lenary added inline comments.
clang/docs/LanguageExtensions.rst
473

It would be nice if this aside about non-bool vectors was more prominently displayed - it's something I hadn't realised before.

simoll marked an inline comment as done.Jun 4 2020, 7:08 AM
simoll added inline comments.
clang/docs/LanguageExtensions.rst
473

Yep. that caught me by surprise too. I'll move that sentence to the paragraph about GCC vectors above.

simoll updated this revision to Diff 268488.Jun 4 2020, 8:28 AM
  • Improved documentation for the size argument of vector_size.
  • Rebased
lenary removed a subscriber: lenary.Jun 9 2020, 2:51 AM
simoll updated this revision to Diff 281322.Jul 28 2020, 12:06 PM
  • Rebased
  • Implement EmitFromMemory
simoll updated this revision to Diff 281844.Jul 30 2020, 2:19 AM

Updated clang test

simoll updated this revision to Diff 281878.Jul 30 2020, 4:41 AM

NFC

  • Style updates
  • Rebased
kaz7 added a comment.Jul 31 2020, 2:24 AM

Thank you for implementing EmitFromMemory. We are locally trying to use this patch to implement vector mask intrinsic instructions on Aurora VE. It is working well with regression tests. We will try test-suite next.

rsandifo-arm added inline comments.
clang/docs/LanguageExtensions.rst
473

Sorry for the extremely late comment. Like @lenary I hadn't thought about this. I'd assumed that the vector woiuld still be a multiple of 8 bits in size, but I agree that's probably too restrictive to be the only option available.

In that case, would it make sense to add a separate attribute instead? I think it's too surprising to change the units of the existing attribute based on the element type. Perhaps we should even make it take two parameters: the total number of elements, and the number of bits per element. That might be more natural for some AVX and SVE combinations. We wouldn't need to supporrt all combinations from the outset, it's just a question whether we should make the syntax general enough to support it in future.

Perhaps we could do both: support vector_size for bool using byte sizes (and not allowing subbyte vector lengths), and add a new, more general attribute that allows subbyte lengths and explicit subbyte element sizes.

simoll added inline comments.Aug 4 2020, 5:40 AM
clang/docs/LanguageExtensions.rst
473

In that case, would it make sense to add a separate attribute instead? I think it's too surprising to change the units of the existing attribute based on the element type. Perhaps we should even make it take two parameters: the total number of elements, and the number of bits per element. That might be more natural for some AVX and SVE combinations. We wouldn't need to supporrt all combinations from the outset, it's just a question whether we should make the syntax general enough to support it in future.

I guess adding a new attribute makes sense mid to long term. For now, i'd want something that just does the job... ie, what is proposed here. We should absolutely document the semantics of vector_size properly.. it already is counterintuitive (broken, if you ask me).

Perhaps we could do both: support vector_size for bool using byte sizes (and not allowing subbyte vector lengths), and add a new, more general attribute that allows subbyte lengths and explicit subbyte element sizes.

Disallowing subbyte bool vectors actually makes this more complicated because these types are produced implicitly by compares of (legal) vector types. Consider this:

typedef int int3 __attribute__((vector_size(3 * sizeof(int))));
int3 A = ...;
int3 B = ...;
auto Z = A < B; // vector compare yielding a `bool vector_size(3)`-typed value.

Regarding your proposal for a separate subbyte element size and subbyte length, that may or may not make sense but it is surely something that should be discussed more broadly with its own proposal.

rsandifo-arm added inline comments.Aug 4 2020, 6:23 AM
clang/docs/LanguageExtensions.rst
473

Perhaps we could do both: support vector_size for bool using byte sizes (and not allowing subbyte vector lengths), and add a new, more general attribute that allows subbyte lengths and explicit subbyte element sizes.

Disallowing subbyte bool vectors actually makes this more complicated because these types are produced implicitly by compares of (legal) vector types. Consider this:

typedef int int3 __attribute__((vector_size(3 * sizeof(int))));
int3 A = ...;
int3 B = ...;
auto Z = A < B; // vector compare yielding a `bool vector_size(3)`-typed value.

Yeah, I understand the need for some way of creating subbyte vectors. I'm just not sure using the existing vector_size attribute would be the best choice for that case. I.e. the objection wasn't based on “keeping things simple” but more “keeping things consistent“.

That doesn't mean that the above code should be invalid. It just means that it wouldn't be possible to write the type of Z using the existing vector_size attribute.

(FWIW, vector_size was originally a GNUism and GCC stil requires vector sizes to be a power of 2, but I realise that isn't relevant here. And the same principle applies with s/3/4 in the above example anyway.)

Regarding your proposal for a separate subbyte element size and subbyte length, that may or may not make sense but it is surely something that should be discussed more broadly with its own proposal.

Yeah, I agree any new attribute would need to be discussed more widely.

simoll added inline comments.Aug 4 2020, 7:08 AM
clang/docs/LanguageExtensions.rst
473

Perhaps we could do both: support vector_size for bool using byte sizes (and not allowing subbyte vector lengths), and add a new, more general attribute that allows subbyte lengths and explicit subbyte element sizes.

Disallowing subbyte bool vectors actually makes this more complicated because these types are produced implicitly by compares of (legal) vector types. Consider this:

typedef int int3 __attribute__((vector_size(3 * sizeof(int))));
int3 A = ...;
int3 B = ...;
auto Z = A < B; // vector compare yielding a `bool vector_size(3)`-typed value.

Yeah, I understand the need for some way of creating subbyte vectors. I'm just not sure using the existing vector_size attribute would be the best choice for that case. I.e. the objection wasn't based on “keeping things simple” but more “keeping things consistent“.

That doesn't mean that the above code should be invalid. It just means that it wouldn't be possible to write the type of Z using the existing vector_size attribute.

IMHO being able to spell out every type ranks higher than being consistent with regards to a non-standard extension. That is, you could not do the assignment of A < B in C because there is no way to specify the type without auto or other C++ machinery.

(FWIW, vector_size was originally a GNUism and GCC stil requires vector sizes to be a power of 2, but I realise that isn't relevant here. And the same principle applies with s/3/4 in the above example anyway.)

Right, i overlooked the power-of-two constraint.

How much of a blocker are the subbyte sizes and the power-of-two constraint to you? I am asking because vector_size with those constraints would be good enough for us. Keeping the patch as it is mostly makes this extension potentially more useful to other SIMD/Vector users (and of course saves the work of changing it).
We could still lift that constraint (or switch to a new attribute) should the need arise.

rsandifo-arm added inline comments.Aug 4 2020, 9:24 AM
clang/docs/LanguageExtensions.rst
473

How much of a blocker are the subbyte sizes and the power-of-two constraint to you? I am asking because vector_size with those constraints would be good enough for us. Keeping the patch as it is mostly makes this extension potentially more useful to other SIMD/Vector users (and of course saves the work of changing it).
We could still lift that constraint (or switch to a new attribute) should the need arise.

The non-power-of-2 thing seems fine. It's simply removing a constraint and giving non-power-of-2 sizes their obvious meaning.

But I think changing the units in the vector_size is too surprising. I think a good indication is that we'd never have designed it like that if we were adding vector_size now. I think it's something that would often trip users up and that we'd have to keep explaining away.

It's just a personal opinion though.

simoll added inline comments.Aug 4 2020, 9:48 AM
clang/docs/LanguageExtensions.rst
473

But I think changing the units in the vector_size is too surprising. I think a good indication is that we'd never have designed it like that if we were adding vector_size now. I think it's something that would often trip users up and that we'd have to keep explaining away.

Ok. To make sure we are talking about the same thing here, you are suggesting:

typedef bool bool16 __attribute__((vector_size(2)));

Would be a vector of 16 bits stored in two bytes. Correct? If so, that's fine with me and i'll change the patch right away.

rsandifo-arm added inline comments.Aug 4 2020, 9:57 AM
clang/docs/LanguageExtensions.rst
473

But I think changing the units in the vector_size is too surprising. I think a good indication is that we'd never have designed it like that if we were adding vector_size now. I think it's something that would often trip users up and that we'd have to keep explaining away.

Ok. To make sure we are talking about the same thing here, you are suggesting:

typedef bool bool16 __attribute__((vector_size(2)));

Would be a vector of 16 bits stored in two bytes. Correct?

Yeah. And if anyone wants something like a vector of 4 bools in a 4-bit vector (which would certainly be reasonable in principle), we could leave that to some future, more general attribute syntax that doesn't have the same historical baggage as vector_size.

simoll planned changes to this revision.Aug 4 2020, 10:11 AM

Thanks for the feedback!
Planned change: interpret the N in vector_size(N) as the number of bytes in the type, such that, for example, a vector_size(2) bool vector holds 16 bits

simoll updated this revision to Diff 283632.EditedAug 6 2020, 9:04 AM
  • Fixed debug info representation for bool vectors (and added debug info test).
  • Interpret 8*N with the N of vector_size(N) as the numbers of bool elements of the vector.
simoll updated this revision to Diff 283636.Aug 6 2020, 9:18 AM

NFC. Cleanup.

simoll updated this revision to Diff 285282.Aug 13 2020, 1:28 AM
  • Fixed type printing & added type printing test.
  • Rebased.
kaz7 added a comment.Aug 19 2020, 5:54 PM

Thank you preparing this i1 patch and doing it on clang side only. We were testing this patch and sending problem reports. Now, we can use this patch without modifying llvm code. We can define vector mask types like below. Then, we can define intrinsic functions using this type. Those are converted to mask type, e.g v256i1, in the back end without adding bitcasts in ISelLowering.cpp. It helps developers very much. I hope this extension is accepted.

typedef double __vr __attribute__((__vector_size__(2048)));
#if __STDC_VERSION__ >= 199901L
// For C99
typedef _Bool __vm    __attribute__((__vector_size__(32)));
typedef _Bool __vm256 __attribute__((__vector_size__(32)));
typedef _Bool __vm512 __attribute__((__vector_size__(64)));
#else
#ifdef __cplusplus
// For C++
typedef bool __vm    __attribute__((__vector_size__(32)));
typedef bool __vm256 __attribute__((__vector_size__(32)));
typedef bool __vm512 __attribute__((__vector_size__(64)));
#else
#error need C++ or C99 to use vector intrinsics for VE
#endif
#endif

I'm not qualified to review the CodeGen stuff (or accept the patch, obvs. :-)) but FWIW, here are some comments on the doc and Sema side.

It might be good to have more Sema tests for valid and invalid usage, e.g. for which operators are valid and which aren't.

Thanks again for doing this btw.

clang/docs/LanguageExtensions.rst
459

Maybe “Unlike GCC,”? Not much in it though.

489

~ is listed twice. I guess “operators” (without qualification) might make people think of the C++ keyword, in which case assignment and [] are allowed too.

492

It might be worth clarifying this. With the alignment referring specifically to “integer type”, I wasn't sure what something like:

bool __attribute__((vector_size(256)))

would mean on targets that don't provide 256-byte integer types. Is the type still 256-byte aligned?

clang/include/clang/AST/Type.h
3248

Maybe isGenericBooleanVector(), so that the terminology is consistent? Just a suggestion though, not sure if it's an improvement.

clang/lib/Sema/SemaExpr.cpp
9775

Is this deliberately wider than isVectorSizeBoolean(), or does it amount to the same thing?

11914

In practice, won't this either be a no-op (e.g. for 4-byte-per-bool targets) or always trump the fallback CharTy case? I wasn't sure why the case was needed.

11941

Seems like it might be useful to support comparisons too (with false < true, as for scalars). E.g. I guess x == y would otherwise need to be written ~(x ^ y), which seems less readable. Supporting more operators would also help in templated contexts.

clang/lib/Sema/SemaExprCXX.cpp
6271

Any reason not to support this for booleans? ?: seems like a natural operation for all vector element types.

simoll planned changes to this revision.Aug 20 2020, 9:10 AM
simoll marked 6 inline comments as done.
simoll added inline comments.
clang/docs/LanguageExtensions.rst
492

Not exactly: It is the alignment of the corresponding integer type.
For example the following C code:

typedef bool bool256 __attribute__((vector_size(256)));
bool256 P;

gives you (x86_64 host, no machine flags specified):

%P = alloca i2048, align 32
clang/include/clang/AST/Type.h
3248

I gave it this specific name so where ever it is used, it documents that whatever depends on it is specific to vector_size bool. This should be cleaned up when a better boolean vector type is introduced.

clang/lib/Sema/SemaExpr.cpp
9775

That's an artifact. Will narrow this down to isVectorSizeBoolean().

11914

I guess so. Will remove this case.

11941

Agreed.

clang/lib/Sema/SemaExprCXX.cpp
6271

Sure, will allow it.

rsandifo-arm added inline comments.Aug 21 2020, 1:15 AM
clang/docs/LanguageExtensions.rst
492

I guess my point is pedantic, but what I meant was: does the concept of “the corresponding integer type” necessarily exist at the C/C++ level? E.g. how would you end up with the same LLVM IR statement using C integer types instead of vectors?

My worry was that “the alignment of the corresponding integer type” would only be meaningful to the user if they could identify what the corresponding (C) integer type actually was, and be able to determine its alignment that way.

simoll marked 4 inline comments as done.Aug 21 2020, 1:35 AM
simoll added inline comments.
clang/docs/LanguageExtensions.rst
492

To be honest, i hadn't given alignment much thought before. Actually, i am not too happy about bringing in integer types here.

I guess it's sensible to do no different than for the other vector types: take the size rounded up to the next power-ot-two capped by the target's maximum vector alignment.

simoll updated this revision to Diff 286971.Aug 21 2020, 2:12 AM
simoll edited the summary of this revision. (Show Details)
  • Rebased.
  • Allow comparisons on boolean vectors.
  • Restored result type for vector comparisons on other types.
  • Added operator, alignment and constexpr tests.
lenary removed a subscriber: lenary.Aug 21 2020, 2:23 AM

I'm not qualified to review the CodeGen stuff (or accept the patch, obvs. :-)) but FWIW, here are some comments on the doc and Sema side.

Thanks for your comments! Maybe you know people who could review the CodeGen side of this..?

The Hexagon builtins explicitly require bool vectors to have 8-bit wide bool elements. That clashes with our new vector type because bool vectors in builtin functions and vector_size bool vectors have the same internal representation in Clang. However, AFAIK Hexagon is the only target that actually need 8-bit bools. I see two possible ways out of this:

  1. Introduce a Bit type in Clang that is only constructible as a vector element type. This way, any existing users of generic bool vectors are undisturbed. VE can use Bit in its builtin declarations to match the vector_size bool type. We may re-purpose the Bit type for a proper boolean vector type in the future.
  2. Add a new target property: BoolInVectorSize that is the size of a bool element in boolean vectors. Hexagon, as most targets, would define BoolInVectorSize = BoolSize. VE would define BoolInVectorSize = 1 for the intended behavior.

My preference is for 2. since it means small changes to the current patch and less surprises than the other option. Opinions?

simoll updated this revision to Diff 287658.Aug 25 2020, 7:12 AM
  • made bool-vector layout target dependent (one byte per bool element on Hexagon, one bit everywhere else).
  • Added hvx-specific debug info test.
  • Updated docs.
simoll updated this revision to Diff 287692.Aug 25 2020, 9:32 AM

NFC. Make clang-tidy happy.

rsmith added a subscriber: rsmith.Aug 25 2020, 2:03 PM

Have you considered using the ext_vector_type attribute instead of vector_size for this? That would have a couple of advantages:

  • It's not a GCC attribute, so there's no risk that GCC would give different meaning to the same syntax (such as using one byte per vector element, which would be the natural meaning for a vector of bool as an extension of existing functionality).
  • ext_vector_type takes as an argument a number of elements, not a size in bytes, so you could support vectors of non-multiple-of-8 bools.

However, using either an ext_vector_type of bool or a vector_size of bool would still seem problematic in a couple of other ways:

  • Those types always produce a vector whose element type is the specified type; you can form an lvalue denoting the element, the element is always represented the same as the underlying type, and you can pun between a vector of N T and an array of N T. GCC even permits forming a pointer or reference to a vector element, and you should assume that Clang will eventually support that too. That would not be possible for bool if packed as a bit-vector.
  • You can already perform logical operations between our existing vector types -- for example, comparing two vectors of 4 ints. If vectors of bool are valid, it would be logical and natural to expect a vector of 4 bools to have some connection to the result of such a comparison, but it won't: the type of a comparison of two vectors of 4 ints is a vector of 4 ints each of which is either 0 or -1, and we can't convert that to a vector of 4 bools represented as our existing vector types, because such conversions would be interpreted as bitcasts, and we'd reject because the bitwidth of the vectors is different.
  • Vectors of integer types (such as bool) generally support arithmetic operators, which you presumably do not want to support here.
  • Our existing vector types carry a lot of baggage (for example, lax vector conversions, and the egregious attribute-on-typedef-changes-the-type hack) that we would not want a new type to include.

So I think you should seriously consider using a new syntax to form a bit-vector type. It seems to me that an N-bit bit-vector is quite similar to an _ExtInt(N), but without the arithmetic support and with a different expected IR type. Maybe following that syntactic path and adding _BitVector(N) syntax would work out better?

Thanks for dropping by!

Have you considered using the ext_vector_type attribute instead of vector_size for this? That would have a couple of advantages:

  • It's not a GCC attribute, so there's no risk that GCC would give different meaning to the same syntax (such as using one byte per vector element, which would be the natural meaning for a vector of bool as an extension of existing functionality).
  • ext_vector_type takes as an argument a number of elements, not a size in bytes, so you could support vectors of non-multiple-of-8 bools.

Whichever solution is going to land in Clang, we will end up in the situation that Clang supports bool vectors and GCC does not. Any code that uses this feature and is supposed to compile with GCC will need some #ifdef guard. This is why i am not too worried about (ab-)using the vector_size attribute for this.

However, using either an ext_vector_type of bool or a vector_size of bool would still seem problematic in a couple of other ways:

  • Those types always produce a vector whose element type is the specified type; you can form an lvalue denoting the element, the element is always represented the same as the underlying type, and you can pun between a vector of N T and an array of N T. GCC even permits forming a pointer or reference to a vector element, and you should assume that Clang will eventually support that too. That would not be possible for bool if packed as a bit-vector.
  • You can already perform logical operations between our existing vector types -- for example, comparing two vectors of 4 ints. If vectors of bool are valid, it would be logical and natural to expect a vector of 4 bools to have some connection to the result of such a comparison, but it won't: the type of a comparison of two vectors of 4 ints is a vector of 4 ints each of which is either 0 or -1, and we can't convert that to a vector of 4 bools represented as our existing vector types, because such conversions would be interpreted as bitcasts, and we'd reject because the bitwidth of the vectors is different.
  • Vectors of integer types (such as bool) generally support arithmetic operators, which you presumably do not want to support here.
  • Our existing vector types carry a lot of baggage (for example, lax vector conversions, and the egregious attribute-on-typedef-changes-the-type hack) that we would not want a new type to include.

I agree that the vector_size syntax for bool vectors raises some expectations on the behavior of the bool vector. However, i see this more as an issue of proper documentation. The moment somebody uses vector_size attributes they know they are deep in compiler-extension land and far from standard C/C++ semantics.
Now about the vector representation issues in Clang and how we could make this work for bool vectors: earlier, i suggested that we could introduce a Clang-internal Bit datatype that can only be used as a vector element type. By doing so, we wouldn't violate the invariant that bits(vector) == size_of_vector * bits(element_type).

So I think you should seriously consider using a new syntax to form a bit-vector type. It seems to me that an N-bit bit-vector is quite similar to an _ExtInt(N), but without the arithmetic support and with a different expected IR type. Maybe following that syntactic path and adding _BitVector(N) syntax would work out better?

I am open to a different syntax, if piggy backing on the vector_size syntax is the main point of contention. Up to now, the plan was to have vector_size bool now and work on a proper bool vector type to replace it.

simoll updated this revision to Diff 289893.Sep 4 2020, 1:38 AM

Rebased

I'd like to hear your feedback on how to proceed with this patch.

Here is what i would do:

For this patch, we switch to ext_vector_type for bool vectors to not clash head on with GCC should GCC decide to allow vector_size bool with a different semantics in the future.
For the future, we propose a _BitVector(N) extension possibly aiming for a C/C++ TS. We can build on our experience with the ext_vector_type bool for this.

simoll updated this revision to Diff 296135.Oct 5 2020, 3:16 AM
This comment was removed by simoll.
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2020, 3:16 AM
simoll updated this revision to Diff 296137.Oct 5 2020, 3:20 AM

(wrong Diff attached in earlier update)

  • rebased
simoll added a comment.Oct 6 2020, 8:34 AM

D88905 uses ext_vector_type instead of vector_size following up on @rsmith 's suggestion. The review should continue there.

I'll keep D81083 around for reference.

simoll abandoned this revision.Nov 11 2020, 1:00 AM

Taking this off the review queues - we will use the ext_vector_type attribute instead (D88905).