Page MenuHomePhabricator

[Sema][BFloat] Forbid arithmetic on vectors of bfloat.
ClosedPublic

Authored by simon_tatham on Jul 31 2020, 1:27 AM.

Details

Summary

Vectors of bfloat are a storage format only; you're supposed to
explicitly convert them to a wider type to do arithmetic on them.
But currently, if you write something like

bfloat16x4_t test(bfloat16x4_t a, bfloat16x4_t b) { return a + b; }

then the clang frontend accepts it without error, and (ARM or AArch64)
isel fails to generate code for it.

Added a rule in Sema that forbids the attempt from even being made,
and tests that check it. In particular, we also outlaw arithmetic
between vectors of bfloat and any other vector type.

Patch by Luke Cheeseman.

Diff Detail

Event Timeline

simon_tatham created this revision.Jul 31 2020, 1:27 AM
simon_tatham requested review of this revision.Jul 31 2020, 1:27 AM
LukeGeeson accepted this revision.Jul 31 2020, 7:47 AM

This seems sensible and benign, LGTM

This revision is now accepted and ready to land.Jul 31 2020, 7:47 AM
jfb added a comment.Jul 31 2020, 8:45 AM

Is that true of all vector bfloat implementations? It seems like arithmetic on these types is something implementations would likely support.

In D85009#2187549, @jfb wrote:

Is that true of all vector bfloat implementations? It seems like arithmetic on these types is something implementations would likely support.

As I understand it, Arm currently has the only implementation in clang so far. But if other targets disagree, we can make this conditional on getVectorKind(), so that VectorType::NeonVector gets this restriction and other vector types get whatever they need.

jfb added a comment.Jul 31 2020, 9:14 AM
In D85009#2187549, @jfb wrote:

Is that true of all vector bfloat implementations? It seems like arithmetic on these types is something implementations would likely support.

As I understand it, Arm currently has the only implementation in clang so far. But if other targets disagree, we can make this conditional on getVectorKind(), so that VectorType::NeonVector gets this restriction and other vector types get whatever they need.

You mean: only aarch64 backend supports lowering bfloat16 vectors at the moment? Because the clang support isn't "ARM bfloat", it's just bfloat. The tests are ARM bfloat and I think that's fine (i.e. Sema should be able to check ISA-specific problems), but in general this property your checking for seems like a target property.

If I write C or C++ code using bfloat, I'd like to know what that type actually means and what I can do with it. As a developer, it'll be super frustrating once other targets support bfloat... should those target have their own bfloat (because it won't be compatible with ARM's), or should bfloat work differently on different targets?

I actually don't know what the intended approach is here, which is why I'm asking :)

In D85009#2187621, @jfb wrote:
In D85009#2187549, @jfb wrote:

Is that true of all vector bfloat implementations? It seems like arithmetic on these types is something implementations would likely support.

As I understand it, Arm currently has the only implementation in clang so far. But if other targets disagree, we can make this conditional on getVectorKind(), so that VectorType::NeonVector gets this restriction and other vector types get whatever they need.

You mean: only aarch64 backend supports lowering bfloat16 vectors at the moment? Because the clang support isn't "ARM bfloat", it's just bfloat. The tests are ARM bfloat and I think that's fine (i.e. Sema should be able to check ISA-specific problems), but in general this property your checking for seems like a target property.

If I write C or C++ code using bfloat, I'd like to know what that type actually means and what I can do with it. As a developer, it'll be super frustrating once other targets support bfloat... should those target have their own bfloat (because it won't be compatible with ARM's), or should bfloat work differently on different targets?

I actually don't know what the intended approach is here, which is why I'm asking :)

Yes there is an Intel bfloat type too, however we are the only target for the bfloat c/ir type so far. The jury is also out as far as the standards are concerned too, the best we can do now is prevent behavior we know is not compatible, and like Simon says, add some predication later

In D85009#2187621, @jfb wrote:

You mean: only aarch64 backend supports lowering bfloat16 vectors at the moment?

Yes, sorry – I should have said that Arm has the only implementation in an LLVM target. I meant the only one "in clang" in the sense of "compiled into the overall clang binary", which was unclear of me.

I agree that from a front end / language specification perspective this is tricky. That's why I mentioned the vector kind. The scalar bfloat type may have to have the same source-language semantics across all targets, but when it comes to vectors, each target will define a different set of vector types. The Arm header files will be defining something along the lines of

typedef __attribute__((neon_vector_type(8))) bfloat16_t bfloat16x8_t;

and the next target that wants to use a vector of bfloat will presumably do something similar with a different foo_vector_type attribute (and quite likely a different set of vector lengths too).

Vector architectures are more or less certain to vary in the range of operations they permit, so it seems reasonable to me that clang will end up wanting to treat a neon_vector_type vector of bfloats differently from whatever other foo_vector_type is declared. They'll be different types, and conditioning behavior on which one you've got is essentially a way to make it target-specific.

jfb added a comment.Jul 31 2020, 9:25 AM
In D85009#2187621, @jfb wrote:
In D85009#2187549, @jfb wrote:

Is that true of all vector bfloat implementations? It seems like arithmetic on these types is something implementations would likely support.

As I understand it, Arm currently has the only implementation in clang so far. But if other targets disagree, we can make this conditional on getVectorKind(), so that VectorType::NeonVector gets this restriction and other vector types get whatever they need.

You mean: only aarch64 backend supports lowering bfloat16 vectors at the moment? Because the clang support isn't "ARM bfloat", it's just bfloat. The tests are ARM bfloat and I think that's fine (i.e. Sema should be able to check ISA-specific problems), but in general this property your checking for seems like a target property.

If I write C or C++ code using bfloat, I'd like to know what that type actually means and what I can do with it. As a developer, it'll be super frustrating once other targets support bfloat... should those target have their own bfloat (because it won't be compatible with ARM's), or should bfloat work differently on different targets?

I actually don't know what the intended approach is here, which is why I'm asking :)

Yes there is an Intel bfloat type too, however we are the only target for the bfloat c/ir type so far. The jury is also out as far as the standards are concerned too, the best we can do now is prevent behavior we know is not compatible, and like Simon says, add some predication later

Language-wise I think https://wg21.link/p1467 is where C++ is going, and C is taking a similar approach.

I'd like to make sure this is well thought out. Not just "the ISA does this, let's do the same". We know other ISAs act differently, and I'm not clear on what the intended behavior will be for people writing C and C++ code.

In D85009#2187643, @jfb wrote:

Language-wise I think https://wg21.link/p1467 is where C++ is going, and C is taking a similar approach.

That doesn't seem to mention vectors at all. As I said on Friday, I wouldn't disagree with the idea that scalar bfloat should have consistent semantics across compiler targets.

I'd like to make sure this is well thought out. Not just "the ISA does this, let's do the same". We know other ISAs act differently, and I'm not clear on what the intended behavior will be for people writing C and C++ code.

But C-language bindings for vector subsystems have always been strongly tied to the capabilities of the ISA. You enable them in the first place by including some target-specific header such as arm_neon.h or wmmintrin.h or what have you, and then the available operations are whatever are defined by the organization that specified that header.

In this case, bfloat16x4_t and bfloat16x8_t are type names defined in arm_neon.h. That header is defined by Arm (in the ACLE spec), and you only include it and use its definitions if you know you're targeting the Arm ISA. So it makes sense that the available operations should correspond to the things you can do efficiently on that ISA.

If you wanted cross-platform vector bfloat code, you'd include some other header that nobody has defined yet. Or, more likely, you'd just write scalar bfloat source code, and rely on the compiler to vectorise it for each target.

This discussion seems to have wound down. I'll land this patch tomorrow on the strength of @LukeGeeson's review, unless you have strong objections, @jfb?