This is an archive of the discontinued LLVM Phabricator instance.

[Clang][SVE] Parse builtin type string for scalable vectors
ClosedPublic

Authored by sdesmalen on Feb 27 2020, 1:05 PM.

Details

Summary

This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.

This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:

svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {

return svld1_s8(pg, base);

}

Diff Detail

Event Timeline

sdesmalen created this revision.Feb 27 2020, 1:05 PM
Herald added a project: Restricted Project. · View Herald Transcript

Please update the documentation comment at the top of Builtins.def .

Can we avoid a gigantic, expensive to parse arm_sve.h somehow? Given that the compiler has to know the signatures for all these buitins anyway, can we just make including arm_sve.h set some flag in the compiler to make it recognize all the SVE intrinsics? This is different from the way we handle intrinsics on other targets, but SVE seems like a good opportunity to try a new approach.

clang/lib/CodeGen/CodeGenFunction.cpp
507

I'm not sure that we'll ever want to take advantage of min-legal-vector-width on targets with scalable vectors... but if we did, we'd probably want to use getKnownMinSize() here

sdesmalen updated this revision to Diff 247659.Mar 2 2020, 8:41 AM
sdesmalen marked an inline comment as done.
  • Added comment describing q for scalable vectors to Builtins.def
  • Use getKnownMinSize() for calculating LargestVectorWidth

Can we avoid a gigantic, expensive to parse arm_sve.h somehow? Given that the compiler has to know the signatures for all these buitins anyway, can we just make including arm_sve.h set some flag in the compiler to make it recognize all the SVE intrinsics? This is different from the way we handle intrinsics on other targets, but SVE seems like a good opportunity to try a new approach.

We actually tried this and already simplified the header file as much as possible. The header file now only contains preprocessor macros and function declarations, but no definitions, so should be relatively easy to parse. All code-gen itself is handled in CGBuiltin.
The main reason we couldn't remove the header file is because of the short-forms/overloaded intrinsics. I wouldn't know of a way to create overloaded intrinsics without having to manually do the usual arithmetic conversions similar to how this is currently done for some overloaded builtins like SemaBuiltinAtomicOverloaded in SemaChecking.cpp.

I also wonder if this problem isn't better solved using pre-compiled modules at some point?

Please update the documentation comment at the top of Builtins.def .

Good spot, fixed it!

The main reason we couldn't remove the header file is because of the short-forms/overloaded intrinsics

How are you planning to handle them with your current approach?

Anyway, that isn't really relevant to the question of why we need to #define svld1_u8(...) __builtin_sve_svld1_u8(__VA_ARGS__), as opposed to just making svld1_u8 an intrinsic.

I wouldn't know of a way to create overloaded intrinsics without having to manually do the usual arithmetic conversions similar to how this is currently done for some overloaded builtins like SemaBuiltinAtomicOverloaded in SemaChecking.cpp.

The current Builtins.def infrastructure isn't really designed around it, but it would be possible to add overloaded intrinsics. Currently, when a builtin name is looked up, we create one FunctionDecl. But we could change it to make/return a list of FunctionDecls, instead, and pass them to standard overload resolution.

I also wonder if this problem isn't better solved using pre-compiled modules at some point?

Sort of? I mean, yes, loading a precompiled module is somewhat cheaper than parsing C. But it still isn't free, it doesn't really change the fact that the indirection isn't doing anything useful, and we're still making preprocessing/diagnostics more complicated by using defines.

How are you planning to handle them with your current approach?

Each prototype will get a arm_sve_alias attribute that specifies which builtin it relates to. For example:

__attribute__((__clang_arm_sve_alias(__builtin_sve_svld1_u8)))
svuint8_t svld1(svbool_t, uint8_t const *);

__attribute__((__clang_arm_sve_alias(__builtin_sve_svld1_u32)))
svuint32_t svld1(svbool_t, uint32_t const *);

This is similar to how this was done for MVE.

Anyway, that isn't really relevant to the question of why we need to #define svld1_u8(...) __builtin_sve_svld1_u8(__VA_ARGS__), as opposed to just making svld1_u8 an intrinsic.

This is because many of the builtins are protected by an architecture guard like #ifdef __ARM_FEATURE_SVE or #ifdef __ARM_FEATURE_SVE2.

I wouldn't know of a way to create overloaded intrinsics without having to manually do the usual arithmetic conversions similar to how this is currently done for some overloaded builtins like SemaBuiltinAtomicOverloaded in SemaChecking.cpp.

The current Builtins.def infrastructure isn't really designed around it, but it would be possible to add overloaded intrinsics. Currently, when a builtin name is looked up, we create one FunctionDecl. But we could change it to make/return a list of FunctionDecls, instead, and pass them to standard overload resolution.

Do you happen to know which method in Sema does this? I had a look before, but couldn't find where we could do something like this.

Sort of? I mean, yes, loading a precompiled module is somewhat cheaper than parsing C. But it still isn't free, it doesn't really change the fact that the indirection isn't doing anything useful, and we're still making preprocessing/diagnostics more complicated by using defines.

The reason for using #defines for the non-overloaded intrinsics was mainly performance, the preprocessor was much faster than having to parse all declarations, with the only downside being the redirection in the diagnostics.

This is because many of the builtins are protected by an architecture guard like #ifdef __ARM_FEATURE_SVE or #ifdef __ARM_FEATURE_SVE2.

I think TARGET_HEADER_BUILTIN has support for requiring target features. Granted, we don't use it much at the moment, so there might be some non-obvious issue with this.

Do you happen to know which method in Sema does this? I had a look before, but couldn't find where we could do something like this.

See Sema::LazilyCreateBuiltin, and its caller Sema::LookupBuiltin.

The reason for using #defines for the non-overloaded intrinsics was mainly performance, the preprocessor was much faster than having to parse all declarations, with the only downside being the redirection in the diagnostics.

Oh, that makes sense.

Do you happen to know which method in Sema does this? I had a look before, but couldn't find where we could do something like this.

See Sema::LazilyCreateBuiltin, and its caller Sema::LookupBuiltin.

Thanks for those pointers! I spoke with @simon_tatham today, who is also very interested in this in order to simplify the MVE header file.

I would like to suggest first trying to land the current implementation (using the header file with all the declarations), so that we can use the regression tests added as part of this effort to test a different implementation that does not rely as much on the header file. What do you think?

miyuki added a subscriber: miyuki.Mar 9 2020, 10:01 AM

Changing the way we expose the builtins isn't going to affect most of the code related to the SVE intrinsics. I'm fine sticking with a known working approach, and trying to address that issue later.

Changing the way we expose the builtins isn't going to affect most of the code related to the SVE intrinsics. I'm fine sticking with a known working approach, and trying to address that issue later.

Okay, thanks! In that case, are you happy with the patch as it is now?

  • Updated license header for the arm_sve.h file to use the LLVM license instead of MIT.
efriedma added inline comments.Mar 12 2020, 1:05 PM
clang/utils/TableGen/SveEmitter.cpp
87

I'd prefer to avoid depending on stdbool if it isn't necessary.

93

The ACLE documentation doesn't say anything about float16_t etc.?

sdesmalen marked 2 inline comments as done.Mar 13 2020, 2:42 AM
sdesmalen added inline comments.
clang/utils/TableGen/SveEmitter.cpp
87

Section 3.3 of the ACLE actually mentions this explicitly:

arm_sve.h includes stdint.h and so provides standard types such as uint32_t. When included from C code the header also includes stdbool.h and so provides the bool type.
93

In that same section, it says:

In addition, the header file defines the following scalar data types:
  float16_t equivalent to __fp16
efriedma accepted this revision.Mar 13 2020, 3:07 PM

LGTM

clang/utils/TableGen/SveEmitter.cpp
87

Oh, sorry, I was accidentally looking at an old version of the ACLE spec.

This revision is now accepted and ready to land.Mar 13 2020, 3:07 PM
This revision was automatically updated to reflect the committed changes.