[libcxx] <experimental/simd> Add ABI tags, class template simd/simd_mask implementations.
[libcxx] <experimental/simd> Add traits is_abi_tag[_v], is_simd[_v] and is_simd_mask[_v].
[libcxx] <experimental/simd> Add related tests.
Details
- Reviewers
philnik - Group Reviewers
Restricted Project - Commits
- rG0e30dd44adc9: [libcxx] <experimental/simd> Add ABI tags, class template simd/simd_mask…
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Yes, that's essentially it. I would suggest a patch that just removes everything from the header and the tests and then have a stack that re-implements the already-implemented stuff. Then the stack can be landed together. I think it would make sense to not implementing anything new in this stack between removing the implementation and having everything that is already implemented again, so it's essentially just a bunch of refactoring. On top of that we can them implement more of the interface. Looking at our current implementation, there is not that much implemented, so I guess the stack shouldn't be that large.
libcxx/include/experimental/__simd/scalar.h | ||
---|---|---|
17 | Yes, it's the more typical way, but AFAICT mostly for symmetry with _LIBCPP_BEGIN_NAMESPACE_STD. There it makes a lot more sense IMO, since that namespace actually changes depending on the configuration. That's not the case for experimental stuff. We also don't have _LIBCPP_BEGIN_NAMESPACE_STD_RANGES/FORMAT/etc.. |
Ideally you would use ISO/IEC TS 19570:2018 which is the real Standard. Based on the editor's report N4809 [1] there should be no normative changes between the Standard and this document.
Thanks for working on this! Incidentally, I recently received a report that our simd implementation was broken in all sorts of way, and it's great to get it sorted out.
libcxx/include/experimental/__simd/scalar.h | ||
---|---|---|
17 | Is there any reason for not doing (formatted however you prefer): _LIBCPP_BEGIN_NAMESPACE_STD namespace experimental { inline namespace parallelism_v2 { .... }} _LIBCPP_END_NAMESPACE_STD ? Then we could get rid of _LIBCPP_BEGIN_NAMESPACE_EXPERIMENTAL_SIMD_ABI altogether. | |
libcxx/include/experimental/simd | ||
12–55 | I would like to guard this whole header under _LIBCPP_ENABLE_EXPERIMENTAL. Users should use -fexperimental-library to be able to use the TSes. |
libcxx/include/experimental/__simd/scalar.h | ||
---|---|---|
17 | That was basically the idea, just that I wouldn't bother with the versioned namespace, since it's unstable anyways. |
Update the revision. Now it is a child revision of D144698. It includes following stuff:
[libcxx] <experimental/simd> Add ABI tags, class template simd/simd_mask implementations.
[libcxx] <experimental/simd> Add traits is_abi_tag[_v], is_simd[_v] and is_simd_mask[_v].
[libcxx] <experimental/simd> Add related tests.
libcxx/include/CMakeLists.txt | ||
---|---|---|
898 | I think it would be good to add a status paper, so it's easier to see what it already implemented and what you want to implement in a particular patch. I'd be fine with just having the big groups in it for now and split them up as we implement more. You can look at libcxx/docs/Status/Zip{.rst,Projects.csv} for an example. Section,Description,Dependencies,Assignee,Complete | [parallel.exceptions], "Parallel Exceptions", None, unassigned, |Not Started| | [parallel.execpol], "Execution policies", None, unassigned, |Not Started| | [parallel.alg], "Parallel algorithms", None, unassigned, |Not Started| | [parallel.taskblock], "Task Block", None, unassigned, |Not Started| | [parallel.simd.abi], "simd ABI tags", None, unassigned, |Not Started| | [parallel.simd.traits], "simd type traits", None, unassigned, |Not Started| | [parallel.simd.whereexpr], "Where expression class templates", None, unassigned, |Not Started| | [parallel.simd.class], "Class template simd", None, unassigned, |Not Started| | [parallel.simd.nonmembers], "simd non-member operations", None, unassigned, |Not Started| | [parallel.simd.mask.class], "Reductions", None, unassigned, |Not Started| | [parallel.simd.mask.nonmembers], "Non-member operations", None, unassigned, |Not Started| | |
libcxx/include/experimental/simd | ||
59–60 | Please use the granularized headers. | |
68 | Since this is all new code, please clang-format the whole file. | |
libcxx/test/std/experimental/simd/test_utils.h | ||
38–53 | We have libcxx/test/support/type_algorithms.h to help with this kind of stuff. |
libcxx/include/experimental/simd | ||
---|---|---|
68 | I tried git clang-format HEAD~1 and it said clang-format did not modify any files. Is there anything in this file that does not conform to the LLVM/libcxx Coding Standards? |
Please mark comments you have addressed as "Done" to make it easier to see what you have already worked on.
libcxx/include/experimental/simd | ||
---|---|---|
68 | git clang-format doesn't know this file is supposed to be formatted, since it doesn't have a file extension. You can (and should) remove libcxx/include/experimental/simd from libcxx/utils/data/ignore_format.txt. The CI then shows you what has to be formatted. It should also be possible to do clang-format libcxx/include/experimental/simd. That way clang-format knows that it's supposed to format this file. | |
71–75 | I think we should define the aliases in the headers where the actual tags are. | |
74 | This seems like something that might change from platform to platform. Could you add a TODO: make this platform dependent? | |
85–93 | These should probably get their own headers. I guess the classes are not super small. | |
96–99 | I think it would be easier to move this to it's own header, and then specialize is_abi_tag per ABI tag, i.e. // traits.h template <class _Tp> inline constexpr bool is_abi_tag_v = false; template <class _Tp> struct is_abi_tag : bool_constant<is_abi_tag_v<_Tp>> {}; // scalar.h template <> inline constexpr bool is_abi_tag_v<__scalar> = true; // vec_ext.h template <int _Np> inline constexpr bool is_abi_tag_v<__vec_ext<_Np>> = true; Note that this also implements is_abi_tag in terms of is_abi_tag_v and not the other way around. Most people just use the latter, so we should avoid instantiating a class every time. This part also applies to is_simd and is_simd_mask. |
libcxx/include/experimental/simd | ||
---|---|---|
85–93 | In our design of the SIMD library structure, these specializations of the simd/simd_mask will not have their own headers. This is because the implementations are not divided by the ABI tags of the user interface layer, but by the internal ABI tags. These internal tags do not correspond one-to-one with the user interface layer tags. Let me briefly introduce the current structure design: We divide the structure into three layers, with the specific implementations in the second and third layers. Each layer uses the ABI tags corresponding to that layer for template specialization. The first layer is the external user interface layer, which corresponds to the external ABI tags given in the specification, such as scalar, native, compatible, and fixed_size. The second layer is the common internal implementation layer, which is platform-independent. This layer corresponds to the internal ABI tags __scalar and __vec_ext. The implementation of the __vec_ext ABI tag uses Clang vector extensions (GCC builtins) as the storage type and Clang vector operators and built-ins to implement vector operations. We have tested this on various platforms and confirmed that the implementation of __vec_ext can correctly generate SIMD instructions supported by the current platform. We believe this will be a common solution for all platforms and will also serve as a fallback when the platform does not support SIMD. The third layer is the platform-dependent optimization layer. Since the common layer implementation can correctly generate SIMD instructions, we only need to use some unique platform features or instructions to optimize parts of the implementation with poor performance on the common layer. This work has not yet started, and we may use some platform-specific internal ABI tags such as __AVX, __NEON, etc. at that time. Therefore, |
libcxx/include/experimental/simd | ||
---|---|---|
74 |
These tags will not be platform dependent until the third layer work is started. | |
85–93 |
These specializations has no direct corresponding implementation. In the current implementation, they all indirectly correspond to __vec_ext, but this may also change in the future due to changes of ABI tags. So it seems inappropriate to put them all in vec_ext. h. |
libcxx/include/experimental/simd | ||
---|---|---|
96–99 | This looks great, but there are often references to other interfaces in the implementation of traits. If traits are to be implemented in different ABI tag headers, we may need to add some declarations to the outer layer of these headers. For example, class simd/simd_mask will be referenced in the is_simd/is_simd_mask implementation. Should I use a new header (declaration. h) to add these declarations? |
@philnik Hi! I have resolved most of the comments and marked them as done. Comments that have not yet been marked may require further discussion.
Do you have any other comments? If not, I will continue to organize and submit the remaining content based on this patch. That may also contribute to the discussion of the overall framework.
By the way, does CI need to be fixed? CI errors do not seem to be introduced by this patch.
The CI problems are unrelated.
libcxx/include/experimental/__simd/declarations.h | ||
---|---|---|
15–23 ↗ | (On Diff #513524) | There can be a few more newlines throughout. |
libcxx/include/experimental/simd | ||
74 | Then I'd like a TODO for this, just in case someone else continues the work. | |
85–93 | Moving them into separate headers is for ease of maintenance and avoiding transitively including too many symbols. | |
96–99 | You just specialize the traits in the header where the class is defined. No need to forward-declare anything. |
libcxx/include/experimental/simd | ||
---|---|---|
96–99 | Sorry, my previous description was not clear enough. The main issue is that in our design, for more code reuse and more flexible use of internal interfaces, the structure of the external interface layer and internal implementation layer is not consistent. And template specialization does not directly occur on external interfaces such as simd/simd_mask classes. We have designed four modules, SIMD/MASK storage type and SIMD/MASK operations, for different ABI tags at the implementation layer, which will be used by various external interfaces, including simd/simd_mask, where_expression and non-member functions. i.e. // simd template <class _Tp, class _Abi> class simd { using _Impl = __simd_operations<_Tp, _Abi>; using _Storage = __simd_storage<_Tp, _Abi>; _Storage __s_; simd operator+(const simd& __lhs, const simd& __rhs) { return _Impl::__plus(__lhs.__s_, __rhs.__s_); } } // scalar.h template <class _Tp> struct __simd_storage<_Tp, simd_abi::__scalar> { ...... }; template <class _Tp> struct __simd_operations<_Tp, simd_abi::__scalar> { ...... }; // vec_ext.h template <class _Tp, int _Np> struct __simd_storage<_Tp, simd_abi::__vec_ext<_Np>> { ...... }; template <class _Tp, int _Np> struct __simd_operations<_Tp, simd_abi::__vec_ext<_Np>> { ...... }; Since the specialization does not occur on external interfaces, the external interfaces is now directly placed in the simd header. if the implementation of simd traits is to be placed in a separate header, a forward declaration is required, and if it is placed together with the external interfaces, it will also be placed in the simd header. Do you have any better suggestions for the current code structure? |
libcxx/include/experimental/simd | ||
---|---|---|
96–99 | In addition, most simd traits are platform independent, is it not necessary to specialize for all traits? |
libcxx/include/experimental/simd | ||
---|---|---|
96–99 | I'm a bit confused what exactly you are talking about. Was this meant for a different thread? |
libcxx/include/experimental/simd | ||
---|---|---|
96–99 | I'm sorry my description confused you. Maybe I shouldn't have put this content in this thread. I just want to discuss the overall structure and the use of template specialization. And they have some impact on the implementation of simd traits. What I mainly want to express is that since we do not directly specialize templates on external user interfaces, there will not be directly definitions and implementations of such as |
LGTM % nits. (and sorry for being so slow in reviewing this)
libcxx/test/std/experimental/simd/simd.traits/is_abi_tag.pass.cpp | ||
---|---|---|
56 | ||
57 | Couldn't the static_asserts be inlined here? The CheckIsAbiTag* could simply be tag types. | |
libcxx/test/std/experimental/simd/test_utils.h | ||
36 | This doesn't really seem with it given that it already just ex. |
The updated tests look really nice! Thanks!
You probably have to rebase on top of main to make the CI happy.
libcxx/test/std/experimental/simd/simd.traits/is_abi_tag.pass.cpp | ||
---|---|---|
31–39 | clang-format will be happy with this. | |
libcxx/test/std/experimental/simd/test_utils.h | ||
21–22 | This is just integer_types, right? |
libcxx/test/std/experimental/simd/test_utils.h | ||
---|---|---|
21–22 | No, it is not the same. The first template parameter of the simd class does not accept input of bool type. And the vector type corresponding to bool type is defined by simd_mask class. In "type_algorithms.h", integer_types has the following definition: // libcxx/test/support/type_algorithms.h:98 using integral_types = concatenate_t<character_types, signed_integer_types, unsigned_integer_types, type_list<bool> >; It contains type bool. And we need arithmetic types without bool to test class simd. |
I encountered a problem in subsequent development that currently seems to need to be resolved in this patch.
In the previous comment, you suggested that I place all simd traits in a separate header file traits.h. During the process of adding the rest traits implementations, I found that most of them require the use of other interfaces placed in the simd header, similar to is_simd requires the use of simd classes. Therefore, I added declaration.h as forward declarations. However, there are too many interfaces that need to be referenced in the future, which is not suitable to add forward declarations for all.
I think there are two solutions.
One is to keep all external user interfaces under the simd header, just like our previous implementation.
Another is to categorize all external user interfaces into separate header files. No implementations in the simd header and only including other header files in order. Then we may need to add traits.h, simd.h, simd_mask.h, aligned_tag.h, abi_tag.h, simd_casts.h, where_expression.h, etc.
Which one do you prefer?
I think it's easier to gollow the code when splitting up the headers, so I'd go for option 2.
libcxx/test/std/experimental/simd/test_utils.h | ||
---|---|---|
21–22 | You show the implementation of integral_types, not integer_types ;-) |
@philnik Hi! Could you please help me check the CI errors?
Last time it prompted me to remove the following statement in transitive includes of cxx23 and cxx26.
*** 174,179 **** --- 174,180 ---- experimental/regexregex experimental/setexperimental/memory_resource experimental/setset - experimental/simdcstddef experimental/stringexperimental/memory_resource experimental/stringstring experimental/type_traitsinitializer_list error: command failed with exit status: 1
But this time after I removed, it prompted me to add.
*** 174,179 **** --- 174,180 ---- experimental/regexregex experimental/setexperimental/memory_resource experimental/setset + experimental/simdcstddef experimental/stringexperimental/memory_resource experimental/stringstring experimental/type_traitsinitializer_list error: command failed with exit status: 1
It looks to me like adding it is the correct thing. Do you know which bot didn't like the added line?
libcxx/include/experimental/simd | ||
---|---|---|
61 | This check should be done after including the detail headers. |
libcxx/test/std/experimental/simd/test_utils.h | ||
---|---|---|
23 | These shouldn't be _Uglified, since they are part of the test code. | |
33 | This makes this part a bit more complex, but the recursion inside the tests isn't there anymore, making them quite a bit simpler. | |
36 | It doesn't look like the F argument is ever used. Why did you add it? | |
40–42 |
I've applied for patch locally and reworked the instantiation logic a bit. This is my solution:
//===----------------------------------------------------------------------===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // //===----------------------------------------------------------------------===// #ifndef TEST_UTIL_H #define TEST_UTIL_H #include <utility> #include <experimental/simd> #include "type_algorithms.h" namespace ex = std::experimental::parallelism_v2; constexpr std::size_t max_simd_size = 32; template <template <std::size_t N> class F> struct TestAllSimdAbiFunctor { template <class T, std::size_t N> using sized_abis = types::type_list<ex::simd_abi::fixed_size<N>, ex::simd_abi::deduce_t<T, N>>; template <class T, std::size_t... Ns> void instantiate_with_n(std::index_sequence<Ns...>) { (types::for_each(sized_abis<T, Ns + 1>{}, F<Ns + 1>{}), ...); } template <class T> void operator()() { using abis = types::type_list<ex::simd_abi::scalar, ex::simd_abi::native<T>, ex::simd_abi::compatible<T>>; types::for_each(abis{}, F<1>()); instantiate_with_n<T>(std::make_index_sequence<max_simd_size - 1>{}); } }; template <template <std::size_t N> class Func> void test_all_simd_abi() { using arithmetic_no_bool_types = types::concatenate_t<types::integer_types, types::floating_point_types>; types::for_each(arithmetic_no_bool_types(), TestAllSimdAbiFunctor<Func>()); } #endif // TEST_UTIL_H
//===----------------------------------------------------------------------===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // //===----------------------------------------------------------------------===// // UNSUPPORTED: c++03, c++11, c++14 // <experimental/simd> // // [simd.traits] // template <class T> struct is_abi_tag; // template <class T> inline constexpr bool ex::is_abi_tag_v = ex::is_abi_tag<T>::value; #include "../test_utils.h" #include <experimental/simd> namespace ex = std::experimental::parallelism_v2; class EmptyEntry {}; template <std::size_t N> struct Test { template <class SimdAbi> void operator()() { static_assert(ex::is_abi_tag<SimdAbi>::value); static_assert(!ex::is_abi_tag<EmptyEntry>::value); static_assert(!ex::is_abi_tag<ex::simd< EmptyEntry, SimdAbi>>::value); static_assert(!ex::is_abi_tag<ex::simd_mask< EmptyEntry, SimdAbi>>::value); static_assert(ex::is_abi_tag_v<SimdAbi>); static_assert(!ex::is_abi_tag_v<EmptyEntry>); static_assert(!ex::is_abi_tag_v<ex::simd< EmptyEntry, SimdAbi>>); static_assert(!ex::is_abi_tag_v<ex::simd_mask< EmptyEntry, SimdAbi>>); } }; int main(int, char**) { test_all_simd_abi<Test>(); return 0; }
I think this version is quite a bit easier to follow. WDYT?
The solution looks great. It works well overall, but there's one aspect I believe could be improved. Since a basic simd need type and simdabi 2 factors, type will usually be used in most of the test functors. I intend to pass the type to F.
For clarity, I've made a comparison of the codes which you can review at the following link: https://www.diffchecker.com/2ja2Qq7t/
I think it would be good to add a status paper, so it's easier to see what it already implemented and what you want to implement in a particular patch. I'd be fine with just having the big groups in it for now and split them up as we implement more. You can look at libcxx/docs/Status/Zip{.rst,Projects.csv} for an example.
This should be a good start: