Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs.
This class of vector operation forms the basis of many scientific computations.
In vector-reduction arithmetic, the evaluation of the reduction function f is independent of the order of the input elements of the vector V.
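As a minimal illustration of the order-independence property, the following scalar sketch reduces an array to its maximum. The helper name `reduce_max_i64` is hypothetical and is not one of the intrinsics added by this patch; it only models what a max reduction computes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of avx512fintrin.h): reduce n values to
   their maximum with a plain left-to-right scan. */
long long reduce_max_i64(const long long *v, size_t n)
{
    assert(v != 0 && n > 0);   /* a max reduction needs at least one element */
    long long acc = v[0];
    for (size_t i = 1; i < n; ++i) {
        if (v[i] > acc)
            acc = v[i];        /* keep the larger of the running result and v[i] */
    }
    return acc;
}
```

Because max is associative and commutative, calling this helper on any permutation of the same elements should return the same scalar, which is what allows a vector implementation to combine lanes in whatever order is cheapest.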
Details
- Reviewers: AsafBadouh, delena, craig.topper, igorb, aymanmus
- Commits:
  - rG25eb42023355: [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min)…
  - rC285493: [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min)…
  - rL285493: [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min)…
Event Timeline
lib/Headers/avx512fintrin.h

| Line | Comment |
|---|---|
| 344 | Do we really need these new set1 macros? The epi ones should be fine, shouldn't they? |
| 9942 | long long |
| 9947 | long long |
| 9957 | long long |
| 9962 | long long |
| 9979 | intialize is misspelled, but even then I don't think this sentence reads right. |
| 9985 | Use uppercase 0xFFF.... to match the constants below. |
| 9996 | Can we just call the set1 macro outside and pass the result in for Neutral instead of needing T4? |
| 10015 | Use uppercase for consistency. |
| 10018 | long long |
| 10024 | long long |
| 10088 | Do these 512-bit shuffles get narrowed to 256-bit and 128-bit ops for the later stages due to the high bit undefs, or do we end up doing 512-bit operations all the way through? |
lib/Headers/avx512fintrin.h

| Line | Comment |
|---|---|
| 10088 | It will stay 512-bit all the way through. These intrinsics are only defined for AVX512F, so we can only use the 512-bit versions of the max and min intrinsics. |
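The staged reduction discussed in this thread can be modeled in scalar C. This is only a sketch under stated assumptions, not the macros in avx512fintrin.h: the function name `model_mask_reduce_max_i64` and the `LANES` constant are hypothetical, the rotate-by-half loop stands in for the 512-bit shuffles, and `LLONG_MIN` is assumed as the neutral element for a signed max.

```c
#include <assert.h>
#include <limits.h>

#define LANES 8   /* eight 64-bit lanes in one 512-bit vector */

/* Hypothetical scalar model of a masked 8-lane signed max reduction. */
long long model_mask_reduce_max_i64(unsigned char mask, const long long *v)
{
    assert(v != 0);
    long long t[LANES];

    /* Masked-off lanes are replaced by the neutral element for max,
       so they can never win a comparison against an active lane. */
    for (int i = 0; i < LANES; ++i)
        t[i] = ((mask >> i) & 1) ? v[i] : LLONG_MIN;

    /* log2(8) = 3 stages. Every stage operates on all eight lanes,
       mirroring the reply that the operations stay full width. */
    for (int half = LANES / 2; half >= 1; half /= 2) {
        long long s[LANES];
        for (int i = 0; i < LANES; ++i)
            s[i] = t[(i + half) % LANES];      /* stands in for the shuffle */
        for (int i = 0; i < LANES; ++i)
            t[i] = t[i] > s[i] ? t[i] : s[i];  /* full-width max */
    }
    return t[0];   /* lane 0 holds the reduced value */
}
```

After the last stage only lane 0 is meaningful. The upper lanes still get computed at every stage, which is the cost the question about narrowing to 256-bit and 128-bit operations was asking about.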
LGTM with that 1 comment.
lib/Headers/avx512fintrin.h

| Line | Comment |
|---|---|
| 10046 | extra space after the 4th -1 |