This is an archive of the discontinued LLVM Phabricator instance.

Specify Clang vector builtins.
ClosedPublic

Authored by fhahn on Oct 11 2021, 3:12 AM.

Diff Detail

Event Timeline

fhahn requested review of this revision.Oct 11 2021, 3:12 AM
fhahn created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2021, 3:12 AM
craig.topper added inline comments.Oct 11 2021, 5:31 AM
clang/docs/LanguageExtensions.rst
552

I'm not sure I understand what is being concatenated here.

scanon added inline comments.Oct 11 2021, 7:39 AM
clang/docs/LanguageExtensions.rst
538

"Prevailing rounding mode" is not super-useful, other than as a spelling for round-to-nearest-ties-to-even (IEEE 754 default rounding). Outside of a FENV_ACCESS ON context, there's not even really a notion of "prevailing rounding mode" to appeal to. I assume the intent is for this to lower to e.g. x86 ROUND* with the dynamic rounding-mode immediate.

I would recommend adding __builtin_elementwise_roundeven(T x) instead, which would statically bind IEEE default rounding (following TS 18661-1 naming) without having to appeal to prevailing rounding mode, and can still lower to ROUND* on x86 outside of FENV_ACCESS ON contexts, which is the norm for vector code (and FRINTN unconditionally on armv8). I think we can punt on rint/nearbyint for now, and add them in the future if there's a need.

scanon added inline comments.Oct 11 2021, 7:40 AM
clang/docs/LanguageExtensions.rst
565

Should be restricted to integer types.

scanon added inline comments.Oct 11 2021, 7:41 AM
clang/docs/LanguageExtensions.rst
565

(Never mind, somehow read this as & instead of \+.)

fhahn updated this revision to Diff 378992.Oct 12 2021, 5:56 AM

Try to be more precise about how the reduction steps are performed and replace _round and _rint by roundeven.

fhahn marked 4 inline comments as done.Oct 12 2021, 6:10 AM
fhahn added inline comments.
clang/docs/LanguageExtensions.rst
538

I removed rint and round for now and add` _roundeven` with the wording from TS 18661-1

552

I tried to spell it out more clearly. I'm still not sure if that spells it out as clearly as possibly and I'd appreciate any suggestions on how to improve the wording.

craig.topper added inline comments.Oct 12 2021, 8:15 AM
clang/docs/LanguageExtensions.rst
552

The input is a single vector. I'm not understanding where we get a second vector to concatenate.

fhahn updated this revision to Diff 379517.Oct 13 2021, 1:39 PM
fhahn marked 2 inline comments as done.

Another stab at phrasing the reduction step. Also added a note that the implementation is work-in-progress.

fhahn marked an inline comment as done.Oct 13 2021, 1:40 PM
fhahn added inline comments.
clang/docs/LanguageExtensions.rst
552

Oh yes, now I see where the confusion was coming from. I was thinking about the reduction tree and how the input is broken up. Sorry for the confusing wording. I gave it another try, should be much simpler again now.

scanon accepted this revision.Oct 13 2021, 1:50 PM

I'm happy with this now.

clang/docs/LanguageExtensions.rst
552

It's unclear because there's no apparent "first" or "second" vector; there's just a single argument, and the result isn't a vector, it's a scalar. I think you want to say something like: "the operation is repeatedly applied to adjacent pairs of elements until the result is a scalar" and then provide a worked example.

This revision is now accepted and ready to land.Oct 13 2021, 1:50 PM
craig.topper added inline comments.Oct 13 2021, 2:15 PM
clang/docs/LanguageExtensions.rst
552

Should it somehow mention the pair is the even element i and the odd element i+1. There are n-1 adjacent pairs in an n element vector, but we want non-overlapping pairs.

Should probably spell out the non-power2 behavior. Presumably we pad identity elements after the last element to widen the vector out to a power 2 and then proceed normally?

kparzysz added inline comments.
clang/docs/LanguageExtensions.rst
553

It's really not clear what "horizontal recursive pairwise" means unless one has read the mailing list discussions. Maybe you could spell it out, e.g. "recursive even-odd pairwise reduction" or something like that.

fhahn updated this revision to Diff 380351.Oct 18 2021, 5:11 AM
fhahn marked an inline comment as done.

Thanks for the latest set of comments!

I tried to incorporate the suggestions about improving the reduction wording. I also added an example.

I also put up 2 patches to start with the implementation of min/max and the abs builtins in D111985 and D111986. I adjusted the supported types for __builtin_elementwise_abs to *signed* integer and floating point types.

fhahn updated this revision to Diff 380352.Oct 18 2021, 5:18 AM
fhahn marked an inline comment as done.

adjust padding wording.

fhahn marked 2 inline comments as done.Oct 18 2021, 5:18 AM
fhahn added inline comments.
clang/docs/LanguageExtensions.rst
552

Should it somehow mention the pair is the even element i and the odd element i+1. There are n-1 adjacent pairs in an n element vector, but we want non-overlapping pairs.

Thanks, I tried to update the wording to make it clear that it operates on even-odd non-overlapping pairs.

Should probably spell out the non-power2 behavior. Presumably we pad identity elements after the last element to widen the vector out to a power 2 and then proceed normally?

Good point, done!

I think you want to say something like: "the operation is repeatedly applied to adjacent pairs of elements until the result is a scalar" and then provide a worked example.

Used and added an example.

553

Thanks, I used that wording!

craig.topper added inline comments.Oct 18 2021, 11:07 AM
clang/docs/LanguageExtensions.rst
557

widening -> widened

fhahn updated this revision to Diff 380500.Oct 18 2021, 12:53 PM
fhahn marked an inline comment as done.

Fix wording: widening -> widened, thanks!

fhahn marked an inline comment as done.Oct 18 2021, 12:54 PM
fhahn added inline comments.
clang/docs/LanguageExtensions.rst
557

thanks, should be fixed!

kito-cheng added inline comments.
clang/docs/LanguageExtensions.rst
579

The example above use __builtin_reduce_fadd, but not listed here? or should we just use __builtin_reduce_add for floating point and fix the example?

fhahn updated this revision to Diff 380598.Oct 19 2021, 12:56 AM
fhahn marked an inline comment as done.

Thanks @kito-cheng, the example should use __builtin_reduce_add instead of _fadd! Fixed

fhahn marked an inline comment as done.Oct 19 2021, 1:16 AM
fhahn added inline comments.
clang/docs/LanguageExtensions.rst
579

Thanks it should be _add instead of _fadd. fixed.

fhahn updated this revision to Diff 380612.Oct 19 2021, 2:04 AM
fhahn marked an inline comment as done.

Following feedback from D111986, explicitly spell out abs behavior of most negative integer as undefined.

fhahn updated this revision to Diff 380755.Oct 19 2021, 12:32 PM

As @scanon pointed out in D111986, most simd implementations should handle abs(INT_MIN) consistently by returngin INT_MIN. It's therefore better to avoid defining abs(INT_MIN) as UB, which would make the intrinsic more difficult to use. I updated the abs definition to spell out the behavior of abs(INT_MIN) as returning INT_MIN.

This revision was automatically updated to reflect the committed changes.