Page MenuHomePhabricator

[X86][SSE] Add _mm_undefined_* intrinsics
ClosedPublic

Authored by RKSimon on Aug 15 2015, 8:56 AM.

Details

Summary

Adds missing SSE/AVX 'undefined' intrinsics (PR24040):

_mm_undefined_pd + _mm256_undefined_pd
_mm_undefined_ps + _mm256_undefined_ps
_mm_undefined_si128 + _mm256_undefined_si256

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 32222.Aug 15 2015, 8:56 AM
RKSimon retitled this revision from to [X86][SSE] Add _mm_undefined_* intrinsics.
RKSimon updated this object.
RKSimon added reviewers: craig.topper, echristo, mkuper.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: cfe-commits.
mkuper edited edge metadata.Aug 16 2015, 12:40 AM

Thanks, Simon!
I've wanted to add the _undefined intrinsics for a while now, but never got to it.

Anyway, this sort of implementation somewhat worries me.
Yes, I know that the gcc intrinsics do something very similar.
And I also know that in practice we'll get an undef value, nothing worse (assuming reading an uninitialized automatic variable is undefined behavior to begin with - which really depends on the spec interpretation :-) ).
And I know this isn't likely to change anytime soon.

Still, relying on what may be undefined behavior in the header files worries me, and I'd rather not have it implemented like that.
I was thinking about adding a __builtin_undef which explicitly resolves to an undef value.
Does that make sense to you?

test/CodeGen/sse-undefined.c
1 ↗(On Diff #32222)

Perhaps a more explicit test?

Thanks, Simon!
I've wanted to add the _undefined intrinsics for a while now, but never got to it.

Anyway, this sort of implementation somewhat worries me.
Yes, I know that the gcc intrinsics do something very similar.
And I also know that in practice we'll get an undef value, nothing worse (assuming reading an uninitialized automatic variable is undefined behavior to begin with - which really depends on the spec interpretation :-) ).
And I know this isn't likely to change anytime soon.

Still, relying on what may be undefined behavior in the header files worries me, and I'd rather not have it implemented like that.
I was thinking about adding a __builtin_undef which explicitly resolves to an undef value.
Does that make sense to you?

__builtin_undef seems like a pretty big hammer and does not sound trivial to implement. Not all types that can be emitted have an undef representation. For example, x86_mmx doesn't have an undef representation because there can be no constants of that type. I'd recommend a more narrow implementation technique unless we really need a more general one.

Yes using that uninitialized value has worried me as well. I originally set it to zero (and considered using __ LINE __ or __ COUNTER __) but both introduce defined behaviour that I could see causing all sorts of problems further down the line in debug vs release builds. How undefined do we want our undefined to be? ;-)

I can create builtin_ia32_undef64mmx / builtin_ia32_undef128 / builtin_ia32_undef256 / builtin_ia32_undef512 if nobody can think of a better alternative?

I’m not sure how much people actually use these, but the AVX-512 versions of these, at least, can be very useful internally to implement AVX-512 intrinsics.
For AVX-512, we use the same GCC builtin for all 3 versions of the intrinsic (pass-through masked, set to zero masked, and unmasked). This is the same implementation that’s used in GCC, and is fairly clean, since the only difference is in the desired pass-through values (actual value, zero, or undef).

However, since we don’t actually have the undef intrinsics right now, we put a zero in the unmasked version as well, which is definitely a pessimization.
The plan is to change them to use undef once the undef intrinsics are implemented.

From: Eric Christopher [mailto:echristo@gmail.com]
Sent: Monday, August 17, 2015 21:33
To: reviews+D12052+public+a6057f04f570e35c@reviews.llvm.org; llvm-dev@redking.me.uk; craig.topper@gmail.com; Kuperstein, Michael M
Cc: david.majnemer@gmail.com; Badouh, Asaf; cfe-commits@lists.llvm.org; Richard Smith
Subject: Re: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics

RKSimon updated this revision to Diff 32514.Aug 19 2015, 1:21 AM
RKSimon edited edge metadata.

Added ia32 builtin undef intrinsics (I didn't bother with the mmx as I can't find any evidence of an undefined intrinsic for it). Added the avx512 intrinsics referenced in the intel intrinsics guide.

Technically there's nothing stopping us making these builtin more general (non x86 specific) - I don't know if people want us to go that way though?

I'll make the tests more explicit once we're happy that this is the way to go.

I think this is slightly less elegant than having a generic builtin, but I'm just fine with it, especially if David/Eric prefer it to the generic version.

Actually, thinking about it a bit more, a generic builtin most probably won't be more elegant.
LGTM.

This revision was automatically updated to reflect the committed changes.