This is an archive of the discontinued LLVM Phabricator instance.

[x86] AVX FP compare builtins should require AVX target feature (PR28112)
ClosedPublic

Authored by spatel on Jun 13 2016, 1:37 PM.

Details

Summary

This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112

The FP comparison intrinsics that take an immediate parameter rather than specifying a comparison predicate in the function name were added with AVX (these are macros in avxintrin.h). This makes clang behave more like like gcc and matches the Intel documentation, eg:
VCMPPS: m128 _mm_cmp_ps(m128 a, __m128 b, const int imm)

'V' means this is intended to only work with the AVX form of the instruction.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 60596.Jun 13 2016, 1:37 PM
spatel retitled this revision from to [x86] AVX FP compare builtins should require AVX target feature (PR28112).
spatel updated this object.
spatel added reviewers: echristo, bogner, RKSimon.
spatel added a subscriber: cfe-commits.
echristo edited edge metadata.Jun 13 2016, 3:53 PM

The 128 bit versions should only be selecting for sse functions and shouldn't need avx to work? What instructions are getting emitted here?

The 128 bit versions should only be selecting for sse functions and shouldn't need avx to work? What instructions are getting emitted here?

No, the 128-bit versions of these C intrinsics are strictly for AVX versions of the instructions (eg, vcmpps).

Example from the bug report:

cmp = _mm_cmp_ps((__m128)a, (__m128)b, 8);

We're currently emitting illegal SSE instructions like:

cmpps	$0x8, %xmm1, %xmm0  <--- anything over '7' is reserved/undef
RKSimon edited edge metadata.Jun 15 2016, 9:57 AM

It seems like part of the need for this is because the _mm_cmp_ps style intrinsics are defined as macros (to get around the problem of trying to use an immediate as an argument):

#define _mm_cmp_ps(a, b, c) __extension__ ({ \
  (__m128)__builtin_ia32_cmpps((__v4sf)(__m128)(a), \
                               (__v4sf)(__m128)(b), (c)); })

which means clang can't use a target("avx") attribute to stop their use.

Given that I'm happy with this patch's approach - anyone else have any suggestions?

RKSimon accepted this revision.Jun 21 2016, 11:31 AM
RKSimon edited edge metadata.

LGTM - the compile warning is clear and it could be a problem if we allow undefined values through on pre-AVX targets.

The only other thing we could do is handle these in CGBuiltin and 'accept' 0-7 values through on sse/sse2 targets and assert on other values but I don't see how this would be better.

This revision is now accepted and ready to land.Jun 21 2016, 11:31 AM
This revision was automatically updated to reflect the committed changes.