This is an archive of the discontinued LLVM Phabricator instance.

[X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs
ClosedPublic

Authored by mkuper on Sep 21 2015, 5:57 AM.

Diff Detail

Event Timeline

mkuper updated this revision to Diff 35237.Sep 21 2015, 5:57 AM
mkuper retitled this revision from to [X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs.
mkuper updated this object.
mkuper added a reviewer: AsafBadouh.
mkuper added a subscriber: cfe-commits.
AsafBadouh accepted this revision.Sep 21 2015, 6:26 AM
AsafBadouh edited edge metadata.

LGTM

This revision is now accepted and ready to land.Sep 21 2015, 6:26 AM
This revision was automatically updated to reflect the committed changes.

Regarding _mm[256]_cvtps_ph and ...cvtph_ps, I can find an Intel doc that say the _mm_* functions are in emmintrin.h and the _mm256_* are in immintrin.h, so putting them all with emmintrin.h is not consistent with what Intel says. I can find a Microsoft doc that say these are all available with immintrin.h (not emmintrin.h). It looks like gcc puts them with immintrin.h also.

Can we move the #include of f16cintrin.h from emmintrin.h to immintrin.h? It would be more compatible with everybody else, and equally inconsistent with Intel's doc AFAICT.

mkuper added a subscriber: mkuper.Nov 30 2015, 11:23 PM

Hi Paul,

I'd rather not move them into immintrin.h. The current situation is that if you include emmintrin.h you get "too many" intrinsics w.r.t to the Intel docs (and immintrin.h is just right, since it recursively includes emmintrin.h), while if we move it to immintrin.h, including emmintrin.h would provide "not enough" intrinsics. So, in terms of compatibility, I think this is slightly better.
Regarding MSVC and GCC - MSVC doesn't have an emmintrin.h (or any other specialized intrinsic file) and puts everything in immintrin.h. GCC, in my opinion, should probably also be fixed. :-)

On the other hand, I undersand why providing _mm256 stuff in emmintrin.h may be a problem. Perhaps we can split the file in two (or move the relevant intrinsics directly into emm/immintrin)?
That would means our header structure is slightly different from GCC's, but since this file should not be included directly anyway, that doesn't really bother me.
Does that sound reasonable to you?

Michael

Thanks Michael! Moving the mm256 stuff to immintrin.h ought to work. What actually motivates this is mucking around with modularizing the intrinsic headers, and the immediate problem is the duplicate typedefs for v8sf and m256 (which are in both f16cintrin.h and avxintrin.h). If we move the mm256 intrinsics to a point in immintrin.h that comes after where it includes avxintrin.h, then the duplicate typedefs can go away.

I'll see if I can get a patch for this up today.

Patch posted as D15127. please follow up there.