This is an archive of the discontinued LLVM Phabricator instance.

Regarding _mm[256]_cvtps_ph and ...cvtph_ps, I can find an Intel doc that says the _mm_* functions are in emmintrin.h and the _mm256_* ones are in immintrin.h, so putting them all in emmintrin.h is not consistent with what Intel says. I can find a Microsoft doc that says these are all available with immintrin.h (not emmintrin.h). It looks like gcc puts them in immintrin.h also.

Can we move the #include of f16cintrin.h from emmintrin.h to immintrin.h? It would be more compatible with everybody else, and equally inconsistent with Intel's doc AFAICT.
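For reference, a minimal sketch of the portable spelling under discussion: the code includes immintrin.h, which per the docs cited above exposes both conversions on all three compilers. The helper name roundtrip_through_half is hypothetical, and the target attribute is an assumption that lets GCC or Clang build it without -mf16c; running it still needs a CPU with F16C.

```c
#include <immintrin.h>

/* Hypothetical helper: convert one float to half precision and back.
   The target attribute (GCC/Clang only) is an assumption made so the
   sketch builds without passing -mf16c on the command line. */
static float __attribute__((__target__("f16c")))
roundtrip_through_half(float x)
{
    __m128  in   = _mm_set_ss(x);                                /* x in lane 0 */
    __m128i half = _mm_cvtps_ph(in, _MM_FROUND_TO_NEAREST_INT);  /* float -> half */
    __m128  out  = _mm_cvtph_ps(half);                           /* half -> float */
    return _mm_cvtss_f32(out);                                   /* read lane 0 */
}
```

Values that are exactly representable in half precision, such as 1.5f, come back unchanged; values such as 0.1f come back rounded.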

Hi Paul,

I'd rather not move them into immintrin.h. The current situation is that if you include emmintrin.h you get "too many" intrinsics w.r.t. the Intel docs (and immintrin.h is just right, since it recursively includes emmintrin.h), while if we move it to immintrin.h, including emmintrin.h would provide "not enough" intrinsics. So, in terms of compatibility, I think the current arrangement is slightly better.
Regarding MSVC and GCC - MSVC doesn't have an emmintrin.h (or any other specialized intrinsic file) and puts everything in immintrin.h. GCC, in my opinion, should probably also be fixed. :-)

On the other hand, I understand why providing _mm256 stuff in emmintrin.h may be a problem. Perhaps we can split the file in two (or move the relevant intrinsics directly into emm/immintrin)?
That would mean our header structure is slightly different from GCC's, but since this file should not be included directly anyway, that doesn't really bother me.
Does that sound reasonable to you?

Michael

Thanks Michael! Moving the mm256 stuff to immintrin.h ought to work. What actually motivates this is mucking around with modularizing the intrinsic headers, and the immediate problem is the duplicate typedefs for v8sf and m256 (which are in both f16cintrin.h and avxintrin.h). If we move the mm256 intrinsics to a point in immintrin.h that comes after where it includes avxintrin.h, then the duplicate typedefs can go away.
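A rough sketch of the layout this would produce, using stand-in names rather than the real headers: the 256-bit vector types are defined once (the role avxintrin.h plays), and the code that follows uses them without repeating the typedefs. Every hyp_ and HYP_ name is hypothetical and only illustrates the ordering.

```c
/* Stand-in for avxintrin.h: the single place the 256-bit types are defined. */
#ifndef HYP_AVX_TYPES_DEFINED
#define HYP_AVX_TYPES_DEFINED
typedef float hyp_v8sf __attribute__((__vector_size__(32)));
typedef float hyp_m256 __attribute__((__vector_size__(32)));
#endif

/* Stand-in for the 256-bit f16c section placed after that include:
   it relies on the types above and declares no typedefs of its own. */
static inline float hyp_first_lane(const hyp_m256 *v)
{
    return (*v)[0]; /* vector subscripting is a GCC/Clang extension */
}
```

Because only one header owns the typedefs, a modular build sees a single definition of each type.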

I'll see if I can get a patch for this up today.

Patch posted as D15127. Please follow up there.

Revision Contents

Path (Size)

cfe/trunk/lib/Headers/emmintrin.h (2 lines)
cfe/trunk/lib/Headers/f16cintrin.h (4 lines)

Diff 35250

cfe/trunk/lib/Headers/emmintrin.h

 typedef long long __m128i __attribute__((__vector_size__(16)));

 /* Type defines. */
 typedef double __v2df __attribute__ ((__vector_size__ (16)));
 typedef long long __v2di __attribute__ ((__vector_size__ (16)));
 typedef short __v8hi __attribute__((__vector_size__(16)));
 typedef char __v16qi __attribute__((__vector_size__(16)));

+#include <f16cintrin.h>
+
 /* Define the default attributes for the functions in this file. */
 #define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__, __target__("sse2")))

 static __inline__ __m128d __DEFAULT_FN_ATTRS
 _mm_add_sd(__m128d __a, __m128d __b)
 {
   __a[0] += __b[0];
   return __a;

cfe/trunk/lib/Headers/f16cintrin.h

  * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  *
  *===-----------------------------------------------------------------------===
  */

-#if !defined __X86INTRIN_H && !defined __IMMINTRIN_H
-#error "Never use <f16cintrin.h> directly; include <x86intrin.h> instead."
+#if !defined __X86INTRIN_H && !defined __EMMINTRIN_H && !defined __IMMINTRIN_H
+#error "Never use <f16cintrin.h> directly; include <emmintrin.h> instead."
 #endif

 #ifndef __F16CINTRIN_H
 #define __F16CINTRIN_H

 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32)));
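The guard changed above follows a common pattern: a sub-header refuses to compile unless one of its umbrella headers has already defined its own include-guard macro. A self-contained sketch with hypothetical HYP_ names (not the real header macros) might look like this:

```c
/* Normally defined by the umbrella header before it includes the sub-header;
   defined here by hand so the sketch compiles as a single file. */
#define HYP_EMMINTRIN_H

/* Sub-header part: reject direct inclusion. */
#if !defined HYP_X86INTRIN_H && !defined HYP_EMMINTRIN_H && !defined HYP_IMMINTRIN_H
#error "Never use the sub-header directly; include an umbrella header instead."
#endif

#ifndef HYP_F16CINTRIN_H
#define HYP_F16CINTRIN_H
/* Hypothetical marker function standing in for the header's real contents. */
static inline int hyp_subheader_loaded(void) { return 1; }
#endif
```

Removing the first #define makes the #error fire, which is what a direct include of the sub-header would trigger.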


[X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs (Closed, Public)
