This is an archive of the discontinued LLVM Phabricator instance.

emmintrin.h documentation fixes and updates
Closed · Public

Authored by dyung on Dec 21 2017, 2:14 PM.


Details

Reviewers

RKSimon
craig.topper
kromanova
spatel

Commits

rG0686df106c39: [DOXYGEN] Fix doxygen and content issues in emmintrin.h
rC321669: [DOXYGEN] Fix doxygen and content issues in emmintrin.h
rL321669: [DOXYGEN] Fix doxygen and content issues in emmintrin.h

Summary

This is the result of several patches we made internally to update the documentation that we would like to have reviewed for possible submission.

The changes include:

Fixed inaccurate instruction mappings for various intrinsics.
Fixed description of NaN handling in comparison intrinsics.
Unify description of _mm_store_pd1 to match _mm_store1_pd.
Fix incorrect wording in various intrinsic descriptions. Previously the descriptions used "low-order" and "high-order" when the intended meaning was "even-indexed" and "odd-indexed".
Fix typos.
Add missing italics command (\a) for params and fixed some parameter spellings.
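
As a rough illustration of the NaN item above: the results the revised comments give for the COMISD / UCOMISD comparison intrinsics (0 for the eq, lt, le, gt and ge forms, 1 for the neq form, whenever either lower value is NaN) are the same results the ordinary C comparison operators give on doubles. The sketch below is a scalar model only. The model_* and check_nan_model names are hypothetical, are not part of emmintrin.h, and the code does not call the intrinsics themselves.

```c
#include <assert.h>
#include <math.h>

/* Scalar reference model (hypothetical names, not part of emmintrin.h).
   Each helper applies the plain C comparison operator to the two "lower"
   values and yields 0 for false, 1 for true. */
static int model_comieq(double a, double b)  { return a == b; }
static int model_comilt(double a, double b)  { return a <  b; }
static int model_comile(double a, double b)  { return a <= b; }
static int model_comigt(double a, double b)  { return a >  b; }
static int model_comige(double a, double b)  { return a >= b; }
static int model_comineq(double a, double b) { return a != b; }

/* Checks the NaN rows of the revised documentation against the model. */
static void check_nan_model(void)
{
  const double qnan = NAN; /* quiet NaN constant from <math.h> */

  /* Every ordered comparison involving NaN is false. */
  assert(model_comieq(qnan, 1.0) == 0);
  assert(model_comilt(qnan, 1.0) == 0);
  assert(model_comile(1.0, qnan) == 0);
  assert(model_comigt(qnan, 1.0) == 0);
  assert(model_comige(1.0, qnan) == 0);

  /* "Not equal" is the one form that is true when an operand is NaN. */
  assert(model_comineq(qnan, 1.0) == 1);
  assert(model_comineq(qnan, qnan) == 1);
}
```

Calling check_nan_model() exercises only the NaN cases; for ordinary values the helpers behave like the usual comparisons, for example model_comilt(1.0, 2.0) is 1.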

These patches were made by Craig Flores.

Diff Detail

Repository: rL LLVM

Event Timeline

dyung created this revision. Dec 21 2017, 2:14 PM

RKSimon added reviewers: RKSimon, craig.topper, kromanova, spatel. Dec 22 2017, 4:25 AM

LGTM

This revision is now accepted and ready to land. Dec 22 2017, 7:17 PM

Closed by commit rL321669: [DOXYGEN] Fix doxygen and content issues in emmintrin.h (authored by dyung). Jan 2 2018, 12:40 PM

This revision was automatically updated to reflect the committed changes.

kromanova added inline comments. Jan 4 2018, 8:12 PM

cfe/trunk/lib/Headers/emmintrin.h
1143	Formatting is inconsistent with the rest of the changes above and below. Here one sentence is separated by empty lines, whereas everywhere else there are two sentences.
4683	I'm not sure about this change. Intel documentation says they generate MOVDQ2Q (I don't have icc handy to try). However, I've tried on Linux/X86_64 with clang and gcc, and we just return.

kromanova added inline comments. Jan 8 2018, 6:35 PM

cfe/trunk/lib/Headers/emmintrin.h
3865	It's better to use the same wording as many other intrinsics do, both "before" and "after", just for consistency:
    /// This intrinsic is a utility function and does not correspond to a specific
    /// instruction.
4683	Though I suspect it's possible to generate movdq2q, I couldn't come up with a test to trigger this instruction generation. Should we revert this change?
    __m64 fooepi64_pi64 (__m128i a, __m128 c)
    {
      __m64 x;
      x = _mm_movepi64_pi64 (a);
      return x;
    }
On Linux we generate a return instruction. I would expect (v)movq %xmm0,%rax to be generated instead of retq. Am I missing something? Why do we return a 64-bit integer in an xmm register rather than in %rax?
4700	For Linux x86_64 I can only generate VMOVQ (or MOVQ) instructions, for the AVX and non-AVX cases respectively. Can we even generate MOVD+VMOVQ? How do we want to document this intrinsic? I have a similar question as above.
    __m128i foopi64_epi64 (__m64 a)
    {
      __m128i x;
      x = _mm_movpi64_epi64 (a);
      return x;
    }
Why do we generate this code
    vmovq %xmm0, %rax
    vmovq %rax, %xmm0
    retq
instead of something simple like vmovq %rdi, %xmm0?

Herald added a subscriber: llvm-commits. Jan 8 2018, 6:35 PM

efriedma added a subscriber: efriedma. Jan 8 2018, 6:45 PM

efriedma added inline comments.

cfe/trunk/lib/Headers/emmintrin.h
4683	The x86-64 calling convention rules say that __m64 is passed/returned in SSE registers. Try the following, which generates movdq2q:
    __m64 foo(__m128i a, __m128 c)
    {
      return _mm_add_pi8(_mm_movepi64_pi64(a), _mm_set1_pi8(5));
    }

kromanova added inline comments. Jan 8 2018, 8:04 PM

cfe/trunk/lib/Headers/emmintrin.h
4683	Thanks! That explains it :) I can see that MOVDQ2Q gets generated. What about the intrinsic below, _mm_movpi64_epi64? Can we ever generate MOVD+VMOVQ as stated in the review? Or should we write VMOVQ / MOVQ?

efriedma added inline comments. Jan 9 2018, 12:45 PM

cfe/trunk/lib/Headers/emmintrin.h
4683	Testcase:
    #include <immintrin.h>
    __m128 foo(__m128i a, __m128 c)
    {
      return _mm_movpi64_epi64(_mm_add_pi8(_mm_movepi64_pi64(a), _mm_set1_pi8(5)));
    }
In this case, we should generate movq2dq, but currently don't (I assume due to a missing DAGCombine). I don't see how you could ever get MOVD... but see https://reviews.llvm.org/rL321898, which could be the source of some confusion.

Revision Contents

Path: cfe/trunk/lib/Headers/emmintrin.h
Size: 111 lines

Diff 128447

cfe/trunk/lib/Headers/emmintrin.h

[... 211 lines not shown ...]
static __inline__ __m128d __DEFAULT_FN_ATTRS		static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_div_pd(__m128d __a, __m128d __b)		_mm_div_pd(__m128d __a, __m128d __b)
{		{
return (__m128d)((__v2df)__a / (__v2df)__b);		return (__m128d)((__v2df)__a / (__v2df)__b);
}		}

/// \brief Calculates the square root of the lower double-precision value of		/// \brief Calculates the square root of the lower double-precision value of
/// the second operand and returns it in the lower 64 bits of the result.		/// the second operand and returns it in the lower 64 bits of the result.
/// The upper 64 bits of the result are copied from the upper double-		/// The upper 64 bits of the result are copied from the upper
/// precision value of the first operand.		/// double-precision value of the first operand.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VSQRTSD / SQRTSD </c> instruction.		/// This intrinsic corresponds to the <c> VSQRTSD / SQRTSD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double] containing one of the operands. The		/// A 128-bit vector of [2 x double] containing one of the operands. The
/// upper 64 bits of this operand are copied to the upper 64 bits of the		/// upper 64 bits of this operand are copied to the upper 64 bits of the
[... 25 lines not shown ...]
static __inline__ __m128d __DEFAULT_FN_ATTRS		static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_sqrt_pd(__m128d __a)		_mm_sqrt_pd(__m128d __a)
{		{
return __builtin_ia32_sqrtpd((__v2df)__a);		return __builtin_ia32_sqrtpd((__v2df)__a);
}		}

/// \brief Compares lower 64-bit double-precision values of both operands, and		/// \brief Compares lower 64-bit double-precision values of both operands, and
/// returns the lesser of the pair of values in the lower 64-bits of the		/// returns the lesser of the pair of values in the lower 64-bits of the
/// result. The upper 64 bits of the result are copied from the upper double-		/// result. The upper 64 bits of the result are copied from the upper
/// precision value of the first operand.		/// double-precision value of the first operand.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VMINSD / MINSD </c> instruction.		/// This intrinsic corresponds to the <c> VMINSD / MINSD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double] containing one of the operands. The		/// A 128-bit vector of [2 x double] containing one of the operands. The
/// lower 64 bits of this operand are used in the comparison.		/// lower 64 bits of this operand are used in the comparison.
[... 26 lines not shown ...]
static __inline__ __m128d __DEFAULT_FN_ATTRS		static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_min_pd(__m128d __a, __m128d __b)		_mm_min_pd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_minpd((__v2df)__a, (__v2df)__b);		return __builtin_ia32_minpd((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares lower 64-bit double-precision values of both operands, and		/// \brief Compares lower 64-bit double-precision values of both operands, and
/// returns the greater of the pair of values in the lower 64-bits of the		/// returns the greater of the pair of values in the lower 64-bits of the
/// result. The upper 64 bits of the result are copied from the upper double-		/// result. The upper 64 bits of the result are copied from the upper
/// precision value of the first operand.		/// double-precision value of the first operand.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VMAXSD / MAXSD </c> instruction.		/// This intrinsic corresponds to the <c> VMAXSD / MAXSD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double] containing one of the operands. The		/// A 128-bit vector of [2 x double] containing one of the operands. The
/// lower 64 bits of this operand are used in the comparison.		/// lower 64 bits of this operand are used in the comparison.
[... 661 lines not shown ...]
static __inline__ __m128d __DEFAULT_FN_ATTRS		static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_cmpnge_sd(__m128d __a, __m128d __b)		_mm_cmpnge_sd(__m128d __a, __m128d __b)
{		{
__m128d __c = __builtin_ia32_cmpnlesd((__v2df)__b, (__v2df)__a);		__m128d __c = __builtin_ia32_cmpnlesd((__v2df)__b, (__v2df)__a);
return (__m128d) { __c[0], __a[1] };		return (__m128d) { __c[0], __a[1] };
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] for equality. The		/// the two 128-bit floating-point vectors of [2 x double] for equality.
/// comparison yields 0 for false, 1 for true.		///
		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comieq_sd(__m128d __a, __m128d __b)		_mm_comieq_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdeq((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdeq((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than the corresponding value in		/// the value in the first parameter is less than the corresponding value in
/// the second parameter.		/// the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true.		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comilt_sd(__m128d __a, __m128d __b)		_mm_comilt_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdlt((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdlt((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than or equal to the		/// the value in the first parameter is less than or equal to the
/// corresponding value in the second parameter.		/// corresponding value in the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true.		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comile_sd(__m128d __a, __m128d __b)		_mm_comile_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdle((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdle((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than the corresponding value		/// the value in the first parameter is greater than the corresponding value
/// in the second parameter.		/// in the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true.		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comigt_sd(__m128d __a, __m128d __b)		_mm_comigt_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdgt((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdgt((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is greater than or equal to the		/// the value in the first parameter is greater than or equal to the
/// corresponding value in the second parameter.		/// corresponding value in the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true.		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comige_sd(__m128d __a, __m128d __b)		_mm_comige_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdge((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdge((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is unequal to the corresponding value in		/// the value in the first parameter is unequal to the corresponding value in
/// the second parameter.		/// the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true.		/// The comparison yields 0 for false, 1 for true. If either of the two
		/// lower double-precision values is NaN, 1 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.		/// This intrinsic corresponds to the <c> VCOMISD / COMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results.		/// \returns An integer containing the comparison results. If either of the two
		/// lower double-precision values is NaN, 1 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_comineq_sd(__m128d __a, __m128d __b)		_mm_comineq_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_comisdneq((__v2df)__a, (__v2df)__b);		return __builtin_ia32_comisdneq((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] for equality. The		/// the two 128-bit floating-point vectors of [2 x double] for equality. The
/// comparison yields 0 for false, 1 for true.		/// comparison yields 0 for false, 1 for true.
///		///
/// If either of the two lower double-precision values is NaN, 1 is returned.		/// If either of the two lower double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.		/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results. If either of the two		/// \returns An integer containing the comparison results. If either of the two
/// lower double-precision values is NaN, 1 is returned.		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_ucomieq_sd(__m128d __a, __m128d __b)		_mm_ucomieq_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_ucomisdeq((__v2df)__a, (__v2df)__b);		return __builtin_ia32_ucomisdeq((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than the corresponding value in		/// the value in the first parameter is less than the corresponding value in
/// the second parameter.		/// the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true. If either of the two lower		/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 1 is returned.		/// double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.		/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results. If either of the two		/// \returns An integer containing the comparison results. If either of the two
/// lower double-precision values is NaN, 1 is returned.		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_ucomilt_sd(__m128d __a, __m128d __b)		_mm_ucomilt_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_ucomisdlt((__v2df)__a, (__v2df)__b);		return __builtin_ia32_ucomisdlt((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is less than or equal to the		/// the value in the first parameter is less than or equal to the
/// corresponding value in the second parameter.		/// corresponding value in the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true. If either of the two lower		/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 1 is returned.		/// double-precision values is NaN, 0 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.		/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison results. If either of the two		/// \returns An integer containing the comparison results. If either of the two
/// lower double-precision values is NaN, 1 is returned.		/// lower double-precision values is NaN, 0 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_ucomile_sd(__m128d __a, __m128d __b)		_mm_ucomile_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_ucomisdle((__v2df)__a, (__v2df)__b);		return __builtin_ia32_ucomisdle((__v2df)__a, (__v2df)__b);
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
[... 48 lines not shown ...]
}		}

/// \brief Compares the lower double-precision floating-point values in each of		/// \brief Compares the lower double-precision floating-point values in each of
/// the two 128-bit floating-point vectors of [2 x double] to determine if		/// the two 128-bit floating-point vectors of [2 x double] to determine if
/// the value in the first parameter is unequal to the corresponding value in		/// the value in the first parameter is unequal to the corresponding value in
/// the second parameter.		/// the second parameter.
///		///
/// The comparison yields 0 for false, 1 for true. If either of the two lower		/// The comparison yields 0 for false, 1 for true. If either of the two lower
/// double-precision values is NaN, 0 is returned.		/// double-precision values is NaN, 1 is returned.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.		/// This intrinsic corresponds to the <c> VUCOMISD / UCOMISD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __b.		/// compared to the lower double-precision value of \a __b.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. The lower double-precision value is		/// A 128-bit vector of [2 x double]. The lower double-precision value is
/// compared to the lower double-precision value of \a __a.		/// compared to the lower double-precision value of \a __a.
/// \returns An integer containing the comparison result. If either of the two		/// \returns An integer containing the comparison result. If either of the two
/// lower double-precision values is NaN, 0 is returned.		/// lower double-precision values is NaN, 1 is returned.
static __inline__ int __DEFAULT_FN_ATTRS		static __inline__ int __DEFAULT_FN_ATTRS
_mm_ucomineq_sd(__m128d __a, __m128d __b)		_mm_ucomineq_sd(__m128d __a, __m128d __b)
{		{
return __builtin_ia32_ucomisdneq((__v2df)__a, (__v2df)__b);		return __builtin_ia32_ucomisdneq((__v2df)__a, (__v2df)__b);
}		}

/// \brief Converts the two double-precision floating-point elements of a		/// \brief Converts the two double-precision floating-point elements of a
/// 128-bit vector of [2 x double] into two single-precision floating-point		/// 128-bit vector of [2 x double] into two single-precision floating-point
[... 648 lines not shown ...]	_mm_store_pd(double *__dp, __m128d __a)
*(__m128d*)__dp = __a;		*(__m128d*)__dp = __a;
}		}

/// \brief Moves the lower 64 bits of a 128-bit vector of [2 x double] twice to		/// \brief Moves the lower 64 bits of a 128-bit vector of [2 x double] twice to
/// the upper and lower 64 bits of a memory location.		/// the upper and lower 64 bits of a memory location.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c>VMOVDDUP + VMOVAPD / MOVLHPS + MOVAPS </c> instruction.		/// This intrinsic corresponds to the
		/// <c> VMOVDDUP + VMOVAPD / MOVLHPS + MOVAPS </c> instruction.
///		///
/// \param __dp		/// \param __dp
/// A pointer to a memory location that can store two double-precision		/// A pointer to a memory location that can store two double-precision
/// values.		/// values.
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double] whose lower 64 bits are copied to each		/// A 128-bit vector of [2 x double] whose lower 64 bits are copied to each
/// of the values in \a dp.		/// of the values in \a __dp.
static __inline__ void __DEFAULT_FN_ATTRS		static __inline__ void __DEFAULT_FN_ATTRS
_mm_store1_pd(double *__dp, __m128d __a)		_mm_store1_pd(double *__dp, __m128d __a)
{		{
__a = __builtin_shufflevector((__v2df)__a, (__v2df)__a, 0, 0);		__a = __builtin_shufflevector((__v2df)__a, (__v2df)__a, 0, 0);
_mm_store_pd(__dp, __a);		_mm_store_pd(__dp, __a);
}		}

/// \brief Stores a 128-bit vector of [2 x double] into an aligned memory		/// \brief Moves the lower 64 bits of a 128-bit vector of [2 x double] twice to
/// location.		/// the upper and lower 64 bits of a memory location.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VMOVAPD / MOVAPD </c> instruction.		/// This intrinsic corresponds to the
		/// <c> VMOVDDUP + VMOVAPD / MOVLHPS + MOVAPS </c> instruction.
///		///
/// \param __dp		/// \param __dp
/// A pointer to a 128-bit memory location. The address of the memory		/// A pointer to a memory location that can store two double-precision
/// location has to be 16-byte aligned.		/// values.
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double] containing the values to be stored.		/// A 128-bit vector of [2 x double] whose lower 64 bits are copied to each
		/// of the values in \a __dp.
static __inline__ void __DEFAULT_FN_ATTRS		static __inline__ void __DEFAULT_FN_ATTRS
_mm_store_pd1(double *__dp, __m128d __a)		_mm_store_pd1(double *__dp, __m128d __a)
{		{
return _mm_store1_pd(__dp, __a);		return _mm_store1_pd(__dp, __a);
}		}

/// \brief Stores a 128-bit vector of [2 x double] into an unaligned memory		/// \brief Stores a 128-bit vector of [2 x double] into an unaligned memory
/// location.		/// location.
[... 1,868 lines not shown ...]	_mm_set1_epi8(char __b)
return (__m128i)(__v16qi){ __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b };		return (__m128i)(__v16qi){ __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b, __b };
}		}

/// \brief Constructs a 128-bit integer vector, initialized in reverse order		/// \brief Constructs a 128-bit integer vector, initialized in reverse order
/// with the specified 64-bit integral values.		/// with the specified 64-bit integral values.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VPUNPCKLQDQ / PUNPCKLQDQ </c>		/// This intrinsic does not correspond to a specific instruction.
		kromanova: It's better to use the same wording as many of the intrinsics before and after this one, just for consistency:
		    /// This intrinsic is a utility function and does not correspond to a specific
		    /// instruction.
/// instruction.
///		///
/// \param __q0		/// \param __q0
/// A 64-bit integral value used to initialize the lower 64 bits of the		/// A 64-bit integral value used to initialize the lower 64 bits of the
/// result.		/// result.
/// \param __q1		/// \param __q1
/// A 64-bit integral value used to initialize the upper 64 bits of the		/// A 64-bit integral value used to initialize the upper 64 bits of the
/// result.		/// result.
/// \returns An initialized 128-bit integer vector.		/// \returns An initialized 128-bit integer vector.
[... 154 lines not shown ...]	_mm_storeu_si128(__m128i *__p, __m128i __b)
} __attribute__((__packed__, __may_alias__));		} __attribute__((__packed__, __may_alias__));
((struct __storeu_si128*)__p)->__v = __b;		((struct __storeu_si128*)__p)->__v = __b;
}		}

/// \brief Moves bytes selected by the mask from the first operand to the		/// \brief Moves bytes selected by the mask from the first operand to the
/// specified unaligned memory location. When a mask bit is 1, the		/// specified unaligned memory location. When a mask bit is 1, the
/// corresponding byte is written, otherwise it is not written.		/// corresponding byte is written, otherwise it is not written.
///		///
/// To minimize caching, the date is flagged as non-temporal (unlikely to be		/// To minimize caching, the data is flagged as non-temporal (unlikely to be
/// used again soon). Exception and trap behavior for elements not selected		/// used again soon). Exception and trap behavior for elements not selected
/// for storage to memory are implementation dependent.		/// for storage to memory are implementation dependent.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VMASKMOVDQU / MASKMOVDQU </c>		/// This intrinsic corresponds to the <c> VMASKMOVDQU / MASKMOVDQU </c>
/// instruction.		/// instruction.
///		///
[... 497 lines not shown ...]
/// Bits [127:96] are written to bits [127:96] of the destination.		/// Bits [127:96] are written to bits [127:96] of the destination.
/// \returns A 128-bit vector of [4 x i32] containing the interleaved values.		/// \returns A 128-bit vector of [4 x i32] containing the interleaved values.
static __inline__ __m128i __DEFAULT_FN_ATTRS		static __inline__ __m128i __DEFAULT_FN_ATTRS
_mm_unpackhi_epi32(__m128i __a, __m128i __b)		_mm_unpackhi_epi32(__m128i __a, __m128i __b)
{		{
return (__m128i)__builtin_shufflevector((__v4si)__a, (__v4si)__b, 2, 4+2, 3, 4+3);		return (__m128i)__builtin_shufflevector((__v4si)__a, (__v4si)__b, 2, 4+2, 3, 4+3);
}		}

/// \brief Unpacks the high-order (odd-indexed) values from two 128-bit vectors		/// \brief Unpacks the high-order 64-bit elements from two 128-bit vectors of
/// of [2 x i64] and interleaves them into a 128-bit vector of [2 x i64].		/// [2 x i64] and interleaves them into a 128-bit vector of [2 x i64].
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VPUNPCKHQDQ / PUNPCKHQDQ </c>		/// This intrinsic corresponds to the <c> VPUNPCKHQDQ / PUNPCKHQDQ </c>
/// instruction.		/// instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x i64]. \n		/// A 128-bit vector of [2 x i64]. \n
[... 115 lines not shown ...]	_mm_unpacklo_epi64(__m128i __a, __m128i __b)
return (__m128i)__builtin_shufflevector((__v2di)__a, (__v2di)__b, 0, 2+0);		return (__m128i)__builtin_shufflevector((__v2di)__a, (__v2di)__b, 0, 2+0);
}		}

/// \brief Returns the lower 64 bits of a 128-bit integer vector as a 64-bit		/// \brief Returns the lower 64 bits of a 128-bit integer vector as a 64-bit
/// integer.		/// integer.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic has no corresponding instruction.		/// This intrinsic corresponds to the <c> MOVDQ2Q </c> instruction.
		kromanova: I'm not sure about this change. The Intel documentation says this intrinsic generates MOVDQ2Q (I don't have icc handy to try). However, I tried clang and gcc on Linux/x86_64, and we just return.
		kromanova: Though I suspect it's possible to generate movdq2q, I couldn't come up with a test that triggers this instruction. Should we revert this change?
		    __m64 fooepi64_pi64 (__m128i a, __m128 c) {
		      __m64 x;
		      x = _mm_movepi64_pi64 (a);
		      return x;
		    }
		On Linux we generate only a return instruction. I would expect (v)movq %xmm0,%rax to be generated instead of retq. Am I missing something? Why do we return a 64-bit integer in an xmm register rather than in %rax?
		efriedma: The x86-64 calling convention rules say that __m64 is passed/returned in SSE registers. Try the following, which generates movdq2q:
		    __m64 foo(__m128i a, __m128 c) {
		      return _mm_add_pi8(_mm_movepi64_pi64(a), _mm_set1_pi8(5));
		    }
		kromanova: Thanks! That explains it :) I can see that MOVDQ2Q gets generated. What about the intrinsic below, _mm_movpi64_epi64? Can we ever generate MOVD+VMOVQ as stated in the review? Or should we write VMOVQ / MOVQ?
		efriedma: Testcase:
		    #include <immintrin.h>
		    __m128 foo(__m128i a, __m128 c) {
		      return _mm_movpi64_epi64(_mm_add_pi8(_mm_movepi64_pi64(a), _mm_set1_pi8(5)));
		    }
		In this case, we should generate movq2dq, but currently don't (I assume due to a missing DAGCombine). I don't see how you could ever get MOVD... but see https://reviews.llvm.org/rL321898, which could be the source of some confusion.
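For readers following this thread, the value-level behaviour of `_mm_movepi64_pi64` can be checked independently of which instruction the compiler happens to select. The following is only a minimal sketch: the helper name `low64_via_m64` is hypothetical (it is not part of the patch), the `memcpy` is just one portable way to read the `__m64` result, and it assumes an x86 target with SSE2 and MMX available.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the MMX __m64 type */
#include <string.h>     /* memcpy */

/* Hypothetical helper (not from the patch): return the lower 64 bits of v
 * by going through _mm_movepi64_pi64. */
static long long low64_via_m64(__m128i v)
{
    __m64 lo = _mm_movepi64_pi64(v);  /* take the lower 64 bits as an __m64 */
    long long out;
    memcpy(&out, &lo, sizeof out);    /* read the 8 bytes without MMX-only helpers */
    _mm_empty();                      /* leave MMX state clean for later x87 code */
    return out;
}
```

Calling `low64_via_m64(_mm_set_epi64x(hi, lo))` should give back `lo` whether the compiler emits MOVDQ2Q, a plain MOVQ, or folds the move away entirely, which is why the instruction mapping in the comment is hard to confirm from a trivial test.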
///		///
/// \param __a		/// \param __a
/// A 128-bit integer vector operand. The lower 64 bits are moved to the		/// A 128-bit integer vector operand. The lower 64 bits are moved to the
/// destination.		/// destination.
/// \returns A 64-bit integer containing the lower 64 bits of the parameter.		/// \returns A 64-bit integer containing the lower 64 bits of the parameter.
static __inline__ __m64 __DEFAULT_FN_ATTRS		static __inline__ __m64 __DEFAULT_FN_ATTRS
_mm_movepi64_pi64(__m128i __a)		_mm_movepi64_pi64(__m128i __a)
{		{
return (__m64)__a[0];		return (__m64)__a[0];
}		}

/// \brief Moves the 64-bit operand to a 128-bit integer vector, zeroing the		/// \brief Moves the 64-bit operand to a 128-bit integer vector, zeroing the
/// upper bits.		/// upper bits.
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VMOVQ / MOVQ / MOVD </c> instruction.		/// This intrinsic corresponds to the <c> MOVD+VMOVQ </c> instruction.
		kromanova: On Linux x86_64 I can only generate VMOVQ (or MOVQ) for the AVX and non-AVX cases respectively. Can we even generate MOVD+VMOVQ? How do we want to document this intrinsic? I have a similar question to the one above.
		    __m128i foopi64_epi64 (__m64 a) {
		      __m128i x;
		      x = _mm_movpi64_epi64 (a);
		      return x;
		    }
		Why do we generate this code
		    vmovq %xmm0, %rax
		    vmovq %rax, %xmm0
		    retq
		instead of something simple like vmovq %rdi, %xmm0?
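Whatever instruction sequence is selected, the documented contract of `_mm_movpi64_epi64` (lower 64 bits copied from the operand, upper 64 bits zeroed) can be checked directly. This is a rough sketch under stated assumptions: the helper name `widen_m64` is hypothetical, the `__m64` input is built with `memcpy` purely for illustration, and an x86 target with SSE2 and MMX is assumed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the MMX __m64 type */
#include <string.h>     /* memcpy */

/* Hypothetical helper (not from the patch): widen a 64-bit value through
 * _mm_movpi64_epi64 and store the result as out[0] (low) and out[1] (high). */
static void widen_m64(long long value, long long out[2])
{
    __m64 narrow;
    memcpy(&narrow, &value, sizeof narrow);    /* build the __m64 operand */
    __m128i wide = _mm_movpi64_epi64(narrow);  /* upper 64 bits should be zero */
    _mm_storeu_si128((__m128i *)out, wide);    /* unaligned store of both halves */
    _mm_empty();                               /* leave MMX state clean */
}
```

If `out` is pre-filled with nonzero values before the call, `out[1]` reading back as zero confirms the zeroing described in the `\returns` text, regardless of whether MOVQ, VMOVQ, or MOVQ2DQ was used.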
///		///
/// \param __a		/// \param __a
/// A 64-bit value.		/// A 64-bit value.
/// \returns A 128-bit integer vector. The lower 64 bits contain the value from		/// \returns A 128-bit integer vector. The lower 64 bits contain the value from
/// the operand. The upper 64 bits are assigned zeros.		/// the operand. The upper 64 bits are assigned zeros.
static __inline__ __m128i __DEFAULT_FN_ATTRS		static __inline__ __m128i __DEFAULT_FN_ATTRS
_mm_movpi64_epi64(__m64 __a)		_mm_movpi64_epi64(__m64 __a)
{		{
[... 13 lines not shown ...]
/// \returns A 128-bit integer vector. The lower 64 bits contain the value from		/// \returns A 128-bit integer vector. The lower 64 bits contain the value from
/// the operand. The upper 64 bits are assigned zeros.		/// the operand. The upper 64 bits are assigned zeros.
static __inline__ __m128i __DEFAULT_FN_ATTRS		static __inline__ __m128i __DEFAULT_FN_ATTRS
_mm_move_epi64(__m128i __a)		_mm_move_epi64(__m128i __a)
{		{
return __builtin_shufflevector((__v2di)__a, (__m128i){ 0 }, 0, 2);		return __builtin_shufflevector((__v2di)__a, (__m128i){ 0 }, 0, 2);
}		}

/// \brief Unpacks the high-order (odd-indexed) values from two 128-bit vectors		/// \brief Unpacks the high-order 64-bit elements from two 128-bit vectors of
/// of [2 x double] and interleaves them into a 128-bit vector of [2 x		/// [2 x double] and interleaves them into a 128-bit vector of [2 x
/// double].		/// double].
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUNPCKHPD / UNPCKHPD </c> instruction.		/// This intrinsic corresponds to the <c> VUNPCKHPD / UNPCKHPD </c> instruction.
///		///
/// \param __a		/// \param __a
/// A 128-bit vector of [2 x double]. \n		/// A 128-bit vector of [2 x double]. \n
/// Bits [127:64] are written to bits [63:0] of the destination.		/// Bits [127:64] are written to bits [63:0] of the destination.
/// \param __b		/// \param __b
/// A 128-bit vector of [2 x double]. \n		/// A 128-bit vector of [2 x double]. \n
/// Bits [127:64] are written to bits [127:64] of the destination.		/// Bits [127:64] are written to bits [127:64] of the destination.
/// \returns A 128-bit vector of [2 x double] containing the interleaved values.		/// \returns A 128-bit vector of [2 x double] containing the interleaved values.
static __inline__ __m128d __DEFAULT_FN_ATTRS		static __inline__ __m128d __DEFAULT_FN_ATTRS
_mm_unpackhi_pd(__m128d __a, __m128d __b)		_mm_unpackhi_pd(__m128d __a, __m128d __b)
{		{
return __builtin_shufflevector((__v2df)__a, (__v2df)__b, 1, 2+1);		return __builtin_shufflevector((__v2df)__a, (__v2df)__b, 1, 2+1);
}		}

/// \brief Unpacks the low-order (even-indexed) values from two 128-bit vectors		/// \brief Unpacks the low-order 64-bit elements from two 128-bit vectors
/// of [2 x double] and interleaves them into a 128-bit vector of [2 x		/// of [2 x double] and interleaves them into a 128-bit vector of [2 x
/// double].		/// double].
///		///
/// \headerfile <x86intrin.h>		/// \headerfile <x86intrin.h>
///		///
/// This intrinsic corresponds to the <c> VUNPCKLPD / UNPCKLPD </c> instruction.		/// This intrinsic corresponds to the <c> VUNPCKLPD / UNPCKLPD </c> instruction.
///		///
/// \param __a		/// \param __a
[... 42 lines not shown ...]
/// This intrinsic corresponds to the <c> VSHUFPD / SHUFPD </c> instruction.		/// This intrinsic corresponds to the <c> VSHUFPD / SHUFPD </c> instruction.
///		///
/// \param a		/// \param a
/// A 128-bit vector of [2 x double].		/// A 128-bit vector of [2 x double].
/// \param b		/// \param b
/// A 128-bit vector of [2 x double].		/// A 128-bit vector of [2 x double].
/// \param i		/// \param i
/// An 8-bit immediate value. The least significant two bits specify which		/// An 8-bit immediate value. The least significant two bits specify which
/// elements to copy from a and b: \n		/// elements to copy from \a a and \a b: \n
/// Bit[0] = 0: lower element of a copied to lower element of result. \n		/// Bit[0] = 0: lower element of \a a copied to lower element of result. \n
/// Bit[0] = 1: upper element of a copied to lower element of result. \n		/// Bit[0] = 1: upper element of \a a copied to lower element of result. \n
/// Bit[1] = 0: lower element of \a b copied to upper element of result. \n		/// Bit[1] = 0: lower element of \a b copied to upper element of result. \n
/// Bit[1] = 1: upper element of \a b copied to upper element of result. \n		/// Bit[1] = 1: upper element of \a b copied to upper element of result. \n
/// \returns A 128-bit vector of [2 x double] containing the shuffled values.		/// \returns A 128-bit vector of [2 x double] containing the shuffled values.
#define _mm_shuffle_pd(a, b, i) __extension__ ({ \		#define _mm_shuffle_pd(a, b, i) __extension__ ({ \
(__m128d)__builtin_shufflevector((__v2df)(__m128d)(a), (__v2df)(__m128d)(b), \		(__m128d)__builtin_shufflevector((__v2df)(__m128d)(a), (__v2df)(__m128d)(b), \
0 + (((i) >> 0) & 0x1), \		0 + (((i) >> 0) & 0x1), \
2 + (((i) >> 1) & 0x1)); })		2 + (((i) >> 1) & 0x1)); })

[... 131 lines not shown ...]