This is an archive of the discontinued LLVM Phabricator instance.

[Clang][AVX512][Intrinsics]Convert AVX non-temporal store builtins to LLVM-native IR.
ClosedPublic

Authored by m_zuckerman on May 18 2016, 4:33 AM.

Details

Reviewers
craig.topper

Diff Detail

Event Timeline

m_zuckerman retitled this revision to Convert AVX non-temporal store builtins to LLVM-native IR..
m_zuckerman updated this object.
m_zuckerman added a reviewer: craig.topper.
m_zuckerman added subscribers: delena, cfe-commits.
m_zuckerman retitled this revision from Convert AVX non-temporal store builtins to LLVM-native IR. to [Clang][AVX512][Intrinsics]Convert AVX non-temporal store builtins to LLVM-native IR.. May 18 2016, 4:36 AM
m_zuckerman added subscribers: AsafBadouh, igorb.
craig.topper accepted this revision. May 23 2016, 10:18 PM
craig.topper edited edge metadata.

LGTM

This revision is now accepted and ready to land. May 23 2016, 10:18 PM
RKSimon added a subscriber: RKSimon. Jun 1 2016, 3:05 PM

Is there any reason why we can't just get rid of all the SSE movnt builtins and use __builtin_nontemporal_store instead (D12313)?

ab added a subscriber: ab. Jun 1 2016, 3:14 PM

I wanted to suggest that too, but I think you'd have problems with the (natural?) alignment requirement of __builtin_nontemporal_store (whereas IIRC, movnti & friends accept unaligned pointers).

ab added a comment. Jun 1 2016, 3:16 PM
But now that I look at this again, I suppose we could have some __attribute__((aligned(1))), or something like r271214.
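
As a rough illustration of the reduced-alignment idea mentioned above, here is a minimal sketch of a typedef carrying `__attribute__((aligned(1)))` and a store through it. The names `unaligned_int` and `store_i32_unaligned` are hypothetical, and the sketch uses an ordinary store: whether __builtin_nontemporal_store would honor such a typedef is not confirmed anywhere in this thread.

```c
/* Hypothetical name: an int type whose required alignment is lowered to 1
 * (GCC/Clang extension; the aligned attribute on a typedef may decrease
 * alignment). */
typedef int unaligned_int __attribute__((aligned(1)));

/* Store a 32-bit value to an address that need not be 4-byte aligned.
 * This is a plain store, not a non-temporal one; it only shows how the
 * typedef tells the compiler not to assume natural alignment. */
static inline void store_i32_unaligned(void *p, int a) {
    *(unaligned_int *)p = a;
}
```

With this typedef the compiler may no longer assume 4-byte alignment for the access, which is the property the movnti-style intrinsics would need.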

True, luckily that only affects _mm_stream_si32 and _mm_stream_si64 - the 'real' vector movnt stores all require type alignment. The _mm_stream_load_* (movntdqa) load cases should be trivial as well.

I've created D21272, which covers the conversion of SSE/SSE2/AVX/AVX512 non-temporal aligned vector stores to use __builtin_nontemporal_store in the headers.

D21272 has now been committed, which I think removes the need for this patch. D20359 is still needed (with the additional tests requested by Craig).

Abandon this patch? We replaced the x86 vector non-temporal store builtins with __builtin_nontemporal_store directly in the headers.
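
For readers unfamiliar with the header-level approach described above, the following is a minimal sketch of what a wrapper built on __builtin_nontemporal_store could look like. It is not the code from D21272: the type `v4sf` and the function `stream_store_v4sf` are hypothetical names, a generic `vector_size` type stands in for the real x86 vector types, and compilers that lack the builtin fall back to an ordinary store.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0   /* compilers without __has_builtin */
#endif

/* Hypothetical stand-in for a 128-bit float vector type (GCC/Clang extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Sketch of a stream-store wrapper: the destination must be naturally
 * aligned for the vector type, matching the alignment requirement
 * discussed in the thread. */
static inline void stream_store_v4sf(v4sf *p, v4sf a) {
#if __has_builtin(__builtin_nontemporal_store)
    __builtin_nontemporal_store(a, p);   /* Clang: store with a non-temporal hint */
#else
    *p = a;                              /* fallback: ordinary aligned store */
#endif
}
```

Because the wrapper is expressed with a generic builtin rather than a target-specific one, no dedicated x86 builtin is needed for the aligned vector cases.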

m_zuckerman closed this revision. Jul 31 2017, 11:08 PM