This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Update _mm256_loadu2_m128* intrinsics to use _mm256_set_m128* (PR51796)
ClosedPublic

Authored by RKSimon on Sep 9 2021, 3:40 AM.

Details

Summary

As reported on PR51796, _mm256_loadu2_m128i in particular was inserting bitcasts and shuffles with different types, which made some combines trickier and prevented the value tracker from identifying the shuffle sequence as a single insert_subvector-style concat_vectors pattern.

This patch instead concatenates the 128-bit unaligned loads with _mm256_set_m128*, which was written to avoid the unnecessary bitcasts and emits only a single shuffle.
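
To illustrate the shape of the change, here is a minimal sketch of the integer variant written as two unaligned 128-bit loads concatenated with _mm256_set_m128i. The names sketch_loadu2_m128i, sketch_halves_match and sketch_self_test are hypothetical and exist only for this example, as do the const void * parameters; the sketch is not the exact text of the header change. The target("avx") attribute is an assumption made so the snippet builds without -mavx on GCC or Clang, and it needs an x86 host with AVX to run.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _mm256_loadu2_m128i: load both halves
   unaligned, then concatenate them with _mm256_set_m128i(hi, lo). */
__attribute__((target("avx")))
static __m256i sketch_loadu2_m128i(const void *hi, const void *lo)
{
    return _mm256_set_m128i(_mm_loadu_si128((const __m128i *)hi),
                            _mm_loadu_si128((const __m128i *)lo));
}

/* Returns 1 when the low 16 bytes of the result come from `lo` and the
   high 16 bytes come from `hi`, 0 otherwise. */
__attribute__((target("avx")))
static int sketch_halves_match(void)
{
    uint8_t lo[16], hi[16], out[32];
    for (int i = 0; i < 16; ++i) {
        lo[i] = (uint8_t)i;
        hi[i] = (uint8_t)(0x80 + i);
    }
    __m256i v = sketch_loadu2_m128i(hi, lo);
    _mm256_storeu_si256((__m256i *)out, v);
    return memcmp(out, lo, 16) == 0 && memcmp(out + 16, hi, 16) == 0;
}

/* Convenience wrapper that aborts if the halves are misplaced. */
static void sketch_self_test(void)
{
    assert(sketch_halves_match());
}
```

Calling sketch_self_test() from main is enough to confirm that the first argument lands in the upper 128 bits and the second in the lower 128 bits, which is the concatenation order the summary describes.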

Event Timeline

RKSimon requested review of this revision. Sep 9 2021, 3:40 AM
RKSimon created this revision.
Herald added a project: Restricted Project. Sep 9 2021, 3:40 AM
This revision is now accepted and ready to land. Sep 9 2021, 9:28 AM
This revision was landed with ongoing or failed builds. Sep 9 2021, 11:20 AM
This revision was automatically updated to reflect the committed changes.