This is an archive of the discontinued LLVM Phabricator instance.

[x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics
ClosedPublic

Authored by spatel on Feb 20 2016, 4:42 PM.

Details

Summary

The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant:

void mstore_zero_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}

void mstore_fake_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}

void mstore_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}

void mstore_one_set_elt_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}

...so none of the above will actually generate a masked store for optimized code.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 48602.Feb 20 2016, 4:42 PM
spatel retitled this revision from to [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics.
spatel updated this object.
spatel added reviewers: RKSimon, delena, craig.topper.
spatel added a subscriber: llvm-commits.
RKSimon edited edge metadata.Feb 20 2016, 4:58 PM

Couple of minor queries.

lib/Transforms/InstCombine/InstCombineCalls.cpp
824 ↗(On Diff #48602)

Not a priority but can the blendv instcombines share this helper (or something very similar)?

1609 ↗(On Diff #48602)

If this returns nullptr shouldn't the case break?

spatel added inline comments.Feb 21 2016, 8:23 AM
lib/Transforms/InstCombine/InstCombineCalls.cpp
824 ↗(On Diff #48602)

Yes - I'll do that as a precursor or follow-up patch.

1609 ↗(On Diff #48602)

This is interesting - simplifyX86MaskedStore() *always* returns nullptr because eraseInstFromFunction() always returns nullptr. I'll change it to return a bool to indicate if a change was made.

spatel updated this revision to Diff 48627.Feb 21 2016, 9:47 AM
spatel edited edge metadata.

Patch updated:

  1. I checked in the sign bit -> bool helper function in http://reviews.llvm.org/rL261483
  2. Updated simplifyX86MaskedStore() to return 'true' when it makes a change; this allows the caller's 'case' to break and try other optimizations.

I'm happy with this - does anyone else have any comments?

delena added inline comments.Feb 24 2016, 11:26 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
854 ↗(On Diff #48627)

I think you lost address space here.
I know that address space is not handled properly in masked operations and there is an open discussion in LLVM-dev right now.
I think it will be resolved sooner or later. I suggest to keep the original address space anyway.

spatel updated this revision to Diff 49067.Feb 25 2016, 8:09 AM

Patch updated:
Fixed to propagate address space of original pointer argument.

spatel marked an inline comment as done.Feb 25 2016, 8:11 AM
spatel added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
855 ↗(On Diff #49067)

Thanks - I didn't think it was possible to specify the address space with the AVX intrinsic, but we should do the right thing here anyway.

delena accepted this revision.Feb 26 2016, 11:26 AM
delena edited edge metadata.

LGTM

This revision is now accepted and ready to land.Feb 26 2016, 11:26 AM
This revision was automatically updated to reflect the committed changes.
spatel marked an inline comment as done.