This is an archive of the discontinued LLVM Phabricator instance.

[X86] Update fast-isel tests for changes from D48346.
ClosedPublic

Authored by craig.topper on Jun 19 2018, 10:28 PM.

Details

Summary

The new IR fixes a mismatch in the final extractelement for the i32 intrinsics. Previously we extracted a 64-bit element even though we only wanted 32 bits.
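To make the mismatch concrete, here is a minimal sketch in portable C. It models a 128-bit vector as four 32-bit lanes. The type and helper names are hypothetical and are not taken from the patch or from LLVM. It assumes the x86 little-endian lane layout, where 32-bit lanes 0 and 1 together form 64-bit element 0.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector viewed as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } v4i32;

/* Extract 32-bit element 0: the value the i32 intrinsics actually want. */
static uint32_t extract_i32_elt0(v4i32 v) {
    return v.lane[0];
}

/* Extract 64-bit element 0 of the same bits viewed as two 64-bit lanes.
   Assumption: little-endian layout as on x86, so 32-bit lane 0 is the
   low half and 32-bit lane 1 is the high half of the 64-bit element. */
static uint64_t extract_i64_elt0(v4i32 v) {
    return (uint64_t)v.lane[0] | ((uint64_t)v.lane[1] << 32);
}
```

Truncating the 64-bit element to 32 bits yields the same value as the 32-bit extract, but the wider extract also reads lane 1, which the result never uses.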

SimplifyDemandedElts isn't able to make FP elements undef now, and the shuffle mask I used prevents the use of the horizontal add we had before. I'm not sure we should have been using horizontal add anyway: it's implemented on Intel with two port 5 shuffles and an add. So we have one fewer shuffle now, but an additional instruction to decode.
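The two ways of finishing a two-lane FP reduction can be sketched in portable C as follows. This only models lane semantics. The helper names are hypothetical, and the shuffle shown here is an assumed stand-in, not necessarily the mask used in the patch.

```c
/* Hypothetical model of a 128-bit vector holding two doubles. */
typedef struct { double lane[2]; } v2f64;

/* Models an HADDPD-style horizontal add: lane 0 sums the pair from a,
   lane 1 sums the pair from b. */
static v2f64 hadd(v2f64 a, v2f64 b) {
    v2f64 r;
    r.lane[0] = a.lane[0] + a.lane[1];
    r.lane[1] = b.lane[0] + b.lane[1];
    return r;
}

/* Assumed shuffle: copy the high lane into the low lane. */
static v2f64 shuffle_high_to_low(v2f64 v) {
    v2f64 r;
    r.lane[0] = v.lane[1];
    r.lane[1] = v.lane[1];
    return r;
}

/* Plain lane-wise vector add. */
static v2f64 add(v2f64 a, v2f64 b) {
    v2f64 r;
    r.lane[0] = a.lane[0] + b.lane[0];
    r.lane[1] = a.lane[1] + b.lane[1];
    return r;
}

/* Final reduction step using one horizontal add. */
static double reduce_with_hadd(v2f64 v) {
    return hadd(v, v).lane[0];
}

/* Final reduction step using one explicit shuffle plus one add. */
static double reduce_with_shuffle_add(v2f64 v) {
    return add(v, shuffle_high_to_low(v)).lane[0];
}
```

Both functions return the same sum in lane 0. The difference described above is only in how the work is issued: one instruction that internally performs two shuffles and an add, versus two instructions that perform one shuffle and one add.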

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Jun 19 2018, 10:28 PM
RKSimon accepted this revision. Jun 20 2018, 3:56 AM

LGTM - avx512-reduceMinMaxIntrin.c codegen could be cleaned up a bit to match these better - we don't typically include all the alloca etc. from -O0 code

This revision is now accepted and ready to land. Jun 20 2018, 3:56 AM
This revision was automatically updated to reflect the committed changes.