This is an archive of the discontinued LLVM Phabricator instance.

[X86] Make the STTNI flag intrinsics use the flags from pcmpestrm/pcmpistrm if the mask intrinsics are also used in the same basic block.
ClosedPublic

Authored by craig.topper on Apr 27 2018, 11:28 AM.

Details

Summary

Previously the flag intrinsics always used the index instructions even if a mask instruction also exists.

To fix this, I've created a single ISD node type that returns index, mask, and flags. The SelectionDAG CSE process will merge all flavors of intrinsics with the same inputs into a single node. Then during isel we just have to look at which results are used to know which instruction to generate. If both mask and index are used we'll need to emit two instructions, but in all other cases we can emit a single instruction.
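The selection rule described above can be sketched as a small standalone C++ model. This is an illustration only, not the code from the patch: `ResultUses` and `selectSTTNI` are hypothetical names, the mnemonics are plain strings rather than real opcodes, and only the implicit-length (pcmpistr*) flavor is shown. The choice of the index instruction when only the flags are used is an assumption based on the previous behavior described in this summary.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for "which results of the merged node have uses".
struct ResultUses {
  bool index; // index result, produced by the index instruction
  bool mask;  // mask result, produced by the mask instruction
  bool flags; // flags result, produced by either instruction
};

// Returns the mnemonics a selector following the described rule would emit.
// Hypothetical helper; not part of LLVM.
std::vector<std::string> selectSTTNI(const ResultUses &Uses) {
  std::vector<std::string> Insts;
  // The mask can only come from the mask instruction.
  if (Uses.mask)
    Insts.push_back("pcmpistrm");
  // The index can only come from the index instruction. If only the flags
  // are used, assume the index instruction is picked, as before the patch.
  if (Uses.index || (!Uses.mask && Uses.flags))
    Insts.push_back("pcmpistri");
  return Insts;
}
```

With mask and flags used, the model yields a single pcmpistrm whose flags are reused; with both mask and index used, it yields two instructions.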

Since I had to do manual isel anyway, I've removed the pseudo instructions and custom inserter code that was working around tablegen limitations with multiple implicit defs.

I've also renamed the recently added sse42.ll test case to sttni.ll since it focuses on that subset of the sse4.2 instructions.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Apr 27 2018, 11:28 AM
chandlerc accepted this revision. Apr 27 2018, 11:43 AM

LGTM, nice!

This revision is now accepted and ready to land. Apr 27 2018, 11:43 AM

Don't fold non-temporal loads. Add a test case to make sure we don't fold the load when emitting mask and index at the same time. Move operand extraction into the helper methods so we don't have to pass so many parameters.

This revision was automatically updated to reflect the committed changes.
llvm/trunk/lib/Target/X86/X86InstrSSE.td