This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X)
ClosedPublic

Authored by RKSimon on May 8 2017, 10:53 AM.

Details

Summary

Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign bit instead of using PSRAD+PSHUFD.

By avoiding bitcasts, this improves combines that utilize computeNumSignBits, permits memory folding, and reduces pipe pressure.

Although it does require a second register, given that this is a (cheap) zero register the impact is minimal.
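The equivalence behind this lowering can be checked with a scalar model of a single 64-bit lane. The following is a minimal sketch, not code from the patch: the helper names (ashr63, pcmpgtqZero, psradPshufd, lanesAgree) are hypothetical and only mimic what the instructions named above compute per lane.

```cpp
#include <array>
#include <cassert> // only needed for the usage shown after the block
#include <cstdint>

// One 64-bit lane of ASHR(X, 63): every bit becomes a copy of the sign bit.
// Unsigned arithmetic keeps the result well-defined in every C++ standard.
inline int64_t ashr63(int64_t X) {
  uint64_t Sign = static_cast<uint64_t>(X) >> 63; // 0 or 1
  return -static_cast<int64_t>(Sign);             // 0 or -1 (all ones)
}

// One 64-bit lane of PCMPGTQ(0, X): all ones if 0 > X (signed), else zero.
inline int64_t pcmpgtqZero(int64_t X) {
  return (0 > X) ? int64_t(-1) : int64_t(0);
}

// Model of the older sequence: PSRAD by 31 splats the sign across the high
// 32-bit half, then PSHUFD copies that half into both halves of the lane.
inline int64_t psradPshufd(int64_t X) {
  uint32_t Hi = static_cast<uint32_t>(static_cast<uint64_t>(X) >> 32);
  uint32_t HiSplat = (Hi >> 31) ? 0xFFFFFFFFu : 0u;              // PSRAD $31
  uint64_t R = (static_cast<uint64_t>(HiSplat) << 32) | HiSplat; // PSHUFD
  // Conversion is modular on mainstream compilers (guaranteed since C++20).
  return static_cast<int64_t>(R);
}

// True if all three models agree on every lane of a v2i64 value.
inline bool lanesAgree(const std::array<int64_t, 2> &V) {
  for (int64_t X : V)
    if (ashr63(X) != pcmpgtqZero(X) || ashr63(X) != psradPshufd(X))
      return false;
  return true;
}
```

For example, assert(lanesAgree(std::array<int64_t, 2>{{INT64_MIN, INT64_MAX}})) should hold, since each lane yields all ones for a negative input and zero otherwise.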

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. May 8 2017, 10:53 AM
delena added inline comments. May 8 2017, 11:16 AM
lib/Target/X86/X86ISelLowering.cpp
21748 ↗(On Diff #98187)

You don't need "else".
if VT is v4i64 you need AVX2 to emit the PCMPGT.

RKSimon added inline comments. May 8 2017, 11:22 AM
lib/Target/X86/X86ISelLowering.cpp
21748 ↗(On Diff #98187)

You don't need "else".

OK

if VT is v4i64 you need AVX2 to emit the PCMPGT.

This is handled by the callers to ArithmeticShiftRight64 further down - I can still add it if you wish?

delena added inline comments. May 8 2017, 11:33 AM
lib/Target/X86/X86ISelLowering.cpp
21748 ↗(On Diff #98187)

you can add assert()

21806 ↗(On Diff #98187)

The same code here ..

RKSimon updated this revision to Diff 98193. May 8 2017, 12:07 PM

Added v4i64/AVX2 assertion

RKSimon marked an inline comment as done. May 8 2017, 12:09 PM
RKSimon added inline comments.
lib/Target/X86/X86ISelLowering.cpp
21806 ↗(On Diff #98187)

I looked at merging the vXi64/vXi8 paths for this but it didn't seem worth it - AVX512 always has PSRAQ so just SSE/AVX can be done as a single line.
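The selection logic discussed in this thread can be summarized with a small standalone model. This is only a sketch under assumptions: VecTy, Features and chooseAshr63Lowering are hypothetical names, not LLVM's MVT, X86Subtarget or anything in X86ISelLowering.cpp, and the AVX512 case simply takes the "always has PSRAQ" remark above at face value.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; these are not LLVM types.
enum class VecTy { v2i64, v4i64 };
struct Features {
  bool SSE42;
  bool AVX2;
  bool AVX512;
};

// Returns the name of the instruction sequence this model would pick for
// ASHR(X, 63) on the given vector type.
inline std::string chooseAshr63Lowering(VecTy VT, const Features &F) {
  // Per the comment above, AVX512 targets have a native 64-bit
  // arithmetic shift, so no sign-splat trick is needed.
  if (F.AVX512)
    return "PSRAQ";

  if (VT == VecTy::v4i64) {
    // Callers are expected to guarantee AVX2 for 256-bit integer vectors;
    // the assertion documents that precondition.
    assert(F.AVX2 && "v4i64 PCMPGTQ requires AVX2");
    return "PCMPGTQ";
  }

  // v2i64: compare against zero when SSE42 is available, otherwise fall
  // back to the older shift-and-shuffle sequence.
  return F.SSE42 ? "PCMPGTQ" : "PSRAD+PSHUFD";
}
```

In this model, chooseAshr63Lowering(VecTy::v2i64, Features{true, false, false}) picks "PCMPGTQ", while the same type without SSE42 falls back to "PSRAD+PSHUFD".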

delena accepted this revision. May 9 2017, 5:24 AM
This revision is now accepted and ready to land. May 9 2017, 5:24 AM
This revision was automatically updated to reflect the committed changes.