This is an archive of the discontinued LLVM Phabricator instance.

[X86] Improve lowering of v2i64 sign bit tests on pre-sse4.2 targets
Closed · Public

Authored by craig.topper on Jan 6 2020, 1:17 PM.

Details

Summary

Without SSE4.2 (and therefore without pcmpgtq), a v2i64 setlt has to be expanded into a pcmpgtd, a pcmpeqd, 3 shuffles, and 2 logic ops. But if we're only interested in the sign bit of the i64 elements, we can use a single pcmpgtd and then shuffle the odd 32-bit elements into the even elements.
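The trick described in the summary can be sketched with a scalar model of the two instructions. This is an illustration only, not the code from the patch: the type aliases and the helper names (`bitcastToV4i32`, `pcmpgtd`, `pshufdOddToEven`, `signBitTest`) are hypothetical and simply emulate the lane semantics in portable C++.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of 128-bit vectors; not LLVM code.
using V4i32 = std::array<int32_t, 4>;
using V2i64 = std::array<int64_t, 2>;

// Reinterpret two 64-bit lanes as four 32-bit lanes (low half first).
V4i32 bitcastToV4i32(const V2i64 &v) {
  V4i32 r{};
  for (int i = 0; i < 2; ++i) {
    const uint64_t bits = static_cast<uint64_t>(v[i]);
    r[2 * i] = static_cast<int32_t>(static_cast<uint32_t>(bits));
    r[2 * i + 1] = static_cast<int32_t>(static_cast<uint32_t>(bits >> 32));
  }
  return r;
}

// Reinterpret four 32-bit lanes as two 64-bit lanes.
V2i64 bitcastToV2i64(const V4i32 &v) {
  V2i64 r{};
  for (int i = 0; i < 2; ++i) {
    const uint64_t lo = static_cast<uint32_t>(v[2 * i]);
    const uint64_t hi = static_cast<uint32_t>(v[2 * i + 1]);
    r[i] = static_cast<int64_t>((hi << 32) | lo);
  }
  return r;
}

// Models PCMPGTD: each lane becomes all-ones if a > b (signed), else zero.
V4i32 pcmpgtd(const V4i32 &a, const V4i32 &b) {
  V4i32 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = a[i] > b[i] ? -1 : 0;
  return r;
}

// Models a PSHUFD with lane selection {1, 1, 3, 3}: the odd lane of each
// 64-bit element is copied over the even lane below it.
V4i32 pshufdOddToEven(const V4i32 &v) { return {v[1], v[1], v[3], v[3]}; }

// v2i64 "x < 0": only the upper 32-bit half of each element carries the
// sign, so one 32-bit compare plus one shuffle is enough.
V2i64 signBitTest(const V2i64 &x) {
  const V4i32 zero{};
  return bitcastToV2i64(pshufdOddToEven(pcmpgtd(zero, bitcastToV4i32(x))));
}
```

The shuffle matters because the compare on the low half alone can give the wrong answer: an element such as 0x00000000FFFFFFFF has a negative-looking low half but is a positive i64, and copying the result from the odd lane overwrites that stale value.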

Event Timeline

craig.topper created this revision. Jan 6 2020, 1:17 PM
Herald added a project: Restricted Project. Jan 6 2020, 1:17 PM
Herald added a subscriber: hiraditya.
craig.topper marked an inline comment as done. Jan 6 2020, 1:19 PM
craig.topper added inline comments.
llvm/test/CodeGen/X86/bitcast-vector-bool.ll
432–436

The new code is simple enough that SimplifyDemandedBits was able to see through it. The movmskb only needs the sign bits of its input, and packss doesn't alter sign bits, so it was able to prove the compare unnecessary.
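The demanded-bits argument above can be checked on a single lane. The following is a minimal sketch under assumptions: it models a 32-bit to 16-bit signed saturating pack and a sign-bit read, with hypothetical helper names, and does not reproduce the exact instruction sequence or lane widths from the test file.

```cpp
#include <cstdint>

// Models one lane of a signed saturating pack (32-bit to 16-bit).
// Saturation clamps the magnitude but never changes the sign.
int16_t packssLane(int32_t v) {
  if (v < INT16_MIN)
    return INT16_MIN;
  if (v > INT16_MAX)
    return INT16_MAX;
  return static_cast<int16_t>(v);
}

// Models one lane of PCMPGTD(0, v): all-ones if v is negative, else zero.
int32_t isNegativeLane(int32_t v) { return 0 > v ? -1 : 0; }

// Models what a MOVMSK-style instruction reads from a lane: the sign bit.
bool signBit(int16_t v) { return v < 0; }

// Sign bit observed when the compare is kept.
bool withCompare(int32_t v) { return signBit(packssLane(isNegativeLane(v))); }

// Sign bit observed when the compare is dropped and v is packed directly.
bool withoutCompare(int32_t v) { return signBit(packssLane(v)); }
```

Since both paths report the same bit for every input, a consumer that demands only sign bits cannot tell whether the compare was executed, which is the property the comment relies on.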

RKSimon added inline comments. Jan 6 2020, 2:16 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
21595

Emit PSHUFD directly?

21598

Would the invert case be better as a PSRAD xmm, 31 plus the shuffle?

craig.topper marked 2 inline comments as done. Jan 6 2020, 2:36 PM
craig.topper added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
21595

Why? The code below generates 3 regular shuffles just like this.

21598

I think PSRAD xmm, 31 is equivalent to the non-inverted case. I'm not sure which is better: XOR+PCMPGT or PSRAD. There were more execution resources available for pcmpgt than for psrad on SSE4.1-era CPUs like Penryn. But we might not have handled the XOR in 0 cycles.
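The equivalence claimed above (an arithmetic shift right by 31 gives the same lane value as comparing zero greater-than the input) can be sketched per lane. The helper names are hypothetical, and the shift is written with unsigned arithmetic so the model does not depend on how the compiler shifts negative signed values.

```cpp
#include <cstdint>

// Models one lane of PSRAD: arithmetic shift right, filling with the sign
// bit. Counts above 31 behave like 31, i.e. the lane becomes pure sign fill.
int32_t psradLane(int32_t v, unsigned count) {
  if (count > 31)
    count = 31;
  uint32_t shifted = static_cast<uint32_t>(v) >> count;
  if (v < 0 && count > 0)
    shifted |= ~(0xFFFFFFFFu >> count); // replicate the sign into vacated bits
  return static_cast<int32_t>(shifted);
}

// Models one lane of PCMPGTD(0, v): all-ones if v is negative, else zero.
int32_t pcmpgtZeroLane(int32_t v) { return 0 > v ? -1 : 0; }
```

Both helpers return -1 for negative inputs and 0 otherwise, so the choice between them is about execution resources and the cost of materializing the zero operand, not about the result.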

Add a better solution for the inverted case. Though I don't have a test case.

Just skip the invert case. I don't think it's as likely to occur. Other canonicalizations should have prevented it, I think.

Fix formatting issue.

RKSimon accepted this revision. Jan 7 2020, 1:07 AM

LGTM

This revision is now accepted and ready to land. Jan 7 2020, 1:07 AM