MOVMSK only cares about the sign bit, so we don't need the setcc to fill the whole element with 0s/1s. We can just shift the bit we're looking for into the sign bit. This saves a constant pool load.
Inspired by PR38840.
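To illustrate why the fold is safe, here is a minimal scalar sketch in portable C++. It is not the LLVM implementation: the `V4` type and the helper names `movmsk`, `movmsk_setne_and`, `movmsk_shift`, and `fold_is_equivalent` are hypothetical and exist only for this example. It assumes 32-bit lanes and reads the shift in the title as "move the tested bit into bit 31", i.e. a left shift by 31 minus the bit index.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 4 x i32 vector register.
using V4 = std::array<std::uint32_t, 4>;

// Scalar model of MOVMSK: collect the sign bit of each lane into bits 0..3.
unsigned movmsk(const V4& v) {
    unsigned mask = 0;
    for (unsigned i = 0; i < v.size(); ++i)
        mask |= (v[i] >> 31) << i;
    return mask;
}

// Original pattern: movmsk(setne(and(X, 1 << bit), 0)).
// The setcc fills each lane with all ones or all zeros.
unsigned movmsk_setne_and(const V4& x, unsigned bit) {
    V4 cmp{};
    for (unsigned i = 0; i < x.size(); ++i)
        cmp[i] = (x[i] & (1u << bit)) != 0 ? 0xFFFFFFFFu : 0u;
    return movmsk(cmp);
}

// Folded pattern: shift the tested bit into the sign bit, then movmsk.
// Assumption: for 32-bit lanes the shift amount is 31 - bit.
unsigned movmsk_shift(const V4& x, unsigned bit) {
    V4 shifted{};
    for (unsigned i = 0; i < x.size(); ++i)
        shifted[i] = x[i] << (31u - bit);
    return movmsk(shifted);
}

// Compare both forms for every bit position of a given input vector.
bool fold_is_equivalent(const V4& x) {
    for (unsigned bit = 0; bit < 32; ++bit)
        if (movmsk_setne_and(x, bit) != movmsk_shift(x, bit))
            return false;
    return true;
}
```

In this model the shifted form needs neither the AND mask constant nor the compare, which is where the constant pool load in the original pattern would come from.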
Differential D52121
[X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C))
Closed. Public. Authored by craig.topper on Sep 14 2018, 1:19 PM.
Event Timeline
lebedev.ri added inline comments.
Sorry - that was supposed to be for D52109
LGTM with one minor
This revision is now accepted and ready to land. Sep 15 2018, 3:51 AM
Closed by commit rL342326: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) (authored by ctopper). Sep 15 2018, 9:24 AM
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 165584 test/CodeGen/X86/movmsk-cmp.ll
Are these AND masks necessary? We're only using the sign bit.