This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine][x86] Constant fold psll intrinsics.
ClosedPublic

Authored by Bigcheese on Apr 11 2014, 11:24 AM.

Details

Summary

The psll{,i}{w,d,q} instruction is almost a vector shl; however, it has defined
behavior of evaluating to 0 for shifts greater than or equal to the bitwidth of
the elements. We can’t currently represent this directly in LLVM without
generating extra code, but we can handle the constant case.
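
To illustrate the constant case described above, here is a minimal sketch of the per-lane semantics, not the code from the patch. The helper name `foldPsllLane` and its signature are hypothetical; it assumes `T` is the unsigned lane type (`uint16_t`, `uint32_t`, or `uint64_t` for w, d, q) and models only a single lane.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: models one lane of a packed
// shift-left (psllw/pslld/psllq) with a constant shift count.
// T is assumed to be the unsigned lane type.
template <typename T>
constexpr T foldPsllLane(T lane, std::uint64_t count) {
    // A count at or beyond the lane width is defined to produce 0,
    // unlike a plain IR shl, whose result would not be usable here.
    return count >= sizeof(T) * 8
               ? T{0}
               // Widen first so narrow lanes never shift as a signed int.
               : static_cast<T>(static_cast<std::uint64_t>(lane) << count);
}
```

A fold along these lines would apply the same function to every lane of the constant vector; in-range counts behave exactly like an ordinary shl.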

This excludes avx512 as I don't have hardware to verify. It excludes _dq
variants because they are represented in the IR as <{2,4} x i64> when they are
actually byte shifts of the entire i{128,256}.

This also excludes _dq_bs as they aren't at all supported by the backend.
There are also no corresponding instructions in the ISA. I have no idea why
they exist...

Diff Detail

Event Timeline

Seems reasonable. Shouldn't we do the same thing for psllw, psllq, vpsll[wdq], though?

I agree with Jim.
You can also do something similar to simplify packed logical shift right instructions (psrlw/psrld/psrlq/vpsrlw/vpsrld/vpsrlq).
SSE2/AVX2 packed logical shift right instructions also evaluate to 0 if the shift count is greater than or equal to the element size.
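
As a rough sketch of the same idea for the logical right shifts mentioned above (again a hypothetical helper, `foldPsrlLane`, not anything from the patch, and assuming `T` is the unsigned lane type):

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code: models one lane of a packed
// logical shift-right (psrlw/psrld/psrlq) with a constant shift count.
// T is assumed to be the unsigned lane type.
template <typename T>
constexpr T foldPsrlLane(T lane, std::uint64_t count) {
    // A count greater than or equal to the element size evaluates to 0.
    return count >= sizeof(T) * 8
               ? T{0}
               // Unsigned lane, so this is a logical (zero-filling) shift.
               : static_cast<T>(lane >> count);
}
```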

Bigcheese planned changes to this revision. Apr 11 2014, 12:12 PM

Yes, I'll add the others. Seems reasonable to have them all in the same commit.

Bigcheese updated this revision to Unknown Object (????). Apr 15 2014, 5:48 PM

Excellent.

Nadav commented on a separate patch that he's interested in these sorts of things being target DAG combines rather than InstCombines. Might want to check with him to get a bit more info on his thoughts about that.

Bigcheese accepted this revision. Apr 23 2014, 6:09 PM
Bigcheese added a reviewer: Bigcheese.
This revision is now accepted and ready to land. Apr 23 2014, 6:09 PM
Bigcheese closed this revision. Apr 23 2014, 6:10 PM

Committed as r207058. Approved by Nadav.