This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add an AND with 255 to the v16i8 LowerMUL path with AVX2, but not AVX512
Abandoned
Public

Authored by craig.topper on Nov 16 2018, 3:55 PM.

Details

Reviewers
RKSimon
spatel
Summary

This will coax the truncate lowering to emit an extract_subvector and a packuswb instead of an extract_subvector, 2 pshufbs, and a punpcklqdq. But don't do this if we have an AVX512 truncate instruction available.
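
To illustrate why the mask matters, here is a minimal scalar sketch in C++ (not code from this patch). packuswb saturates each signed 16-bit lane to [0, 255], so it only behaves like a plain truncate when the upper byte of every lane is already zero, which the AND with 255 guarantees. The function names (packus_lane, trunc_lane, mul_v16i8_via_pack, check_mask_makes_pack_exact) are hypothetical. The model zero-extends the inputs for simplicity, which may differ from the extension the real lowering uses; the low byte of the product is the same either way.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of one 16-bit lane of PACKUSWB: interpret the lane as a
// signed 16-bit value and saturate it to the unsigned range [0, 255].
inline std::uint8_t packus_lane(std::uint16_t lane) {
    int s = lane >= 0x8000 ? static_cast<int>(lane) - 0x10000
                           : static_cast<int>(lane);
    return static_cast<std::uint8_t>(std::clamp(s, 0, 255));
}

// Plain truncation: keep only the low byte of the lane.
inline std::uint8_t trunc_lane(std::uint16_t lane) {
    return static_cast<std::uint8_t>(lane & 0xFF);
}

// Hypothetical model of the v16i8 multiply: widen each byte to 16 bits,
// multiply, mask with 255, then narrow with the saturating pack.
std::array<std::uint8_t, 16>
mul_v16i8_via_pack(const std::array<std::uint8_t, 16>& a,
                   const std::array<std::uint8_t, 16>& b) {
    std::array<std::uint8_t, 16> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        // uint8_t operands promote to int; the product fits in 16 bits.
        auto wide = static_cast<std::uint16_t>(a[i] * b[i]);
        auto masked = static_cast<std::uint16_t>(wide & 0x00FF);
        out[i] = packus_lane(masked);  // no saturation can occur here
    }
    return out;
}

// Exhaustive check over every 16-bit lane value: once the lane is masked
// with 255, the saturating pack and the truncate agree.
void check_mask_makes_pack_exact() {
    for (unsigned v = 0; v <= 0xFFFF; ++v) {
        auto lane = static_cast<std::uint16_t>(v);
        auto masked = static_cast<std::uint16_t>(lane & 0x00FF);
        assert(packus_lane(masked) == trunc_lane(lane));
    }
}
```

Without the mask, a lane such as 0x1234 would saturate to 0xFF rather than truncate to 0x34, which is why the pack cannot be used on the raw product.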

Diff Detail

Event Timeline

craig.topper created this revision. Nov 16 2018, 3:55 PM
craig.topper added inline comments. Nov 16 2018, 4:05 PM
test/CodeGen/X86/vector-reduce-mul.ll
2257

This is an extra truncate on the last step. Maybe we need some SimplifyDemandedBits/SimplifyDemandedVectorElts enhancement here?

2802

Another extra truncate

It looks OK, but wouldn't we be better off trying to improve the truncation lowering?

craig.topper abandoned this revision. Nov 17 2018, 4:02 PM

Abandoning in favor of D54671