This is an archive of the discontinued LLVM Phabricator instance.

[X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns
ClosedPublic

Authored by craig.topper on Sep 18 2017, 11:32 PM.

Details

Summary

We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to prefering MOVSS/MOVSD.

Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary.

Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd.

This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type.

The rest of the patch is removing all the excess scalar patterns.

Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Sep 18 2017, 11:32 PM
RKSimon edited edge metadata.Sep 19 2017, 2:31 AM

Making a note here as I think PR33079 was having problems with the MOVSS<->BLENDPS style commutation

@RKSimon do you have any concerns with this patch?

@RKSimon do you have any concerns with this patch?

Partially - I'm just wondering how this might affect PR33079 (which funnily enough had a new report today).

What is the state of this now that D38449 has been committed?

Looks like this patch is still valid after D38449.

RKSimon accepted this revision.Oct 8 2017, 3:32 AM

LGTM, please can you raise bugs for both custom handling of domain changes (PBLENDW <-> BLENDPS etc.) and adding isel patterns (optsize or not) for MOVSS/MOVSD with BLENDPS/BLENDPD?

This revision is now accepted and ready to land.Oct 8 2017, 3:32 AM
This revision was automatically updated to reflect the committed changes.