This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold
ClosedPublic

Authored by RKSimon on May 2 2019, 3:35 AM.

Details

Summary

Limiting scalar hadd/hsub generation to the lowest xmm appears to be unnecessary: we will be extracting one upper xmm either way, and using the hop lets us remove a shuffle, which is in line with what shouldUseHorizontalOp expects to happen anyway.

Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result:
https://godbolt.org/z/0R-U-K
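
To illustrate the reasoning above, here is a minimal, portable C++ model of the fold. It is a sketch of the semantics only, not the actual code in lowerAddSubToHorizontalOp. The names V4, V8, hadd, extractHalf and addPairViaHop are hypothetical helpers introduced for this example. It assumes the scalar op adds two adjacent elements (an even index and its successor) that live in the same 128-bit half of a 256-bit vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V4 = std::array<float, 4>; // models one xmm register of floats
using V8 = std::array<float, 8>; // models one ymm register of floats

// Models a 128-bit horizontal add: sums adjacent pairs of a, then of b.
V4 hadd(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Models extracting one 128-bit half of a ymm (half 0 = lower, 1 = upper).
V4 extractHalf(const V8 &v, std::size_t half) {
  const std::size_t base = 4 * half;
  return {v[base], v[base + 1], v[base + 2], v[base + 3]};
}

// Computes v[i] + v[i + 1] for an even i by extracting the xmm half that
// holds both elements and applying one horizontal op to it. No separate
// shuffle is needed to bring v[i + 1] down next to v[i].
float addPairViaHop(const V8 &v, std::size_t i) {
  assert(i % 2 == 0 && i < 8);
  const V4 x = extractHalf(v, i / 4); // one subvector extract
  const V4 h = hadd(x, x);            // one horizontal add
  return h[(i % 4) / 2];              // lane that holds the pair's sum
}
```

In this model, calling addPairViaHop(v, 4) on a vector whose upper half starts with 10 and 20 extracts the upper half once and returns 30 from lane 0 of the horizontal add, which matches the extract-plus-hop sequence the summary describes.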

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. May 2 2019, 3:35 AM
spatel accepted this revision. May 2 2019, 6:36 AM

LGTM

This revision is now accepted and ready to land. May 2 2019, 6:36 AM
This revision was automatically updated to reflect the committed changes.