This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920)
Closed · Public

Authored by RKSimon on Apr 30 2019, 6:43 AM.

Details

Summary

As reported in PR39920, targets with "slow horizontal ops" tend to expand them internally to 2*shuffle+add/sub. So if we can reduce 2*shuffle+add/sub to a hadd/hsub, we should do it: the port usage is similar but the instruction count is lower.

This works out in most cases, although the "PR22377" regression in vector-shuffle-combining.ll is annoying: it goes from 2*shuffle+add+shuffle to hadd+2*shuffle. I'm open to suggestions. I've been trying to think of ways to get foldShuffleOfHorizOp to work with general shuffles, but haven't found anything yet.
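To make the pattern concrete, here is a minimal scalar sketch of why the fold is value-preserving for a 4 x float vector. It is not code from the patch: the types and helpers (V4, Mask, shuffle, add, hadd, shuffleShuffleAdd) are hypothetical names that only model the lanes, and the even/odd masks are an assumption based on the documented HADDPS lane semantics rather than on the exact masks the patch matches.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lane model of a 4 x float vector (not an LLVM type).
using V4 = std::array<float, 4>;
using Mask = std::array<std::size_t, 4>;

// Two-input shuffle: mask indices 0..3 select from a, 4..7 select from b.
V4 shuffle(const V4 &a, const V4 &b, const Mask &mask) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
  return r;
}

// Lane-wise vertical add.
V4 add(const V4 &x, const V4 &y) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = x[i] + y[i];
  return r;
}

// Models HADDPS: {a0+a1, a2+a3, b0+b1, b2+b3}.
V4 hadd(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// The unfolded form: gather even lanes, gather odd lanes, then add.
V4 shuffleShuffleAdd(const V4 &a, const V4 &b) {
  const Mask evenMask = {0, 2, 4, 6};
  const Mask oddMask = {1, 3, 5, 7};
  return add(shuffle(a, b, evenMask), shuffle(a, b, oddMask));
}
```

In this model, hadd(a, b) and shuffleShuffleAdd(a, b) produce the same lanes for any inputs, which is the equivalence the fold relies on. Whether the single instruction is actually faster is a separate, target-specific question that this sketch does not address.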

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Apr 30 2019, 6:43 AM
Herald added a project: Restricted Project. Apr 30 2019, 6:43 AM
RKSimon retitled this revision from [X86][SSE] Fold add(shuffle(),shuffle()) on 'slow' targets (PR39920) to [X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920). Apr 30 2019, 6:50 AM
spatel added a comment. May 8 2019, 5:40 AM

Is the PR22377 test the only remaining problem? If so, do we have a new bug to track that (or reopen the old bug)?

Is the PR22377 test the only remaining problem? If so, do we have a new bug to track that (or reopen the old bug)?

Yes, it's just the PR22377 test case.

I've raised https://bugs.llvm.org/show_bug.cgi?id=41813 but I'm not totally happy with it being so vague on what the best thing to do is.

spatel accepted this revision. May 9 2019, 8:27 AM

Thanks. We're probably not going to regress the actual motivating example (AVX codegen) in PR22377 (although we could do better), so I think we're fine here. LGTM.

This revision is now accepted and ready to land. May 9 2019, 8:27 AM
This revision was automatically updated to reflect the committed changes.