This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare] shift both sides of a vector select when profitable
Closed · Public

Authored by spatel on Jun 12 2019, 4:29 PM.

Details

Summary

This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428

Per-element variable vector shift instructions don't appear on x86 until AVX2, so on older targets we may generate several extra instructions within a loop to compensate. That expansion is difficult to recover from after CodeGenPrepare, so use the existing TLI hook and splat analysis to enable better codegen.
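
As a rough illustration of why the rewrite is safe, here is a minimal lane-wise model in Python. It is not LLVM code. It assumes the transform has the shape suggested by the title, i.e. a shift whose amount is a select between two splatted values becomes a select between two shifts by uniform amounts. All function names are hypothetical and exist only for this sketch.

```python
# Lane-wise model of the rewrite; a sketch, not the CodeGenPrepare implementation.
# Assumption: 32-bit lanes and shift amounts in [0, 31] (larger amounts are
# not meaningful for a 32-bit shl, so the model does not cover them).
MASK32 = 0xFFFFFFFF


def splat(value, lanes):
    """Build a vector with the same value in every lane."""
    return [value] * lanes


def shl_lanes(x, amt):
    """Shift each lane of x left by the matching lane of amt, wrapping to 32 bits."""
    return [(a << s) & MASK32 for a, s in zip(x, amt)]


def select_lanes(cond, tval, fval):
    """Per-lane select: take tval where cond is true, fval otherwise."""
    return [t if c else f for c, t, f in zip(cond, tval, fval)]


def shift_of_select(x, cond, c1, c2):
    """Original form: a single shift whose amount varies per lane."""
    n = len(x)
    amt = select_lanes(cond, splat(c1, n), splat(c2, n))
    return shl_lanes(x, amt)


def select_of_shifts(x, cond, c1, c2):
    """Rewritten form: two shifts by a uniform amount, then a select."""
    n = len(x)
    shifted_c1 = shl_lanes(x, splat(c1, n))
    shifted_c2 = shl_lanes(x, splat(c2, n))
    return select_lanes(cond, shifted_c1, shifted_c2)


x = [1, 2, 3, 0x80000001]
cond = [True, False, True, False]
print(shift_of_select(x, cond, 3, 5) == select_of_shifts(x, cond, 3, 5))
```

The second form does more shift work in the abstract, but each shift uses one amount for all lanes, which is the case pre-AVX2 x86 handles with a single instruction.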

Event Timeline

spatel created this revision. Jun 12 2019, 4:29 PM
spatel planned changes to this revision. Jun 14 2019, 7:19 AM

For reference, the vector shift TLI hook was originally added for a related CGP transform:
rL201655

Also, I just noticed that neither that patch nor this one propagates debug info (we're now trying harder to improve the debugging experience even with optimized code). I'll add some lines to change/test that.

spatel updated this revision to Diff 204781. Jun 14 2019, 9:12 AM

Patch updated with test changes:

  1. Added a bigger test for the original PR37428 example because that shows a difference: with AVX1, we do not split the ymm ops any more. I'm not sure if that's better or worse than having mul/sitofp ops in the loop, but since the original request is for an SSE target, I think we can deal with that independently.
  2. Added an "enable-debugify" RUN to the IR test file to verify that we are not creating naked instructions. The IRBuilder takes care of this for us here, but I fixed a related problem in rL363409.
lebedev.ri accepted this revision. Jun 14 2019, 3:46 PM

This does look good to me, thanks.

This revision is now accepted and ready to land. Jun 14 2019, 3:46 PM
RKSimon accepted this revision. Jun 15 2019, 3:34 AM

LGTM - cheers

This revision was automatically updated to reflect the committed changes.