This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Enable vector extload combine for legal types.
ClosedPublic

Authored by ab on Feb 4 2015, 6:35 PM.

Details

Summary

This patch enables forming vector extloads (introduced in D6904) for ARM.
It does so only for legal types, and only when we can't fold the extension into a wide/long form of the user instruction.
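
As a rough illustration of the two conditions above, here is a toy model of the decision. This is only a sketch: `ExtLoadCandidate`, its fields, and `shouldFormVectorExtLoad` are hypothetical names made up for this example, not LLVM APIs and not the code in this patch.

```cpp
#include <cassert>

// Hypothetical toy model of the decision described above.
// None of these names exist in LLVM; they are for illustration only.
struct ExtLoadCandidate {
    bool resultTypeIsLegal;   // the extended vector type is legal on the target
    bool userFoldsExtension;  // the user has a wide/long form that absorbs the extension
};

// Returns true when this model would form a vector extload.
bool shouldFormVectorExtLoad(const ExtLoadCandidate &c) {
    if (!c.resultTypeIsLegal)
        return false;  // illegal types are left alone
    if (c.userFoldsExtension)
        return false;  // the wide/long user instruction already covers the extension
    return true;
}
```

In this model, only a candidate with a legal type and no folding user qualifies, e.g. `shouldFormVectorExtLoad({true, false})`.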

Enabling it for larger types isn't as good an idea on ARM as it is on X86, because:

  1. we pretend that extloads are legal, but end up generating vld+vmov, and
  2. we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov.

For instance, for a 16i16 -> 16i64 sextload, we generate something like:
...
vld1.64 {d16, d17}, [r2:128]
vmovl.s16 q9, d16
vmovl.s16 q8, d17
...

Whereas with the combine enabled for illegal types, we would generate:
...
vld1.32 {d18[0]}, [r1:32]
...
vmovl.s16 q9, d18
...
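
For readers less familiar with NEON, the three instructions shown in the first listing (one `vld1.64` of a d-register pair followed by two `vmovl.s16`) can be modelled in scalar C++ as below. This is a sketch of the instruction semantics only: the type aliases and function names are made up for the example, it assumes little-endian lane order, and it covers just the visible instructions, not the elided parts of the listing.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative stand-ins for NEON registers (names are assumptions).
using DReg = std::array<int16_t, 4>;  // 64-bit d register viewed as 4 x i16
using QReg = std::array<int32_t, 4>;  // 128-bit q register viewed as 4 x i32

// Scalar model of "vmovl.s16 qD, dM": sign-extend each 16-bit lane to 32 bits.
QReg vmovl_s16(const DReg &d) {
    QReg q{};
    for (std::size_t i = 0; i < d.size(); ++i)
        q[i] = static_cast<int32_t>(d[i]);
    return q;
}

// Scalar model of "vld1.64 {dA, dB}, [ptr]": load 16 contiguous bytes
// into two d registers.
void vld1_pair(const int16_t *mem, DReg &lo, DReg &hi) {
    std::memcpy(lo.data(), mem, sizeof(DReg));
    std::memcpy(hi.data(), mem + 4, sizeof(DReg));
}

// One load of the register pair, then one widening move per half.
std::array<int32_t, 8> sextload_v8i16(const int16_t *mem) {
    DReg lo{}, hi{};
    vld1_pair(mem, lo, hi);
    const QReg a = vmovl_s16(lo);
    const QReg b = vmovl_s16(hi);
    std::array<int32_t, 8> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = a[i];
        out[i + 4] = b[i];
    }
    return out;
}
```

The point of the model is that a single load of the register pair feeds both widening moves, which is the shape that is lost when the extload is expanded manually for larger types.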

For legal types, the combine doesn't fire that often: in the integration tests, it fires only in a big-endian testcase, where it removes a pointless AND.

Event Timeline

ab updated this revision to Diff 19369. Feb 4 2015, 6:35 PM
ab retitled this revision to [ARM] Enable vector extload combine for legal types.
ab updated this object.
ab edited the test plan for this revision.
ab added a subscriber: Unknown Object (MLST).
jmolloy accepted this revision. Feb 5 2015, 1:04 AM
jmolloy added a reviewer: jmolloy.
jmolloy added a subscriber: jmolloy.

Hi Ahmed,

This looks fine to me. I'd prefer a testcase that shows this optimization triggering, but perhaps that is not possible.

Cheers,

James

This revision is now accepted and ready to land. Feb 5 2015, 1:04 AM
This revision was automatically updated to reflect the committed changes.
ab added a comment. Mar 5 2015, 11:43 AM

r231396, thanks for the review!

I tried harder to tickle the combine, but I couldn't write a more
explicit test :/

-Ahmed