This is an archive of the discontinued LLVM Phabricator instance.

[ARM] - Fix lowering of shufflevectors in AArch32
ClosedPublic

Authored by LukeCheeseman on Jul 22 2015, 3:14 AM.

Details

Summary

Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length.

This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction.

This patch fixes this by accommodating for both cases and adds regression tests.

Diff Detail

Event Timeline

LukeCheeseman retitled this revision from to [ARM] - Fix lowering of shufflevectors in AArch32.
LukeCheeseman updated this object.
LukeCheeseman added a subscriber: llvm-commits.

Hi Charlie,

The patch looks good to me, but it seems like it could do with some more tests. For example, you have undef for vext but not defined. And the other way around for vzip. You also changed vuzp masks, but haven't added double-size tests for them.

I'm expecting that the regular sized vtrn/vzip/vuzp tests are already somewhere else. Why did you create a new test for these? Why not just append on the existing one?

cheers,
--renato

Adding more tests to check lowering of more shufflemask patterns. Also, slight change to how WhichResult is set, if the first element of the shufflemask was undef then it would incorrectly set WhichResult to 1 when it should be 0 (in the double length shufflemask case)

rengolin accepted this revision.Jul 23 2015, 1:07 PM
rengolin added a reviewer: rengolin.

LGTM. Thanks!

This revision is now accepted and ready to land.Jul 23 2015, 1:07 PM
This revision was automatically updated to reflect the committed changes.