This is an archive of the discontinued LLVM Phabricator instance.

[ARM] MVE big endian loads/stores
ClosedPublic

Authored by dmgreen on Aug 1 2019, 8:30 AM.

Details

Summary

This adds some missing patterns for big endian loads/stores, allowing unaligned loads/stores to also be selected through with an extra VREV, which produces better code than aligning through a stack. Also moves VLDR_P0 to not be LE only, and adjusts some of the tests to show all that working.

Diff Detail

Repository
rL LLVM

Event Timeline

dmgreen created this revision.Aug 1 2019, 8:30 AM
samparker added inline comments.Aug 8 2019, 1:22 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
4789 ↗(On Diff #212823)

Do we support v2i1 as well?

llvm/test/CodeGen/Thumb2/mve-loadstore.ll
16 ↗(On Diff #212823)

So why not vrev32? I'm having real difficulty understanding why these vrev instructions are augmented with 'size'.

simon_tatham added inline comments.Aug 8 2019, 1:27 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
4789 ↗(On Diff #212823)

Currently, v2i1 isn't listed among the vector-of-i1 types that are legal in the VCCR regclass, so it probably wouldn't help to add a pattern for it here.

I have a small patch that adds it in various other places, as part of my unfinished prototype for the ACLE MVE intrinsics support. If it's becoming urgent, I could easily pull that patch out and submit it before I get the rest finished?

samparker added inline comments.Aug 8 2019, 1:51 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
4789 ↗(On Diff #212823)

Great, sounds like a good thing to get in so we can decouple autoveec work from intrinsic support.

dmgreen marked 3 inline comments as done.Aug 8 2019, 1:56 AM
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMInstrMVE.td
4789 ↗(On Diff #212823)

I have not added v2i1 to anything yet. It is not usually very useful, as we do not support any of the cmp's needed to produce it. Adding it sound like the kind of thing that would need a lot of testing.

llvm/test/CodeGen/Thumb2/mve-loadstore.ll
16 ↗(On Diff #212823)

These are just because we are returning a <4 x i32>, and the calling convention is a little odd at the moment. See the other BE patch in D65581.

The way I think of the vrevs is that the do 2 different bit level reverses. This one does on to i64's, then they are reversed again in i32's. So you end up with the bytes in the correct order, but re-arranged.

Point is that this one is not really the interesting part of this test. It is just an artifact of the calling convention.

34 ↗(On Diff #212823)

This one here is an example of load being bit-reversed, because it's alignment is too low. It's loaded with i8's and VREV'd to get the correct values into the i32's.

simon_tatham added inline comments.Aug 8 2019, 2:25 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
4789 ↗(On Diff #212823)

No sooner said: D65929 is up for review.

samparker accepted this revision.Aug 8 2019, 2:59 AM

What an eye opener...

llvm/test/CodeGen/Thumb2/mve-loadstore.ll
16 ↗(On Diff #212823)

cheers!

This revision is now accepted and ready to land.Aug 8 2019, 2:59 AM
This revision was automatically updated to reflect the committed changes.