This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Split large truncating MVE stores
ClosedPublic

Authored by dmgreen on Sep 20 2019, 4:35 AM.

Details

Summary

MVE does not have a simple sign extend instruction that can move elements across lanes. We currently often end up moving each lane into and out of a GPR in order to get elements into the correct places. When we have a store of a trunc (or an extend of a load), we can instead split the store/load in two, using the narrowing/widening load/store instructions on each half of the vector.

This patch does that for stores. It happens very early in a store combine, so that the truncates are easy to detect. (It would be possible to do this later, but that would involve looking through a buildvector of extract elements. Not impossible, but this way seemed simpler.)
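As a rough illustration of the idea (not the code in ARMISelLowering.cpp), the split can be modelled in plain C++. The sketch assumes a store of a trunc from <8 x i32> to <8 x i16>: the wide value is cut into two four-lane halves, and each half is written with a model of a narrowing store (something like VSTRH.32). The names narrowingStore4 and splitTruncStore are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one narrowing store: keeps the low 16 bits of each
// of the four 32-bit lanes and writes them to consecutive memory slots.
inline void narrowingStore4(const std::array<std::uint32_t, 4> &lanes,
                            std::uint16_t *dst) {
  assert(dst != nullptr && "destination must be valid");
  for (std::size_t i = 0; i < lanes.size(); ++i)
    dst[i] = static_cast<std::uint16_t>(lanes[i]); // truncation is modular
}

// Hypothetical model of the split: store(trunc <8 x i32> to <8 x i16>)
// becomes two narrowing stores, one per half of the wide vector, with the
// second store offset by the number of elements in the first half.
inline void splitTruncStore(const std::array<std::uint32_t, 8> &wide,
                            std::uint16_t *dst) {
  constexpr std::size_t NumElements = 4; // lanes per half
  for (std::size_t half = 0; half < 2; ++half) {
    std::array<std::uint32_t, NumElements> part{};
    for (std::size_t i = 0; i < NumElements; ++i)
      part[i] = wide[half * NumElements + i];
    narrowingStore4(part, dst + half * NumElements);
  }
}
```

The point of the model is that no lane ever has to move to a different position within a register: each half is already in the layout its narrowing store expects, so the result matches an element-by-element truncation.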

By enabling store combines we also get a vmovdrr combine for free, helping some other tests.

Event Timeline

dmgreen created this revision. Sep 20 2019, 4:35 AM
Herald added a project: Restricted Project. Sep 20 2019, 4:35 AM
samparker added inline comments. Sep 20 2019, 6:29 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
13172–13173

To make this function easier to read, how about extracting the truncating neon and non-truncating mve stuff out into their own functions?

13306

Nit: I'm guessing this is taken from some load code, so I think Ext should be renamed to something a bit more descriptive.

13317

Nit: NumElements..?

13327

old load?!

dmgreen marked 2 inline comments as done. Sep 23 2019, 4:02 AM
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMISelLowering.cpp
13317

Yeah, sounds a lot better.

13327

Sounds like I should put up the load patch too! It's roughly similar to this, so I'll try and make the same changes there.

dmgreen updated this revision to Diff 221284. Sep 23 2019, 4:37 AM

Addressed comments.

samparker accepted this revision. Sep 23 2019, 5:09 AM

Thanks, LGTM

This revision is now accepted and ready to land. Sep 23 2019, 5:09 AM
This revision was automatically updated to reflect the committed changes.