This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Distribute post-inc for Thumb2 sign/zero extending loads/stores
ClosedPublic

Authored by dmgreen on Apr 22 2020, 4:31 AM.

Details

Summary

This adds sign/zero extending scalar loads/stores to the MVE instructions added in D77813, allowing us to pick up more post-inc situations. These are simple compared to LDR/STR (which may be better turned into an LDRD/LDM), but still require some additions over the MVE instructions. Because there are i12 and i8 variants of the offset loads/stores that deal with different signs, we may need to convert an i12 address to an i8 negative instruction.
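As a rough illustration of the i12/i8 point above, here is a minimal C++ sketch of choosing an addressing form after a post-increment has been folded into an earlier access on the same base. The names `OffsetForm` and `selectFormAfterPostInc` are hypothetical and do not come from the patch. The sketch assumes the remaining accesses have their offsets reduced by the increment, that the i12 form encodes offsets 0 to 4095, and that the i8 form is used for negative offsets -255 to -1.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not taken from the patch.
enum class OffsetForm { PosImm12, NegImm8, None };

// After an earlier access on the same base register has absorbed a
// post-increment of `Increment` bytes, a later access that used
// [base, #OldOffset] must use [base, #OldOffset - Increment] instead.
// Returns the form that can encode the adjusted offset, or None if the
// adjusted offset fits neither form.
OffsetForm selectFormAfterPostInc(int OldOffset, int Increment, int &NewOffset) {
  assert(Increment != 0 && "a zero increment leaves nothing to distribute");
  NewOffset = OldOffset - Increment;
  if (NewOffset >= 0 && NewOffset <= 4095)
    return OffsetForm::PosImm12; // unsigned 12-bit immediate form
  if (NewOffset < 0 && NewOffset >= -255)
    return OffsetForm::NegImm8;  // negative 8-bit immediate form
  return OffsetForm::None;       // adjusted offset is not encodable
}
```

For example, an access at offset 0 that follows a post-increment of 4 would need the negative i8 form with offset -4, while an increment of 256 would leave no encodable form in this simplified model.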

Event Timeline

dmgreen created this revision. Apr 22 2020, 4:31 AM
dmgreen retitled this revision from [ARM] Distribute post-inc for to [ARM] Distribute post-inc for Thumb2 sing/zero extending loads/stores. Apr 22 2020, 4:31 AM
samparker accepted this revision. May 6 2020, 12:23 AM

LGTM. Have you tested whether this approach is faster than doing the pre-indexing LSR method? I certainly like the idea of this transform instead of being beholden to the filtering gods.

This revision is now accepted and ready to land. May 6 2020, 12:23 AM
Herald added a project: Restricted Project. May 6 2020, 12:23 AM
dmgreen updated this revision to Diff 282163. Jul 31 2020, 2:42 AM
dmgreen retitled this revision from [ARM] Distribute post-inc for Thumb2 sing/zero extending loads/stores to [ARM] Distribute post-inc for Thumb2 sign/zero extending loads/stores.

Now includes a quick codesize metric, to try and detect cases where a t2LDRi12 can be shrunk to tLDRi.

I think we are still at the behest of LSR's cost modeling, I'm afraid. This can just do slightly better at fixing up the results afterwards, and it can slightly improve things in case ISel comes up with something suboptimal.
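The codesize metric mentioned in this update could look roughly like the sketch below. It is a simplified assumption, not the check from the patch: it covers only word loads, ignores the SP-relative and PC-relative narrow forms, and uses the hypothetical name `estimateWordLoadSizeBytes`. It relies on the 16-bit Thumb LDR (immediate) encoding requiring low registers (r0 to r7) and an offset that is a multiple of 4 in the range 0 to 124.

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not taken from the patch.
// Estimates the encoded size in bytes of a word load `ldr Rt, [Rn, #Offset]`
// with a non-negative immediate offset, assuming a 32-bit t2LDRi12 can be
// shrunk to a 16-bit tLDRi when the narrow encoding's constraints hold.
unsigned estimateWordLoadSizeBytes(unsigned Rt, unsigned Rn, int Offset) {
  assert(Rt < 16 && Rn < 16 && "register numbers are r0-r15");
  bool LowRegs = Rt < 8 && Rn < 8;                       // 3-bit register fields
  bool NarrowOffset = Offset >= 0 && Offset <= 124 &&    // imm5 scaled by 4
                      Offset % 4 == 0;
  return (LowRegs && NarrowOffset) ? 2u : 4u;
}
```

Under this model, distributing a post-increment is only a codesize win if it does not push an offset that previously fit the narrow form outside the 0 to 124 range.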