[llvm][sve] Lowering for VLS extending loads
ClosedPublic

Authored by DavidTruby on Jul 29 2021, 4:57 AM.

Details

Summary

This patch enables extending loads for fixed length SVE code generation.

There is a slight regression in the mulh tests. Because these tests load
each parameter and then extend it, the load and the extend are merged
into an extending load, which prevents the mulh instruction from being
generated. As this also affects scalable SVE codegen, it should be
addressed in a separate patch.
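
For readers unfamiliar with the pattern behind the regression, here is a minimal scalar sketch of what a signed mulh computes per lane. It assumes the usual widen, multiply, shift, truncate idiom; the names `V8xI16` and `mulh_model` are hypothetical and do not come from the patch. The point is that the idiom starts from an explicit extend of each operand, and once that extend is folded into the load, the shape is no longer visible to a matcher that looks for it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 8-lane vector of i16, used only for this illustration.
using V8xI16 = std::array<int16_t, 8>;

// Scalar model of a signed multiply-high on i16 lanes:
// sign-extend both operands to i32, multiply, keep the top 16 bits.
V8xI16 mulh_model(const V8xI16& a, const V8xI16& b) {
  V8xI16 out{};
  for (std::size_t i = 0; i < a.size(); ++i) {
    // The explicit extends below are the part that an extending load absorbs.
    const int32_t wide_a = static_cast<int32_t>(a[i]);
    const int32_t wide_b = static_cast<int32_t>(b[i]);
    const int32_t product = wide_a * wide_b;
    // Arithmetic shift right by the narrow element width, then truncate.
    out[i] = static_cast<int16_t>(product >> 16);
  }
  return out;
}
```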

Event Timeline

DavidTruby created this revision. Jul 29 2021, 4:57 AM
DavidTruby requested review of this revision. Jul 29 2021, 4:57 AM
Herald added a project: Restricted Project. Jul 29 2021, 4:57 AM
Matt added a subscriber: Matt. Jul 29 2021, 5:14 AM
bsmith added inline comments. Aug 3 2021, 8:41 AM
llvm/test/CodeGen/AArch64/sve-fixed-length-ext-loads.ll
51–64

The codegen in the type legalisation cases seems a bit odd. Why is this not using SVE to do the extending load?

efriedma added inline comments. Aug 3 2021, 11:58 AM
llvm/test/CodeGen/AArch64/sve-fixed-length-ext-loads.ll
51–64

The fact that legalization goes through the stack is obviously just a missed optimization.

The way type legalization works, it will see that <16 x i16> is legal, so we do a <16 x i16> load. Then we have an extend of that load to an illegal type. This gets split into two parts: extract/extend the low half, then extract/extend the high half. If we optimized that correctly, it would come out to three instructions: ld1h, followed by uunpklo/uunpkhi.

Whether that's the best approach probably depends on the target and the types involved. If extending vector loads are reasonably fast, maybe we just want to generate more of them.
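
The two strategies discussed above can be compared with a small scalar model. This is only a sketch under stated assumptions: it assumes a zero-extending load from i16 to i32, a vector length at which <16 x i16> is legal and <16 x i32> is not, and all type and function names (`V16xI16`, `V8xI32`, `load_v16i16`, `unpack_lo`, `unpack_hi`, `ext_load_v8i32`) are hypothetical stand-ins for the instructions named in the comment, not LLVM APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical vector types for this illustration only.
using V16xI16 = std::array<uint16_t, 16>;
using V8xI32 = std::array<uint32_t, 8>;

// Models one full-width load of 16 halfwords (the single ld1h).
V16xI16 load_v16i16(const uint16_t* p) {
  V16xI16 v{};
  for (std::size_t i = 0; i < v.size(); ++i) v[i] = p[i];
  return v;
}

// Models uunpklo: zero-extend the low 8 lanes to 32 bits.
V8xI32 unpack_lo(const V16xI16& v) {
  V8xI32 out{};
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = v[i];
  return out;
}

// Models uunpkhi: zero-extend the high 8 lanes to 32 bits.
V8xI32 unpack_hi(const V16xI16& v) {
  V8xI32 out{};
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = v[i + 8];
  return out;
}

// Models the alternative: an extending load that reads 8 halfwords
// from memory and widens each one to 32 bits as it loads.
V8xI32 ext_load_v8i32(const uint16_t* p) {
  V8xI32 out{};
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = p[i];
  return out;
}
```

In this model, one load plus two unpacks yields the same two result halves as two extending loads at `p` and `p + 8`, so the choice between them is purely a cost question, as the comment notes.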

bsmith accepted this revision. Aug 6 2021, 7:39 AM

Other than the clang-format nit, LGTM

This revision is now accepted and ready to land. Aug 6 2021, 7:39 AM
This revision was landed with ongoing or failed builds. Aug 12 2021, 2:43 AM
This revision was automatically updated to reflect the committed changes.