This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Generate LD1 for anyext i8 or i16 vector load
ClosedPublic

Authored by asavonic on May 21 2021, 11:41 AM.

Details

Summary

The existing LD1 patterns do not cover cases where the result type does
not match the memory type. This happens when illegal vector types are
extended and scalarized. For example:

%v = load <2 x i16>, <2 x i16>* %v2i16

is lowered into:

// first element
(v4i32 (insert_subvector (v2i32 (scalar_to_vector (load anyext from i16)))))
// other elements
(v4i32 (insert_vector_elt (i32 (load anyext from i16)) idx))

Before this patch these patterns were compiled into LDR + INS.
Now they are compiled into LD1.
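
As a rough illustration of the case described above, here is a minimal reproducer sketch. The function name is hypothetical and is not taken from the patch; the IR uses the typed-pointer syntax that was current when this review was written, so recent LLVM releases would need `ptr` instead.

```llvm
; Hypothetical reproducer: the function name is illustrative only.
; <2 x i16> is not a legal AArch64 vector type, so per the summary the
; load is extended and scalarized into per-element any-extending i16 loads.
define <2 x i16> @load_v2i16(<2 x i16>* %p) {
  %v = load <2 x i16>, <2 x i16>* %p
  ret <2 x i16> %v
}
```

Assuming an LLVM build from that period, compiling this with `llc -mtriple=aarch64-none-linux-gnu` should show the difference the summary describes: LDR + INS before the patch, LD1 lane loads after it. The `aarch64_be-none-linux-gnu` triple can be used to inspect the big-endian output.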

There is a separate patch to combine these sequences with an ADD
into an LD1_POST: https://reviews.llvm.org/D102939

The problem was reported in
PR24820: LLVM Generates abysmal code in simple situation.

Event Timeline

asavonic created this revision. May 21 2021, 11:41 AM
asavonic requested review of this revision. May 21 2021, 11:41 AM
asavonic edited the summary of this revision. May 21 2021, 11:53 AM

Sounds good. Can we make sure there is test coverage for big endian too?

asavonic updated this revision to Diff 347412. May 24 2021, 8:55 AM

> Sounds good. Can we make sure there is test coverage for big endian too?

Thanks. Added BE checks to CodeGen/AArch64/aarch64-load-ext.ll.

asavonic edited the summary of this revision. May 24 2021, 9:05 AM
dmgreen accepted this revision. May 25 2021, 2:20 PM

Thanks. LGTM.

This revision is now accepted and ready to land. May 25 2021, 2:20 PM
This revision was landed with ongoing or failed builds. May 26 2021, 4:49 AM
This revision was automatically updated to reflect the committed changes.