This is an archive of the discontinued LLVM Phabricator instance.

[AArch64-SVE]: Force generating code compatible to streaming mode.
ClosedPublic

Authored by hassnaa-arm on Oct 10 2022, 2:02 AM.

Details

Summary

When streaming mode is enabled, lower some operations and disable some code paths to force generating code compatible to streaming mode.
Add testing files for shifts, build_vector, concat, extract_subvector, extract_vector_elt, and shuffle.

Diff Detail

Event Timeline

hassnaa-arm created this revision.Oct 10 2022, 2:02 AM
hassnaa-arm requested review of this revision.Oct 10 2022, 2:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2022, 2:02 AM
Matt added a subscriber: Matt.Oct 10 2022, 10:51 PM

Add additional test cases

get latest changes of parent revision

update by rebasing parent revision

Remove unrelated changes

Restore some changes removed by mistake

Update by parent branch

Update by parent patch

SjoerdMeijer added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11926

Drive by comment:

we don't need to pass Subtarget->forceStreamingCompatibleSVE() but can just query that inside useSVEForFixedLengthVectorVT?

hassnaa-arm added inline comments.Oct 14 2022, 4:27 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11926

I think we still need to pass it, because useSVEForFixedLengthVectorVT is used in many places, some of them don't need the flag to be enabled.
The flag is used for enabling lowering specific nodes that cause generating invalid code in streaming mode.

Update by changes of parent patch.
While lowering ISD::load remove enabling LowerFixedLengthVectorLoadToSVE,
no need for it, as zero_Extend is custom-lowered.
Previously, LowerLOAD() creates zero_Extend node, which cause invalid generated code,
but now, the zero_extend node is custom-lowered, which cause valid generated code.

Update by parent patch

hassnaa-arm retitled this revision from [AArch64-SVE]: Force generating code compatible to streaming mode. to [AArch64-SVE]: Force generating code compatible to streaming mode for (masked/extending/truncating) load/store.Oct 20 2022, 5:01 AM

Hi @hassnaa-arm, I think you have the tests the wrong way around. The tests from D136147 should be part of this patch, because this is the patch where you're implementing the lowering of the operations you're testing in D136147.
After this patch, you get the masked/truncating/extending load/store operations "for free", so the tests for those operations could be moved to a separate test-only patch like D136147.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1399

It should be able to use standard scalar instructions for v1i64 in streaming-compatible mode, so this one can be removed from the list.

1402

Most scalar FP operations are valid in streaming mode, so we probably don't need to do anything custom for this type.

1614–1616

nit: Perhaps it doesn't lead to an error, but these operations only operate on integers, so should be guarded by:

if (VT.isInteger()) { ... }
12092–12098

nit: rather than wrapping this in another condition, can you just add it to the existing condition with && !Subtarget->forceStreamingCompatibleSVE() ?

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3035–3046

nit: Can you change this into:

let Predicates = [NotInStreamingSVEMode], AddedComplexity = 1 in {
def : Pat<...>
..
}

let Predicates = [NotInStreamingSVEMode] in {
def : Pat<..>
...
}

Rather than indenting?

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll
313

Can you remove all tests that are larger than "twice the size" of a 128bit vector (v32f32 is 8x the size, I'm not sure what value that adds for the testing of this functionality)

hassnaa-arm marked 6 inline comments as done.Oct 21 2022, 5:33 AM

Remove all tests that are larger than "twice the size" of a 128bit vector.

Remove masked/truncating/extending load/store, to be added in a test-only patch.

hassnaa-arm retitled this revision from [AArch64-SVE]: Force generating code compatible to streaming mode for (masked/extending/truncating) load/store to [AArch64-SVE]: Force generating code compatible to streaming mode..Oct 21 2022, 6:26 AM
hassnaa-arm edited the summary of this revision. (Show Details)

Lower And operation, and disable replacing 'and' by 'bic' while combining step.

Update by parent patch

Revert changes added by mistake

Add testing files for shifts, build_vector, concat, extract_subvector, extract_vector_elt, and shuffle.

hassnaa-arm edited the summary of this revision. (Show Details)Oct 24 2022, 2:59 AM
hassnaa-arm edited the summary of this revision. (Show Details)

Hi @hassnaa-arm, I think it's almost there! Most of the tests look good to me. I just had a few minor comments ...

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11199

I don't think you need to pass in the Subtarget here. In the code below you can just do

if (VT.isFixedLengthVector() &&
    DAG.getSubtarget<AArch64Subtarget>().forceStreamingCompatibleSVE()
  return SDValue();
11245

Same comment as above for tryAdvSIMDModImm32

13814

nit: The comment can probably be formatted better I think so that you use up 80 chars, i.e.:

// Skip if streaming compatible SVE is enabled, because it generates invalid
// code in streaming mode when SVE length is not specified.
15529

Again, you can avoid passing in the subtarget here if you make the changes to tryAdvSIMDModImm32 and tryAdvSIMDModImm16.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3034–3035

When we guard something by a predicate we normally add a comment on the final brace '}' to make it easy to see, i.e. something like:

} // End NotInStreamingSVEMode
3056–3059
} // End NotInStreamingSVEMode
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-concat.ll
10 ↗(On Diff #470094)

I don't think we need to have vscale_range(2,0) on these tests, right? We want streaming SVE to work for vector lengths.

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-extract-vector-elt.ll
17 ↗(On Diff #470094)

nit: I think in all these tests there should only be 2 spaces at the start of the IR, i.e.

%r = extractelement <2 x half> %op1, i64 1

etc.

hassnaa-arm marked 8 inline comments as done.

Remove vscale_range from concat.ll test file.

Fix identation

david-arm accepted this revision.Oct 27 2022, 11:18 AM

LGTM! Thanks for making the changes @hassnaa-arm.

This revision is now accepted and ready to land.Oct 27 2022, 11:18 AM
paulwalker-arm added inline comments.Oct 27 2022, 4:28 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22084–22085

As with the above change can this be V.getValueType().isFixedLengthVector() && isTypeLegal(V.getValueType()) &&?

llvm/test/CodeGen/AArch64/sve-streaming-fixed-length-int-shifts.ll
1–5 ↗(On Diff #471171)

Please rename this file sve-streaming-mode-fixed-length-int-shifts.ll to match the same format as the others.

hassnaa-arm marked 2 inline comments as done.Oct 28 2022, 3:57 AM

Rename sve-streaming-fixed-length-int-shifts.ll to sve-streaming-mode-fixed-length-int-shifts.ll

paulwalker-arm accepted this revision.Oct 28 2022, 4:04 AM
sdesmalen accepted this revision.Oct 28 2022, 4:13 AM

LGTM as well (please address nit before submitting)

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22084

nit: Does this cross the 80-character limit? (please use clang-format to be sure)

hassnaa-arm marked an inline comment as done.Oct 28 2022, 4:41 AM
sdesmalen added inline comments.Oct 28 2022, 8:01 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1620

nit: I only just spot this now in one of your other patches, but ISD::AND should also be guarded by VT.isInteger().

paulwalker-arm added inline comments.Oct 28 2022, 8:08 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1620

Although not wrong it doesn't really matter as legalisation is smart enough to not care about the operation action for types that make no sense. We rely on this in addTypeForFixedLengthSVE where the type is only considered when handling extend-loads/truncating-store plus the odd compare.

sdesmalen added inline comments.Oct 28 2022, 8:12 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1620

In that case it's probably better to remove the condition entirely.

hassnaa-arm marked an inline comment as done.Oct 28 2022, 10:22 AM
This revision was landed with ongoing or failed builds.Oct 31 2022, 4:03 AM
This revision was automatically updated to reflect the committed changes.