This is an archive of the discontinued LLVM Phabricator instance.

POC patch to demonstrate how new intrinsics for interleaved load/store could be used in LoopVectorize
AbandonedPublic

Authored by mgabka on Sep 22 2022, 6:49 AM.

Details

Summary

This patch needs to be broken down into smaller chunks.

The purpose of this patch is to demonstrate how new, generic LLVM intrinsics
could be emitted directly from inside LoopVectorize
and later transformed into target-specific ones.
This POC patch focuses on SVE, but the intention is not to limit it
to SVE; code generation for NEON can also use this approach.

The motivation for such a solution is that at the vectorization stage we already
know whether we are handling interleaved memory accesses, so we can use
that knowledge to emit dedicated intrinsics for such accesses.

LLVM's current implementation uses shufflevector to de-interleave the data
after a load and to interleave it before a store, which is not suitable for scalable vectors.
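
As a rough illustration of the shuffle-based scheme, here is a scalar C++ model of de-interleaving a factor-2 group after one wide load. It is not LLVM code: `shuffle`, `evenLanes`, and `oddLanes` are names invented for this sketch, and the 8-lane width is an arbitrary assumption. The point it tries to make is that the mask is a constant list of lane indices, so its length has to be known at compile time, which is exactly what a scalable vector cannot provide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of a single-input shufflevector: `mask` is a fixed list of
// lane indices into `wide`. Both N and M are compile-time constants.
template <std::size_t N, std::size_t M>
std::array<int, M> shuffle(const std::array<int, N> &wide,
                           const std::array<std::size_t, M> &mask) {
  std::array<int, M> out{};
  for (std::size_t i = 0; i < M; ++i) {
    assert(mask[i] < N && "mask index out of range");
    out[i] = wide[mask[i]];
  }
  return out;
}

// De-interleave a factor-2 group that was loaded as one 8-lane vector.
// The masks <0,2,4,6> and <1,3,5,7> are spelled out as constants.
inline std::array<int, 4> evenLanes(const std::array<int, 8> &wide) {
  const std::array<std::size_t, 4> mask{0, 2, 4, 6};
  return shuffle<8, 4>(wide, mask);
}

inline std::array<int, 4> oddLanes(const std::array<int, 8> &wide) {
  const std::array<std::size_t, 4> mask{1, 3, 5, 7};
  return shuffle<8, 4>(wide, mask);
}
```

For a wide vector holding pairs such as (a0, b0, a1, b1, ...), `evenLanes` yields the a-elements and `oddLanes` the b-elements. With a lane count that is a runtime multiple of some minimum, no such constant mask can be written down.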

Diff Detail

Build Status
Buildable 188169

Event Timeline

mgabka created this revision. Sep 22 2022, 6:49 AM
Herald added a project: Restricted Project. Sep 22 2022, 6:49 AM
mgabka requested review of this revision. Sep 22 2022, 6:49 AM
Matt added a subscriber: Matt. Sep 22 2022, 1:46 PM
nlopes added inline comments. Sep 23 2022, 1:53 AM
llvm/lib/IR/IRBuilder.cpp
605

Please use PoisonValue here. We already changed all the other similar intrinsics to have poison as default passthru.

Interesting! This is a different approach than I'd been considering.

I had been picturing adding intrinsics for each of the shuffles used during interleave lowering. All three have reasonable lowerings on RISCV - not sure about SVE. Your approach would make matching segmented load/store easier, but is a larger divergence from the way we currently handle fixed length interleaves.

On the other hand, I'd been debating something similar to this approach for strided load/store intrinsics. Would you have some time to chat on a call and debate approaches here?

Would you have some time to chat on a call and debate approaches here?

A call will be very helpful, thanks. I'll get something set up.

reames requested changes to this revision. Feb 23 2023, 7:36 AM

FYI, D141924 has landed. This patch took a slightly different approach and introduced intrinsics for interleave and deinterleave respectively (not the combined memory ops). As such, this patch is now stale. There appear to be a few parts of this which are potentially salvageable (tests, vectorizer changes), but the patch either needs a significant rework and rebase or to be abandoned.
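
To picture the separate interleave and deinterleave operations mentioned above, here is a scalar C++ sketch for factor 2 in which the lane count is only known at run time, loosely analogous to a scalable vector. It is an illustrative model, not the LLVM intrinsics themselves: the function names `deinterleave2` and `interleave2` and the use of `std::vector<int>` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split a wide vector into its even-indexed and odd-indexed lanes.
// The lane count is a runtime value; no constant mask is involved.
std::pair<std::vector<int>, std::vector<int>>
deinterleave2(const std::vector<int> &wide) {
  assert(wide.size() % 2 == 0 && "expected an even number of lanes");
  std::vector<int> even, odd;
  even.reserve(wide.size() / 2);
  odd.reserve(wide.size() / 2);
  for (std::size_t i = 0; i < wide.size(); i += 2) {
    even.push_back(wide[i]);
    odd.push_back(wide[i + 1]);
  }
  return std::make_pair(std::move(even), std::move(odd));
}

// Inverse operation: alternate lanes from `a` and `b` into one wide vector.
std::vector<int> interleave2(const std::vector<int> &a,
                             const std::vector<int> &b) {
  assert(a.size() == b.size() && "inputs must have the same lane count");
  std::vector<int> out;
  out.reserve(a.size() * 2);
  for (std::size_t i = 0; i < a.size(); ++i) {
    out.push_back(a[i]);
    out.push_back(b[i]);
  }
  return out;
}
```

In this model a load of an interleaved group would be an ordinary wide load followed by `deinterleave2`, and a store would be `interleave2` followed by an ordinary wide store, which keeps the memory operation and the lane permutation as separate steps rather than one combined memory intrinsic.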

This revision now requires changes to proceed. Feb 23 2023, 7:36 AM

@reames Yes, I am aware of it. I will either update this patch or create a new one (in which case I will add the same set of reviewers).

mgabka requested review of this revision. Feb 28 2023, 1:47 AM
mgabka added a subscriber: luke.

Hi Luke Lau,
Here is a POC patch I posted some time ago. Since https://reviews.llvm.org/D141924 landed, it needs to be changed. I am working on a new patch that enables vectorization of interleaved accesses with an interleaving factor of 2, and I will add you as a reviewer when it is posted.

luke added a comment. Feb 28 2023, 2:25 AM

Hi Luke Lau,
Here is a POC patch I posted some time ago. Since https://reviews.llvm.org/D141924 landed, it needs to be changed. I am working on a new patch that enables vectorization of interleaved accesses with an interleaving factor of 2, and I will add you as a reviewer when it is posted.

Sounds good, thanks for taking care of this

mgabka abandoned this revision. Mar 2 2023, 7:25 AM

I posted a new patch, https://reviews.llvm.org/D145163, updated to use the new intrinsics.