This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Vectorize the compact pattern
Needs Review · Public

Authored by TiehuZhang on Aug 25 2023, 4:27 AM.

Details

Summary

This patch tries to vectorize the compact pattern, as shown below:

for(i=0; i<N; i++){
   x = comp[i];
   if(x<a) Out_ref[n++]=B[i];
}
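
For readers unfamiliar with the pattern, here is a minimal sketch of what vectorizing it involves, written as plain C++ rather than as the code this patch generates. The function names, the fixed VF of 4, and the int element type are assumptions made for illustration only. The blocked version mimics the per-vector steps: compare to form a mask, pack the active lanes to the front (the role of a compact/compress operation), count the active lanes (the role of an instruction such as CNTP), and advance the output index by that count.

```cpp
#include <cstddef>

// Scalar reference: the loop from the summary.
std::size_t compactScalar(const int *comp, const int *B, int *out,
                          std::size_t N, int a) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (comp[i] < a)
      out[n++] = B[i];
  return n;
}

// Illustrative vectorization factor (an assumption, not taken from the patch).
constexpr std::size_t VF = 4;

// Blocked version that emulates one vector iteration per VF elements.
std::size_t compactBlocked(const int *comp, const int *B, int *out,
                           std::size_t N, int a) {
  std::size_t n = 0;
  std::size_t i = 0;
  for (; i + VF <= N; i += VF) {
    // Step 1: vector compare produces a per-lane mask.
    bool mask[VF];
    for (std::size_t l = 0; l < VF; ++l)
      mask[l] = comp[i + l] < a;

    // Step 2: pack active lanes to the front and count them.
    int packed[VF] = {};
    std::size_t active = 0;
    for (std::size_t l = 0; l < VF; ++l)
      if (mask[l])
        packed[active++] = B[i + l];

    // Step 3: store the packed lanes and bump the output index.
    for (std::size_t l = 0; l < active; ++l)
      out[n + l] = packed[l];
    n += active;
  }
  // Scalar tail for the remaining N % VF elements.
  for (; i < N; ++i)
    if (comp[i] < a)
      out[n++] = B[i];
  return n;
}
```

The output index n is the loop-carried value that makes this pattern awkward for a vectorizer: its increment per vector iteration is only known after the mask has been computed.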

It introduces some changes:

  1. Add pattern matching in LoopVectorizationLegality to cache specific cases.
  2. Introduce two new recipes to handle the compact chain:
       - VPCompactPHIRecipe: handles the entry PHI of the compact chain.
       - VPWidenCompactInstructionRecipe: handles the other instructions in the compact chain.
  3. Slightly adapt the cost model for the compact pattern.

Event Timeline

TiehuZhang created this revision. Aug 25 2023, 4:27 AM
Herald added a project: Restricted Project. Aug 25 2023, 4:27 AM
TiehuZhang requested review of this revision. Aug 25 2023, 4:27 AM
mdchen added a subscriber: mdchen. Aug 25 2023, 7:48 PM
mdchen added inline comments.
llvm/lib/Analysis/TargetTransformInfo.cpp
1240

Does this overlap with isLegalMaskedCompressStore?

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
28

Deleted by mistake?

llvm/lib/Transforms/Utils/LoopUtils.cpp
1134

This part is AArch64-specific; it should be hidden from the mid-end and implemented in the AArch64 backend.

1141

CNTP is an AArch64 backend mnemonic and lacks generality; something like createTargetCountVectorActiveElements would be better. COMPACT has the same issue, but to me it's more acceptable.

Hi, @mdchen, I'll fix them in the near future, thanks!

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
28

Thanks for the reminder, I'll fix it later!

llvm/lib/Transforms/Utils/LoopUtils.cpp
1134

You are right, we should hide backend-specific implementation details from the mid-end. I added a hook for the intrinsic ID here, but maybe that's not enough. Actually, I'm not sure how to implement it gracefully, as different backends may use different intrinsics (e.g., compact and compress take a different number of parameters). Should I add a new intrinsic for them and then lower it to the different target intrinsics, or add hooks that directly return the specific intrinsic?
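
The two options in this question can be contrasted with a small standalone sketch. Everything in it is hypothetical: the Target enum, the CompactOp descriptor, the function names, and the operand counts are illustrative stand-ins and not LLVM APIs or confirmed intrinsic signatures. Option A keeps one generic operation with a fixed signature in the mid-end and lets each backend lower it. Option B has the mid-end ask the target for its own operation and adapt to whatever shape comes back.

```cpp
#include <string>

// Hypothetical target identifiers (not LLVM types).
enum class Target { AArch64SVE, X86AVX512, Generic };

// Hypothetical descriptor for a target-specific packing operation.
struct CompactOp {
  std::string name;
  unsigned numOperands; // illustrative operand count, an assumption
};

// Option A: the mid-end always emits one generic operation with a fixed
// signature; each backend decides how to lower it.
std::string lowerGenericCompact(Target T) {
  switch (T) {
  case Target::AArch64SVE:
    return "compact";
  case Target::X86AVX512:
    return "compress";
  case Target::Generic:
    return "scalarized-loop";
  }
  return "scalarized-loop";
}

// Option B: the mid-end queries a hook and must handle each target's
// operand list itself. Returns false when the target has no such operation.
bool getTargetCompactOp(Target T, CompactOp &Op) {
  switch (T) {
  case Target::AArch64SVE:
    Op = {"compact", 2};
    return true;
  case Target::X86AVX512:
    Op = {"compress", 3};
    return true;
  case Target::Generic:
    return false;
  }
  return false;
}
```

Under Option A the differing parameter counts stay inside the backends, at the cost of defining and maintaining a new generic operation. Under Option B no new operation is needed, but every caller in the mid-end has to cope with per-target operand lists.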

Hi, @xbolva00, I didn't see that issue. I'll reply to it if the patch is accepted, thanks!