This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Use available masked vector functions when required
ClosedPublic

Authored by huntergr on Oct 19 2022, 5:34 AM.

Details

Summary

This patch builds on D132458 (use masked vector functions with a dummy mask when no unmasked variant is available) and D134422 (scalarize function calls when masking is required), adding support for actually using the masked functions when masking is required, either due to the control flow in the scalar loop or because we're performing tail folding.

Diff Detail

Event Timeline

huntergr created this revision.Oct 19 2022, 5:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2022, 5:34 AM
huntergr requested review of this revision.Oct 19 2022, 5:34 AM
huntergr updated this revision to Diff 477898.Nov 25 2022, 2:31 AM

Rebased on top of the latest version of D132458, which incorporates the vector function into the vplan recipe instead of looking it up again.

tschuett added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
291
if (Info.isMasked())
  return true;


return false;
tschuett added inline comments.Nov 25 2022, 3:00 AM
llvm/include/llvm/Analysis/VectorUtils.h
291
  if (!VF || Info.Shape.VF == *VF)
    if (Info.isMasked())


return false;
huntergr updated this revision to Diff 477908.Nov 25 2022, 3:50 AM

Tidied up hasMaskedVariant as requested.

huntergr marked 2 inline comments as done.Nov 25 2022, 3:51 AM
Matt added a subscriber: Matt.Feb 2 2023, 1:15 PM
david-arm added inline comments.Feb 22 2023, 3:44 AM
llvm/include/llvm/Analysis/VectorUtils.h
285

I wonder if it's useful to have a VFDatabase::getMaskedMappings variant or something like that? That way we can avoid walking through all the mappings if there aren't going to be any masked variants anyway. I was thinking about something like this:

auto Mappings = VFDatabase::getMaskedMappings(CI);
if (Mappings.emtpy())
  return false;
for (VFInfo Info : Mappings)
  if (!VF || Info.Shape.VF == *VF)
    return true;
return false;

Do you think it's worth it?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
3435

Looks like this is based off an old version of D132458. Could you rebase please?

3469–3471

I think it's worth caching the return value of Legal->isMaskRequired(CI) and reusing here, since it's used 3 times and it saves the compiler having to rewalk the MaskedOp list in LoopVectorizationLegality

8341

Just out of interest why do we need to remove const?

8399

Perhaps it's worth adding some comments here explaining what the two scenarios are, i.e. 1) The blocks needs predication so we have to use the mask for that block, or 2) the block doesn't need predication but we are using a masked variant so need to synthesize an all-true mask.

huntergr updated this revision to Diff 501132.Feb 28 2023, 7:29 AM
huntergr marked 2 inline comments as done.

Rebased, made requested changes.

huntergr marked an inline comment as done.Feb 28 2023, 7:30 AM
huntergr added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
285

getMaskedMappings would just end up iterating over all the variants anyway -- this code doesn't really do any caching, since it's a static method rather than an instance method. So every time we ask for variants we re-decode the mangled names to figure out which functions we should be looking at.

I think LoopVectorize and VectorUtils could be much smarter about this, but that's something for another patch.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
8341

'createBlockInMask' is not a const function and takes a VPlan as an argument, so this can't be const either.

david-arm added inline comments.Mar 24 2023, 7:01 AM
llvm/include/llvm/Analysis/VectorUtils.h
285

Sure, I was just thinking we could avoid walking over the mappings twice like we're doing at the moment. With your patch as it stands we call VFDatabase::getMappings, which walks over all the mappings and creates a set of them, which we then iterate over yet again here. However, if you had a function such as VFDatabase::getMaskedMappings you can at least optimise for the (common) case where there are no masked variants, i.e.

auto Mappings = VFDatabase::getMaskedMappings(CI);
if (Mappings.emtpy())
  return false;

That only requires walking over the mappings once, and if there are masked mappings it also makes the second list walk shorter because you've already filtered out the non-masked variants.

Anyway, I wouldn't hold the patch up for this - I just thought it seemed like it might reduce computational effort.

huntergr added inline comments.Mar 29 2023, 3:29 AM
llvm/include/llvm/Analysis/VectorUtils.h
285

I agree it would be nice, which is why I'm planning another patch to refactor the use of VFDatabase in LV. My basic idea is to establish an initial mapping between callinsts and vector variants during legality checking so that we only look up valid variants for callinsts in LV itself.

I'd then like to create a mapping for CallInst+VF to Cost+WideningDecision, similar to what's done for memory operations now. This should greatly reduce the amount of redundant lookups -- it also looks through VFDatabase each time it requests a cost for any VF.

david-arm accepted this revision.Mar 29 2023, 6:07 AM

LGTM! Thanks for all the good work on this so far @huntergr. :)

This revision is now accepted and ready to land.Mar 29 2023, 6:07 AM
This revision was landed with ongoing or failed builds.Apr 5 2023, 3:19 AM
This revision was automatically updated to reflect the committed changes.