This patch builds on D132458 (use masked vector functions with a dummy mask when no unmasked variant is available) and D134422 (scalarize function calls when masking is required), adding support for actually using the masked functions when masking is required, either due to the control flow in the scalar loop or because we're performing tail folding.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Rebased on top of the latest version of D132458, which incorporates the vector function into the vplan recipe instead of looking it up again.
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
288 | if (Info.isMasked()) return true; return false; |
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
288 | if (!VF || Info.Shape.VF == *VF) if (Info.isMasked()) return false; |
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
282 | I wonder if it's useful to have a VFDatabase::getMaskedMappings variant or something like that? That way we can avoid walking through all the mappings if there aren't going to be any masked variants anyway. I was thinking about something like this: auto Mappings = VFDatabase::getMaskedMappings(CI); if (Mappings.emtpy()) return false; for (VFInfo Info : Mappings) if (!VF || Info.Shape.VF == *VF) return true; return false; Do you think it's worth it? | |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
3457 | Looks like this is based off an old version of D132458. Could you rebase please? | |
3492–3494 | I think it's worth caching the return value of Legal->isMaskRequired(CI) and reusing here, since it's used 3 times and it saves the compiler having to rewalk the MaskedOp list in LoopVectorizationLegality | |
8366 | Just out of interest why do we need to remove const? | |
8434 | Perhaps it's worth adding some comments here explaining what the two scenarios are, i.e. 1) The blocks needs predication so we have to use the mask for that block, or 2) the block doesn't need predication but we are using a masked variant so need to synthesize an all-true mask. |
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
282 | getMaskedMappings would just end up iterating over all the variants anyway -- this code doesn't really do any caching, since it's a static method rather than an instance method. So every time we ask for variants we re-decode the mangled names to figure out which functions we should be looking at. I think LoopVectorize and VectorUtils could be much smarter about this, but that's something for another patch. | |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
8366 | 'createBlockInMask' is not a const function and takes a VPlan as an argument, so this can't be const either. |
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
282 | Sure, I was just thinking we could avoid walking over the mappings twice like we're doing at the moment. With your patch as it stands we call VFDatabase::getMappings, which walks over all the mappings and creates a set of them, which we then iterate over yet again here. However, if you had a function such as VFDatabase::getMaskedMappings you can at least optimise for the (common) case where there are no masked variants, i.e. auto Mappings = VFDatabase::getMaskedMappings(CI); if (Mappings.emtpy()) return false; That only requires walking over the mappings once, and if there are masked mappings it also makes the second list walk shorter because you've already filtered out the non-masked variants. Anyway, I wouldn't hold the patch up for this - I just thought it seemed like it might reduce computational effort. |
llvm/include/llvm/Analysis/VectorUtils.h | ||
---|---|---|
282 | I agree it would be nice, which is why I'm planning another patch to refactor the use of VFDatabase in LV. My basic idea is to establish an initial mapping between callinsts and vector variants during legality checking so that we only look up valid variants for callinsts in LV itself. I'd then like to create a mapping for CallInst+VF to Cost+WideningDecision, similar to what's done for memory operations now. This should greatly reduce the amount of redundant lookups -- it also looks through VFDatabase each time it requests a cost for any VF. |
I wonder if it's useful to have a VFDatabase::getMaskedMappings variant or something like that? That way we can avoid walking through all the mappings if there aren't going to be any masked variants anyway. I was thinking about something like this:
Do you think it's worth it?