Accord the discussion in D122281, we missing an ISD::AND combine for MLOAD
because it relies on BuildVectorSDNode is fails for scalable vectors.
This patch is intend to handle that, so we can circle back the type MVT::nxv2i32
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14425–14426 | The masked load's MemVT is not sufficient as you must also consider the load's extension type.
How far down this path you go depends on the exact use case you need. If it's just (1) then I think this can be a common DAGCombine. For (2) and (3) it'll depend on whether there are sufficient TLI hooks to ensure you don't create a version a target cannot lower. That said, given you require the input to be an extending masked load perhaps that is not a problem. One final issue is the handling of the masked load's pass through value. With the AND in place this will be zero-extended but once removed that stops and thus that will need to be considered as well. My guess is that for today's use case you want to restrict the combine to instances where the pass through value is either undef or zero. |
With some debug with function DAGCombiner::visitAND, I find there is another issue need to check:
it use TLI.isLoadExtLegal(ISD::ZEXTLOAD, ExtVT, LoadVT) to check whether the following transform can be done, do we need similar check in the AArch64 backend? If yes, I think we must set MVT::nxv2i32 legal.
fold (and (masked_load) (build_vec (x, ...))) to zext_masked_load
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14425–14426 | Thanks @paulwalker-arm for detail comment. |
Not sure I understand what you mean here. ExtVT will be the legal type, most likely MVT::nxv2i64, whereas MVT::nxv2i32 will be the MemVT that doesn't need to be legal, so I don't see an issue? Looking at DAGCombiner::visitAND it looks like it already does a related transformation but because it relies on BuildVectorSDNode is fails for scalable vectors. To me it looks like we can make this common code more portable and support all vector types.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14425–14426 | The need to check the extension type depends on which of the options you go with. If you only implement (1) then you need to check the extension type is explicitly zero-extending. Options (1) & (2) require you to skip sign-extending cases and for (3) you'll still need a use count check for the sign-extending case. |
I rewrite the MR with above review with support splat_vec in DAGCombiner::visitAND, thanks @paulwalker-arm very much for detail advice.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
6070 | If you follow the advice below I'd also remove the build_vec comment as the splat_vec one is meaning full enough in itself. | |
6072 | You could simplify the code by extract the splat here, i.e. ConstantSDNode *Splat = isConstOrConstSplat(N1, true, true) That would remove the separate buildvector and splatvector checks and also remove a level of indentation in the TLI.isLoadExtLegal block. |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
6070 | Sorry I think I wasn't clear enough with my previous comment. I meant that it is not worth having both the original build_vec and your new splat_vec comment and so it would be better to just keep the splat_vec one (i.e. fold (and (masked_load) (splat_vec (x, ...))) to zext_masked_load) |
Sorry I think I wasn't clear enough with my previous comment. I meant that it is not worth having both the original build_vec and your new splat_vec comment and so it would be better to just keep the splat_vec one (i.e. fold (and (masked_load) (splat_vec (x, ...))) to zext_masked_load)