User Details
- User Since
- Jul 10 2019, 8:51 AM (80 w, 3 d)
Fri, Jan 22
Wed, Jan 13
Tue, Jan 12
Mon, Jan 11
Fri, Jan 8
Removed the isVectorUnpack helper added in the previous revision. If the index values are already extended to i64 by an unpkhi/lo, then the gather does not also need to extend the index.
This affects the masked_gather_nxv4f64 test, which has been updated as follows:
Thu, Jan 7
- Added a new helper function, isVectorUnpack
- Added tests which load <vscale x 4 x half> & <vscale x 2 x float>
Wed, Jan 6
Dec 18 2020
Dec 17 2020
Dec 16 2020
- Improve codegen where the splat value is a constant, but out of range for the immediate addressing mode, e.g.
    mov x8, xzr
    add z1.d, z1.d, #32 // =0x20
    st1b { z0.d }, p0, [x8, z1.d]
    ret
->
    mov w8, #32
    st1b { z0.d }, p0, [x8, z1.d]
    ret
- Refactored selectGatherScatterAddrMode based on the suggestions from @sdesmalen
- Added bfloat tests to the new test files added by this patch
- Removed unused %offset from sve-masked-gather.ll and removed duplicate tests from sve-masked-scatter-legalise.ll
- Added patterns for bfloat16 extract_subvector to AArch64SVEInstrInfo.td
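The out-of-range constant splat case in the first item of this list can be illustrated with a small range check. This is a minimal sketch, not LLVM's code: `fitsVecPlusImm` is a hypothetical helper, and it assumes that the SVE vector-plus-immediate form accepts an offset that is a multiple of the element size in bytes, from 0 up to 31 times that size. Under that assumption, a byte store with offset 32 does not fit, which is why the constant ends up in the scalar base register instead.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM function): returns true if a constant
// splat offset could be encoded as the immediate of an SVE gather/scatter
// in vector-plus-immediate form. Assumption: the immediate must be a
// multiple of the element size in bytes, in the range [0, 31 * size].
constexpr bool fitsVecPlusImm(std::int64_t offset, std::int64_t elemBytes) {
  return elemBytes > 0 && offset >= 0 && offset % elemBytes == 0 &&
         offset / elemBytes <= 31;
}

// A byte access with offset 32 is out of range, so the offset is
// materialised in a scalar register (mov w8, #32) instead.
static_assert(!fitsVecPlusImm(32, 1), "32 is out of range for byte accesses");
// The largest byte offset that fits under the stated assumption.
static_assert(fitsVecPlusImm(31, 1), "31 is in range for byte accesses");
```

A check like this would only decide between the two addressing forms; it says nothing about how the surrounding selection code is structured.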
Dec 15 2020
Dec 14 2020
Thanks @cameron.mcinally & @paulwalker-arm for reviewing this patch!
Dec 11 2020
- Reordered the condition added to LowerReductionToSVE which sets RdxVT
Dec 10 2020
Dec 9 2020
Dec 7 2020
- Renamed ResNeedsExtend -> ResNeedsSignExtend
- Added a test to sve-masked-gather-legalize.ll for a zero-extended gather load with multiple uses
Dec 3 2020
Dec 2 2020
- Moved changes to set the default IndexType to SIGNED_UNSCALED to D91092
- Removed any unnecessary sign/zero extensions of the Index in LowerMGATHER
- Removed Index = Index.getOperand(0) from LowerMGATHER, which was incorrectly removing the sign/zero extend of Index if getGatherScatterIndexIsExtended returns true
Nov 30 2020
- Changed the default IndexType set by visitMaskedGather to ISD::SIGNED_UNSCALED
- Reordered this to depend on D91092, as this more clearly demonstrates the benefit of the changes in this patch to the sve-masked-gather* tests
- I mistakenly created this patch to depend on D92319; this revision removes that dependency
- Moved the addition of the -aarch64-enable-mgather-combine option to D91092 in order to more clearly demonstrate the value added by the combines in this patch.
Nov 27 2020
Added an option to disable the existing combines in AArch64ISelLowering for s/zext_masked_gathers (performSignExtendInRegCombine & performSVEAndCombine)
Hi @sdesmalen, I've updated the series as suggested and the patches are now in the following order:
D91092 - Lower scalable masked scatters (with references to ExtensionType in LowerMGATHER removed)
D91084 - Add the ExtensionType flag to MGATHER
D92230 - DAG combines for z/sext of a masked gather, adding ExtensionType back into LowerMGATHER
Removed isConstantSplatVectorMaskForType() & the DAG combines for s/zext_masked_gather (these will be added in a follow-up patch)
Reordering the scalable masked gather patch series that this is a part of, as suggested on D91084. This is now the first patch in the series.
Nov 26 2020
Nov 25 2020
- Replaced calls to TLI.isLoadExtLegal with TLI.isVectorLoadExtDesirable in the masked gather combines, which also checks the type of load operation being used
- Removed unnecessary !LegalOperations from the zext_masked_gather combine and added a check for hasOneUse(), matching the sext_masked_gather combine
Nov 23 2020
Nov 13 2020
- Removed the EltSize < SplatVal.getBitWidth() check from isConstantSplatVector and used truncOrSelf instead
Nov 12 2020
- Added the getExtendedGatherOpcode() function, which returns a signed gather load opcode (e.g. GLD1_MERGE_ZERO -> GLD1S_MERGE_ZERO)
- Get the extension type of the gather load in LowerMGATHER and use the signed gather opcode returned by getExtendedGatherOpcode() if the extension type is EXTLOAD/SEXTLOAD
Added DAG combines for the following:
- fold (and (masked_gather x)) -> (zext_masked_gather x)
- fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x)
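The two folds above rest on simple per-lane identities: masking the loaded value is the same as a zero-extending load, and a sext_inreg of the loaded value is the same as a sign-extending load. The sketch below models a single 32-bit lane holding an 8-bit element using plain integers. All function names are hypothetical and none of them are LLVM APIs; the lane and element widths are assumptions chosen for illustration.

```cpp
#include <cstdint>

// A zero-extending load clears the upper 24 bits of the lane.
constexpr std::uint32_t zextLoad8(std::uint8_t mem) { return mem; }

// A sign-extending load replicates bit 7 into the upper 24 bits.
constexpr std::uint32_t sextLoad8(std::uint8_t mem) {
  return (mem & 0x80u) ? (0xFFFFFF00u | mem) : mem;
}

// An any-extending load leaves the upper bits unspecified; model them
// with an arbitrary caller-supplied bit pattern.
constexpr std::uint32_t anyExtLoad8(std::uint8_t mem, std::uint32_t junk) {
  return (junk & 0xFFFFFF00u) | mem;
}

// (and (masked_gather x), 0xFF) for one lane.
constexpr std::uint32_t andMask8(std::uint32_t lane) { return lane & 0xFFu; }

// (sext_inreg (masked_gather x), i8) for one lane.
constexpr std::uint32_t sextInReg8(std::uint32_t lane) {
  return (lane & 0x80u) ? (lane | 0xFFFFFF00u) : (lane & 0xFFu);
}

// Whatever the upper bits were, and + load equals a zero-extending load.
static_assert(andMask8(anyExtLoad8(0x80, 0xDEADBEEFu)) == zextLoad8(0x80),
              "and of a gather matches a zero-extending gather");
// Likewise, sext_inreg + load equals a sign-extending load.
static_assert(sextInReg8(anyExtLoad8(0x80, 0xDEADBEEFu)) == sextLoad8(0x80),
              "sext_inreg of a gather matches a sign-extending gather");
```

Because the result does not depend on the unspecified upper bits, the separate and/sext_inreg node can be absorbed into the load itself.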
Nov 11 2020
Nov 10 2020
@craig.topper After changing the refineUniformBase function in D90942 to use getSplatVector() when trying to get the base pointer from a splat, the output of one of the X86 masked gather tests in masked_gather_scatter.ll has changed ("test14", where the base pointer is not a splat, so getUniformBase returns false).
Could you please take a look over this to make sure the changes are correct in this circumstance?
- Moved changes to SplitVecRes_MGATHER & SplitVecOp_MGATHER into this patch from D91092
- Replaced check for ISD::SPLAT_VECTOR opcode in refineUniformBase with SelectionDAG::getSplatVector()
- Applied changes suggested by clang-format
- Moved SplitVecOp_MSCATTER changes to the parent patch, D90939
- Applied changes suggested by clang-format
- Made IsTrunc the last parameter to the MaskedScatterSDNode constructor
- Moved changes to SplitVecOp_MSCATTER, which split MemVT before creating the 'Hi' & 'Lo' scatters, into this patch from D90941
Nov 9 2020
Nov 6 2020
Abandoning this revision as it has been split into several smaller patches:
Nov 4 2020
Thanks for reviewing this, @paulwalker-arm!
Nov 3 2020
- Changed isLegalMaskedGSIndexType to return true if the instruction can perform the sign extend (i.e. if the index is extended from i32 and the number of elements is at least 4)
- Added getCanonicalIndexType to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes)
- Added a target-specific DAG combine for mscatter to promote indices smaller than i32
- Added various tests for type legalisation in sve-masked-scatter-legalise.ll
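One reading of the getCanonicalIndexType item above: when each element is a single byte, a scaled index and an unscaled index address the same memory, so the scaled forms can be rewritten as unscaled. The sketch below models this with a local enum whose enumerators mirror the SIGNED_UNSCALED naming used elsewhere in this log. The enum, `canonicalIndexType`, `effectiveOffset` and their signatures are assumptions for illustration, not the actual LLVM implementation.

```cpp
// Hypothetical stand-in for the index types discussed in this log.
enum class IndexType {
  SignedScaled,
  SignedUnscaled,
  UnsignedScaled,
  UnsignedUnscaled
};

// Scaling by an element size of one byte is a no-op, so map the scaled
// forms to their unscaled equivalents; leave everything else unchanged.
constexpr IndexType canonicalIndexType(IndexType type, unsigned elemBytes) {
  return elemBytes != 1                       ? type
         : type == IndexType::SignedScaled    ? IndexType::SignedUnscaled
         : type == IndexType::UnsignedScaled  ? IndexType::UnsignedUnscaled
                                              : type;
}

// Byte offset from the base that a single index value refers to.
constexpr long long effectiveOffset(long long index, IndexType type,
                                    unsigned elemBytes) {
  return (type == IndexType::SignedScaled || type == IndexType::UnsignedScaled)
             ? index * elemBytes
             : index;
}

// For byte elements, the canonical form addresses the same location.
static_assert(canonicalIndexType(IndexType::SignedScaled, 1) ==
                  IndexType::SignedUnscaled,
              "scaled byte index becomes unscaled");
static_assert(effectiveOffset(5, IndexType::SignedScaled, 1) ==
                  effectiveOffset(5, IndexType::SignedUnscaled, 1),
              "same address either way");
```

Reducing the number of distinct addressing modes this way means later matching code has fewer equivalent cases to handle.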
Oct 28 2020
Changes to LowerPredReductionToSVE:
- Fixed an if statement which should have been using ||
- Removed unnecessary And when lowering VECREDUCE_XOR
- Wrapped case blocks in {} where necessary
Oct 26 2020
- Changed fix for the warning in computeKnownBits for extract_vector_elt to match D87651
- Moved check for i1 types in LowerVECREDUCE outside of the switch statement
- Added isLegalMaskedGSIndexType to query if the index type is legal for masked scatters (returning true for nxv2i32, nxv4i32 & nxv2i64 on AArch64)
Oct 21 2020
- Changed getNode() to check if the operand type of vecreduce min/max is i1 instead of the result type
- Fixed a mistake with the changes to getNode() in the previous patch, where the transformations of [s|u]min & [s|u]max would also apply to other operations
- Moved the transformations of i1 [s|u]min & [s|u]max -> and/or to SelectionDAG::getNode()
- Removed custom lowering of vecreduce_[s|u]min & vecreduce_[s|u]max for predicate types
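The i1 min/max transformations above follow from how a one-bit value is interpreted: as a signed number, true is -1 and false is 0, while as an unsigned number, true is 1. The log does not spell out which operation maps to which, so the mapping in this sketch is derived from that arithmetic: smin and umax behave like or, while smax and umin behave like and. The helper names are hypothetical and are not LLVM functions.

```cpp
// Signed view of an i1: true is -1, false is 0.
constexpr int asSigned(bool b) { return b ? -1 : 0; }
// Unsigned view of an i1: true is 1, false is 0.
constexpr unsigned asUnsigned(bool b) { return b ? 1u : 0u; }

constexpr bool smin1(bool a, bool b) {
  return (asSigned(a) < asSigned(b) ? asSigned(a) : asSigned(b)) != 0;
}
constexpr bool smax1(bool a, bool b) {
  return (asSigned(a) > asSigned(b) ? asSigned(a) : asSigned(b)) != 0;
}
constexpr bool umin1(bool a, bool b) {
  return (asUnsigned(a) < asUnsigned(b) ? asUnsigned(a) : asUnsigned(b)) != 0;
}
constexpr bool umax1(bool a, bool b) {
  return (asUnsigned(a) > asUnsigned(b) ? asUnsigned(a) : asUnsigned(b)) != 0;
}

// Checks one input pair against the expected and/or equivalents.
constexpr bool matchesAndOr(bool a, bool b) {
  return smin1(a, b) == (a || b) && umax1(a, b) == (a || b) &&
         smax1(a, b) == (a && b) && umin1(a, b) == (a && b);
}

// Exhaustive over the four possible input pairs.
static_assert(matchesAndOr(false, false) && matchesAndOr(false, true) &&
                  matchesAndOr(true, false) && matchesAndOr(true, true),
              "i1 min/max reduce to and/or");
```

Since the equivalence holds for every input pair, the rewrite can be applied unconditionally for i1 operands, which is consistent with moving it into a general node-construction path rather than keeping a target-specific lowering.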