This patch extends existing IR combines for: fmul, fsub and fadd, 
relying on all active predicate to also apply to their equivalent
undef (_u) intrinsics.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
| llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
|---|---|---|
| 1702 | Hi Jolanta, if (auto Replacement = instCombineSVEVectorBinOp(IC, II)) return Replacement; return instCombineSVEVectorFuseMulAddSub would be more readable here. | |
This patch for now covers only fadd/fadd_u, fsub/fsub_u and fmul/fmul_u, what about other intrinsics which are having both merging and _u variants I thought we want to replace all merging forms with _u forms when the predicate is all true ? @paulwalker-arm what is your opinion here?
| llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fmul_u-idempotency.ll | ||
|---|---|---|
| 3 | the test in this file do not match the description of the commit message, I think that is an optimization we want to cover separately, @paulwalker-arm wjhat is your opinion? | |
Implemented IR combines to convert _m intrinsics that take an all active predicate to their equivalent _u intrinsic.
This patch is concerned with extending existing combines rather than adding new ones. I do agree the summary should be better worded to reflect this.
| llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
|---|---|---|
| 1289–1297 | Is it possible to maintain the original structure of this function by just adding the new instCombineSVEVectorFuseMulAddSub entry for aarch64_sve_fmul_u? e.g. if (auto FMLA_U =
        instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul_u,
                                          Intrinsic::aarch64_sve_fmla_u>(
            IC, II, true))
  return FMLA_U;
return instCombineSVEVectorBinOp(IC, II); | |
| 1315–1323 | Same comment as above relating to maintain the function original structure. | |
| 1326 | I'd rather the instCombineSVEAllActive2VA and instCombineSVEAllActive3VA changes be rolled into a separate patch. That way we have this patch focusing on the existing combines that are being ported to include the newer intrinsics and then a following patch to add new combines for everything else. | |
| llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fma-binops.ll | ||
| 199 | Do you mind moving this negative test so it appears after the three positive ones? | |
| llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fmul_u-idempotency.ll | ||
| 3 | @mgabka: This patch is about extending existing combines to also cover their equivalent _u intrinsics. The majority relate to replacing intrinsic calls with IR instructions but instCombineSVEVectorMul contains more combines. I think it would be messier to try to artificially avoid these combines so am happy for this patch to include them, assuming they're genuinely applicable? with this patch adding the relevant tests. | |
| llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
|---|---|---|
| 1289–1297 | Yes, there was no fallout from the move. It looks much better now. | |
| 1315–1323 | Fixed. | |
| 1326 | Removed IR combines from this patch. | |
| llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fmul_u-idempotency.ll | ||
| 3 | It looks like this test and sve-intrinsic-fma-binops.ll cover all paths taken in instCombineSVEVectorMul. | |
| llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fma-binops.ll | ||
|---|---|---|
| 199 | Fixed. | |
Is it possible to maintain the original structure of this function by just adding the new instCombineSVEVectorFuseMulAddSub entry for aarch64_sve_fmul_u? e.g.
if (auto FMLA_U = instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul_u, Intrinsic::aarch64_sve_fmla_u>( IC, II, true)) return FMLA_U; return instCombineSVEVectorBinOp(IC, II);