This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][DAGCombiner]: combine <2xi64> mul add/sub.
ClosedPublic

Authored by hassnaa-arm on Apr 12 2023, 4:57 AM.

Details

Summary

64-bit vector mul is not supported in NEON,
so we use the SVE's mul.
To improve the performance, we can go one step further,
and use SVE's add/sub, so that we can use SVE's mla/mls.
That works on these patterns:
This works on the patterns of:
add v1, (mul v2, v3)
// sub v1, (mul v2, v3)

Diff Detail

Event Timeline

hassnaa-arm created this revision.Apr 12 2023, 4:57 AM
hassnaa-arm requested review of this revision.Apr 12 2023, 4:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 4:57 AM
david-arm accepted this revision.Apr 12 2023, 9:17 AM

LGTM! Can you fix the formatting issue before landing the patch please? Thanks!

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
17846

nit: I think there is a formatting issue here and should be if (N->getOpcode() == ISD::SUB)

This revision is now accepted and ready to land.Apr 12 2023, 9:17 AM

Hi @paulwalker-arm Are you okay with landing this patch ?

Hi @paulwalker-arm Are you okay with landing this patch ?

Hi @hassnaa-arm , this dropped off my radar since Dave already accepted the updated patch. I'll try to take a look later today but please don't feel you have to wait for me before landing the patch.

paulwalker-arm added inline comments.Apr 19 2023, 5:07 PM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
17836

Given the MUL_PRED node you care about is the result of operation legalisation it's perhaps worth having a !isBeforeLegalizeOps() or isAfterLegalizeDAG() check so the combine costs less compile time during the earlier code generation phases. It'll also mean you can be sure the types are legal just in case we get here via some weird route.

17845–17855

I think you'll cover more cases if you check N->getOperand(1) first. For example, consider the case when you have sub(extract(), extract(mul_pred())), with the current order (i.e. checking Op0 first) the combine will not fire?

I guess there's an argument it would be even better for the function to take the operands as parameters (or implement a lambda function) so you'd have something like:

if (res = performOpt(Opcode, Op0, Op1)
  return res;
if (Opcode == ISD::ADD)
  if (res = performOpt(Opcode, Op1, Op0)
    return res;

This way you'll cover the maximum number of combinations.

I'll leave the decision whether it's worth covering all cases to you but as a minimum I'd switch to check Op1 first as mention above.

hassnaa-arm marked 3 inline comments as done.Apr 19 2023, 11:57 PM

Enhance code readability.

david-arm added inline comments.Apr 20 2023, 6:37 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
17874

Sorry to be a pain @hassnaa-arm, but this generates warnings about missing brackets () around the res = performOpt... code. I think you can silence the warnings by rewriting this as:

if (SDValue res = performOpt(N->getOperand(0), N->getOperand(1)))
  return res;
else if (N->getOpcode() == ISD::ADD)
  return performOpt(N->getOperand(1), N->getOperand(0));

return SDValue();
hassnaa-arm marked an inline comment as done.Apr 20 2023, 10:23 PM

Fix warning.

paulwalker-arm accepted this revision.Apr 21 2023, 6:51 AM

Looks good to me but please add a couple of tests to show the new operand flexibility before landing the patch. For example:

define <2 x i64> @test(<2 x i64> %a, <2 x i64> %b, <2 x i64> %c, <2 x i64> %d) {
  %mul1 = mul <2 x i64> %a, %b
  %mul2= mul <2 x i64> %c, %d
  %sub = sub <2 x i64> %mul, %mul2
  ret <2 x i64> %sub
}

Add extra test cases that have 'extract_subvector' in both operands.

This revision was landed with ongoing or failed builds.Apr 24 2023, 4:55 AM
This revision was automatically updated to reflect the committed changes.