This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add patterns for add(udot(0, x, y), z) -> udot(z, x, y).
ClosedPublic

Authored by dmgreen on Feb 22 2021, 5:05 AM.

Details

Summary

When the accumulator input of a udot is zero, an add of the udot's result can be folded into the udot, with the add's other operand taking the place of the zero accumulator.
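The identity behind the fold can be checked with a small reference model. This is only a sketch in Python, not code from the patch: `udot` and `vadd` are hypothetical helpers that assume four 32-bit accumulator lanes, sixteen unsigned byte inputs, and wrap-around arithmetic modulo 2^32.

```python
MASK32 = 0xFFFFFFFF  # assumption: each accumulator lane is an unsigned 32-bit integer


def udot(acc, x, y):
    """Hypothetical model of an unsigned dot product.

    acc: list of 32-bit lanes; x, y: lists of unsigned bytes, four per lane.
    Each lane accumulates the products of its own four byte pairs.
    """
    assert len(x) == len(y) == 4 * len(acc)
    out = []
    for lane, a in enumerate(acc):
        s = sum(x[4 * lane + k] * y[4 * lane + k] for k in range(4))
        out.append((a + s) & MASK32)
    return out


def vadd(a, b):
    """Lane-wise 32-bit add with wrap-around."""
    return [(p + q) & MASK32 for p, q in zip(a, b)]


x = list(range(16))
y = [255] * 16
z = [7, 0xFFFFFFFF, 123456, 0]
zero = [0, 0, 0, 0]

unfolded = vadd(udot(zero, x, y), z)  # add(udot(0, x, y), z)
folded = udot(z, x, y)                # udot(z, x, y)
print(unfolded == folded)
```

Because both forms only add the same terms modulo 2^32, the equality also holds when a lane wraps around, as the second lane of `z` does here.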

Event Timeline

dmgreen created this revision. Feb 22 2021, 5:05 AM
dmgreen requested review of this revision. Feb 22 2021, 5:05 AM
Herald added a project: Restricted Project. Feb 22 2021, 5:05 AM
SjoerdMeijer added inline comments. Feb 22 2021, 5:33 AM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
5626 (On Diff #325419)

As you know, I don't mind nice and concise little patterns, but I was wondering whether we would expect this simplification to happen earlier?

fhahn added inline comments. Feb 22 2021, 1:26 PM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
5626 (On Diff #325419)

Not sure what the exact policy is, but InstCombinerImpl::visitCallInst does optimize some target-specific intrinsics. But I think this would be good to have for instruction selection in any case.

dmgreen added inline comments. Feb 23 2021, 4:13 AM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
5626 (On Diff #325419)

Oh, you mean pre-ISel? We lower a vecreduce.add(v16i8 x) to a vecreduce(udot(zero, one, x)), so this needs to be done sometime during ISel lowering at least. I'll add some tests for it.

I can make it into a DAG combine. That should capture more cases without extra patterns, and should be simple enough I think.
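The vecreduce lowering mentioned above can be illustrated with the same kind of reference model. This is a sketch under stated assumptions rather than the actual lowering code: `udot` is a hypothetical Python helper, and the byte elements are assumed to be zero-extended to 32 bits before being summed.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit unsigned accumulator lanes


def udot(acc, x, y):
    """Hypothetical model: each 32-bit lane adds the products of four byte pairs."""
    out = []
    for lane, a in enumerate(acc):
        s = sum(x[4 * lane + k] * y[4 * lane + k] for k in range(4))
        out.append((a + s) & MASK32)
    return out


x = [200, 13, 255, 0, 1, 2, 3, 4, 250, 250, 250, 250, 9, 8, 7, 6]
ones = [1] * 16
zero = [0, 0, 0, 0]

# vecreduce.add over the sixteen (zero-extended) bytes
direct = sum(x) & MASK32

# vecreduce.add over the four lanes of udot(zero, one, x)
via_udot = sum(udot(zero, ones, x)) & MASK32

print(direct, via_udot)
```

Multiplying by one leaves each byte unchanged, so the udot only groups the sixteen additions into four partial sums. The zero accumulator it introduces is the operand that a later add can replace.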

dmgreen updated this revision to Diff 325739. Feb 23 2021, 5:44 AM

Convert to a DAGCombine, with some vecreduce tests.
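As a rough illustration of why a combine can cover more cases than a fixed selection pattern, here is a toy rewrite over a tuple-based expression tree. It is a Python sketch, not LLVM's SelectionDAG API: the node encoding and the names `is_zero` and `combine_add` are made up for this example, and the real combine may check conditions that are not modelled here.

```python
def is_zero(node):
    """True for the toy all-zeros vector node."""
    return node == ("zero",)


def combine_add(node):
    """Rewrite add(udot(0, x, y), z) or add(z, udot(0, x, y)) to udot(z, x, y).

    Nodes are tuples: ("add", a, b), ("udot", acc, x, y), ("zero",), ("var", name).
    Any node that does not match is returned unchanged.
    """
    if node[0] != "add":
        return node
    lhs, rhs = node[1], node[2]
    # Try both operand orders, since add is commutative.
    for dot, other in ((lhs, rhs), (rhs, lhs)):
        if dot[0] == "udot" and is_zero(dot[1]):
            return ("udot", other, dot[2], dot[3])
    return node


x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
dot0 = ("udot", ("zero",), x, y)

print(combine_add(("add", dot0, z)))
print(combine_add(("add", z, dot0)))
```

Both operand orders are handled by one piece of logic here, whereas pattern-based matching would typically need each form spelled out.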

fhahn accepted this revision. Feb 23 2021, 8:16 AM

LGTM, thanks.

This revision is now accepted and ready to land. Feb 23 2021, 8:16 AM