This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Adjust costs of i1 and/or/xor reductions
ClosedPublic

Authored by dmgreen on May 23 2023, 12:48 AM.

Details

Summary

This expands the reduction cost for i1 and/or/xor so that larger type sizes are handled by the existing code. For i1 reductions, an and reduction will use maxv, an or reduction will use minv, and an xor reduction will use addv, plus the cost of legalizing the type for larger vectors using and/or/xor. The i1 vectors will be legalized to wider integer types (say v16i8), whose cost this patch overrides. As with all i1 vectors, there is a chance that the type the i1 vector is created with and the way it is used will not match, introducing extra extends that are not necessarily cost-modelled.
https://godbolt.org/z/6Gc9K6b7T
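
The mapping described in the summary can be sanity-checked with a small model. The following Python sketch is illustrative only and is not taken from the patch. It assumes each i1 lane has been sign-extended to 0 or -1 and that the max/min are signed; the summary states neither, and all function names are hypothetical.

```python
from functools import reduce
from itertools import product
import operator


def to_lanes(bits):
    """Assumed encoding: sign-extend each i1, so True -> -1 and False -> 0."""
    return [-1 if b else 0 for b in bits]


def reduce_and(bits):
    # Every lane is true exactly when the signed maximum is still -1.
    return max(to_lanes(bits)) == -1


def reduce_or(bits):
    # Some lane is true exactly when the signed minimum reaches -1.
    return min(to_lanes(bits)) == -1


def reduce_xor(bits):
    # The low bit of the lane sum is the parity of the number of true lanes.
    return (sum(to_lanes(bits)) & 1) == 1


def check_all(width):
    """Exhaustively compare the models with plain boolean reductions."""
    for bits in product([False, True], repeat=width):
        assert reduce_and(bits) == all(bits)
        assert reduce_or(bits) == any(bits)
        assert reduce_xor(bits) == reduce(operator.xor, bits, False)
    return True
```

Calling check_all(8) walks all 256 combinations of an 8-lane i1 vector. With an unsigned interpretation of the lanes the roles of max and min would swap, which is why the signedness assumption matters here.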

Event Timeline

dmgreen created this revision. May 23 2023, 12:48 AM
Herald added a project: Restricted Project. May 23 2023, 12:48 AM
dmgreen requested review of this revision. May 23 2023, 12:48 AM
david-arm added inline comments.
llvm/test/Analysis/CostModel/AArch64/reduce-xor.ll (line 20)

Interestingly, if SVE is available we can also do much better for xor reductions of types like v16i8, v8i16, etc. For a v8i16 xor reduction we can just do:

ptrue p0.h, vl8
eorv h0, p0, z0.h
fmov w0, s0

whereas I see we currently do:

ext     v1.16b, v0.16b, v0.16b, #8
eor     v0.8b, v0.8b, v1.8b
fmov    x8, d0
eor     x8, x8, x8, lsr #32
lsr     x9, x8, #16
eor     w0, w8, w9
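
To see why the six-instruction sequence above computes the reduction, here is a rough Python model of it. It is a sketch, not compiler output: it assumes the eight 16-bit lanes are packed little-endian into a 128-bit value and that the caller only reads the low 16 bits of w0. The function names are hypothetical.

```python
from functools import reduce
import operator

MASK64 = (1 << 64) - 1


def xor_reduce_v8i16_fold(lanes):
    """Model the ext/eor/fmov/eor/lsr/eor sequence on eight 16-bit lanes."""
    assert len(lanes) == 8
    # Pack the lanes into one 128-bit value, lane 0 in the lowest bits.
    v = 0
    for i, lane in enumerate(lanes):
        v |= (lane & 0xFFFF) << (16 * i)
    lo = v & MASK64
    hi = v >> 64
    d0 = lo ^ hi                   # ext #8 + eor v0.8b: fold the upper half onto the lower
    x8 = d0                        # fmov x8, d0
    x8 ^= x8 >> 32                 # eor x8, x8, x8, lsr #32
    x9 = x8 >> 16                  # lsr x9, x8, #16
    w0 = (x8 ^ x9) & 0xFFFFFFFF    # eor w0, w8, w9 (32-bit registers)
    return w0 & 0xFFFF             # only the low 16 bits carry the result


def xor_reduce_reference(lanes):
    """Straightforward xor over all lanes, for comparison."""
    return reduce(operator.xor, (lane & 0xFFFF for lane in lanes), 0)
```

Each step halves the number of live lanes (8, 4, 2, 1), which is the log2 folding pattern that the single eorv instruction replaces in the SVE version.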

ping

llvm/test/Analysis/CostModel/AArch64/reduce-xor.ll (line 20)

OK cool.

samtebbs accepted this revision. May 31 2023, 8:54 AM
This revision is now accepted and ready to land. May 31 2023, 8:54 AM
This revision was landed with ongoing or failed builds. Jun 1 2023, 1:28 AM
This revision was automatically updated to reflect the committed changes.