This is an archive of the discontinued LLVM Phabricator instance.

[ARM,MVE] Add intrinsics for the VQDMLAD family.
ClosedPublic

Authored by simon_tatham on Mar 18 2020, 6:31 AM.

Details

Summary

This is another set of instructions too complicated to be sensibly
expressed in IR by anything short of a target-specific intrinsic.
Given input vectors a,b, the instruction generates intermediate values
2*(a[0]*b[0]+a[1]*b[1]), 2*(a[2]*b[2]+a[3]*b[3]), etc; takes the high
half of each double-width value; and overwrites half the lanes in the
output vector c, whose input value you therefore have to provide.
Optionally you can swap the elements of b so that the sums are things
like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when
taking the high half; and optionally you can take the difference
rather than sum of the two products. Finally, saturation is applied
when converting back to a single-width vector lane.
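
As an illustration only, the arithmetic described above can be sketched
as a scalar reference model in Python. This is not code from the patch:
the function name vqdmladh_model and its parameters are hypothetical.
The sketch also assumes details the summary does not state, namely that
the non-swapped form writes the even lane of each pair and the swapped
form the odd lane, that the difference is first product minus second,
and that rounding adds half of the discarded low half before the shift.

```python
def saturate(value, bits):
    """Clamp value to the signed range of a lane that is `bits` wide."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def vqdmladh_model(c, a, b, bits=16, exchange=False, rounding=False,
                   subtract=False):
    """Hypothetical scalar model of the VQDMLAD family (a sketch only).

    c, a, b are equal-length lists of signed lane values; c supplies the
    lanes that are left untouched in the result.
    """
    if not (len(a) == len(b) == len(c)) or len(a) % 2 != 0:
        raise ValueError("need equal-length vectors with an even lane count")

    out = list(c)
    for i in range(0, len(a), 2):
        if exchange:
            # Swapped form: a[0]*b[1] and a[1]*b[0].
            first, second = a[i] * b[i + 1], a[i + 1] * b[i]
        else:
            first, second = a[i] * b[i], a[i + 1] * b[i + 1]

        # Assumption: the difference is first product minus second.
        combined = first - second if subtract else first + second
        doubled = 2 * combined

        # Assumption: round to nearest by adding half of the low half.
        if rounding:
            doubled += 1 << (bits - 1)

        # Take the high half (arithmetic shift), then saturate.
        high = doubled >> bits

        # Assumption: even lane for the plain form, odd lane when swapped.
        lane = i + 1 if exchange else i
        out[lane] = saturate(high, bits)
    return out


if __name__ == "__main__":
    a = [16384, 16384]
    b = [16384, 16384]
    print(vqdmladh_model([7, 9], a, b))                 # [16384, 9]
    print(vqdmladh_model([7, 9], a, b, exchange=True))  # [7, 16384]
```

In the sketch, only one lane of each pair is overwritten, which is why
the caller has to pass in c. The inputs [-32768, -32768] for both a and
b show the saturation step: the doubled sum is 2^32, whose high half
(65536) does not fit in a signed 16-bit lane and is clamped to 32767.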

Event Timeline

simon_tatham created this revision. Mar 18 2020, 6:31 AM
miyuki accepted this revision. Mar 18 2020, 8:58 AM

LGTM

This revision is now accepted and ready to land. Mar 18 2020, 8:58 AM
This revision was automatically updated to reflect the committed changes.