This is an archive of the discontinued LLVM Phabricator instance.

[mlir] [VectorOps] Replace zero fma with mult for vector.contract
Closed, Public

Authored by aartbik on Jun 29 2020, 8:19 PM.

Details

Summary

More efficient implementation of the multiply-reduce pair: the
multiply no longer goes through an fma that adds in a zero vector.
Microbenchmarking on AVX2 yields the following vector.contract
speedups (over strict-order scalar reduction).

SPEEDUP   SIMD-fma   SIMD-mul
4x4       1.45       2.00
8x8       1.40       1.90
32x32     5.32       5.80
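
To make the change concrete, here is a minimal scalar sketch of the two forms of the multiply-reduce pair. It is only an illustration under stated assumptions: the function names contract_fma_zero and contract_mul are hypothetical, the loops stand in for SIMD lanes, and none of this is the actual MLIR lowering code from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the "before" form: each product is computed
// as fma(a, b, 0), i.e. a multiply that also adds in a zero, then reduced.
float contract_fma_zero(const std::vector<float>& a,
                        const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    float prod = std::fma(a[i], b[i], 0.0f);  // a*b + 0
    sum += prod;                              // reduction step
  }
  return sum;
}

// Hypothetical scalar model of the "after" form: a plain multiply feeds the
// same reduction, so no zero operand has to be materialized.
float contract_mul(const std::vector<float>& a,
                   const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    float prod = a[i] * b[i];  // plain multiply
    sum += prod;               // reduction step
  }
  return sum;
}
```

Both functions compute the same dot product; the second simply drops the zero addend, which is the source of the saving the table above measures at the vector level.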

Event Timeline

aartbik created this revision. (Jun 29 2020, 8:19 PM)
aartbik edited the summary of this revision. (Jun 29 2020, 8:20 PM)
aartbik added reviewers: reidtatge, ftynse, mehdi_amini.
aartbik added a reviewer: bkramer.
ftynse accepted this revision. (Jun 30 2020, 1:04 AM)
This revision is now accepted and ready to land. (Jun 30 2020, 1:04 AM)
This revision was automatically updated to reflect the committed changes.