This is an archive of the discontinued LLVM Phabricator instance.

[Matrix] Use fmuladd for matrix.multiply if allowed.
ClosedPublic

Authored by fhahn on Dec 3 2019, 3:45 AM.

Details

Summary

If the matrix.multiply calls have the contract fast math flag, we can
use fmuladd. This als adds a command line option to force fmuladd
generation. We can retire this option once there is a clang-level
option.

Diff Detail

Event Timeline

fhahn created this revision.Dec 3 2019, 3:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2019, 3:45 AM

Build result: FAILURE - Could not check out parent git hash "0e07e583a225ea05fa811c2ee2b64c757c8efe0e". It was not found in the repository. Did you configure the "Parent Revision" in Phabricator properly? Trying to apply the patch to the master branch instead...

ERROR: arc patch failed with error code 1. Check build log for details.
Log files: console-log.txt, CMakeCache.txt

fhahn updated this revision to Diff 233637.Dec 12 2019, 9:09 AM

Rebased test.

We should be able to use fmuladd instead of fmul/fadd, as this just
reduces the rounding error between operations.

I am concerned about the stability of the result in the presence of the FMA and non-FMA targets. We ought to generate the same result in IEEE fashion but with this we wouldn't. I think that this should still be driven by the fp-contract fp flag and perhaps we should have command-line option that enables this specifically for matmul?

fhahn updated this revision to Diff 234600.Dec 18 2019, 12:51 PM

I am concerned about the stability of the result in the presence of the FMA and non-FMA targets. We ought to generate the same result in IEEE fashion but with this we wouldn't. I think that this should still be driven by the fp-contract fp flag and perhaps we should have command-line option that enables this specifically for matmul?

Agreed! I've added a matrix-allow-contraction option. I'll add support for using fast-math flags as follow up

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

anemet added inline comments.Dec 19 2019, 2:01 PM
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
50

Shouldn't this be using the fpflags from the intrinsic? I.e. I was thinking of a clang flag.

anemet requested changes to this revision.Dec 19 2019, 3:15 PM
This revision now requires changes to proceed.Dec 19 2019, 3:15 PM
fhahn marked an inline comment as done.Dec 19 2019, 3:46 PM
fhahn added inline comments.
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
50

We can certainly add this on the clang level, although I am not sure how easy/hard it will be to add additional flags there.

anemet added inline comments.Dec 20 2019, 9:10 AM
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
50

Can we do both? I.e. propagate fp-contract from the intrinsic's fpflag and also allow the backend to globally enable this? I'd be fine with that.

fhahn updated this revision to Diff 234955.Dec 20 2019, 12:52 PM

Support contraction if contract fast math flag is present.

fhahn retitled this revision from [Matrix] Use fmuladd when lowering matrix.multiply. to [Matrix] Use fmuladd for matrix.multiply if allowed..Dec 20 2019, 12:54 PM
fhahn edited the summary of this revision. (Show Details)
fhahn marked an inline comment as done.
fhahn added inline comments.
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
50

Yep I've updated it to respect the FMF and also kept the option

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

fhahn updated this revision to Diff 234961.Dec 20 2019, 1:06 PM

Reduce contract test cases, as we just need to check one call per type.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

anemet accepted this revision.Dec 20 2019, 2:03 PM

LGTM

This revision is now accepted and ready to land.Dec 20 2019, 2:03 PM
This revision was automatically updated to reflect the committed changes.