This patch makes it so, floating-point instructions created in
LowerMatrixIntrinsics retain fast-math flags from instructions that are
higher up the chain.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks for the patch! Could you add test coverage for lowering of the other FP instructions (fsub, fmul, fneg)? And maybe also with some variations of individual FMFs.
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp | ||
---|---|---|
1165–1167 | I think it would be cleaner to just pass in the FMF object here? We don't need the instruction for anything else. Also, I think we don't need to separately pass in AllowContraction any longer? | |
llvm/test/Transforms/LowerMatrixIntrinsics/preserve-existing-fast-math-flags.ll | ||
17 | Is there a reason we need the insert element instructions here, rather than passing in the second operand as vector to the function as well? |
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp | ||
---|---|---|
1165–1167 | +1 |
Pass FastMathFlags directly into emitMatrixMultiply and remove the
AllowContraction operator from emitMatrixMultiply
Thanks for the update! A few more suggestions about dealing with the FMF object.
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp | ||
---|---|---|
407 | Instead of returning None, could we just return a FastMathFlags object with all options set the false (which is the default) ? | |
1187 | I think this should be the other way around, if AllowContractEnabled we could set AllowContraction on FMF. This default could be set in getFastMathFlags? | |
1692 | Can we instead use the FMF object in those versions instead? AllowContraction option should probably also add the contract flag to those. |
LGTM with the inline nits. Thanks!
llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp | ||
---|---|---|
410 | nit: drop {} | |
llvm/test/Transforms/LowerMatrixIntrinsics/preserve-existing-fast-math-flags.ll | ||
179 | could you split up the test, possibly into separate ones for each multiply? Otherwise the IR to check for a single function seems unnecessarily big. |
Instead of returning None, could we just return a FastMathFlags object with all options set the false (which is the default) ?