Details

Reviewers

fhahn
bkramer
mibintc
kpn

Commits

rGbe2277fbf233: [Matrix] Support #pragma clang fp

Summary

From https://bugs.llvm.org/show_bug.cgi?id=49739:

Currently, #pragma clang fp is ignored for matrix types.

For the code below, the contract fast-math flag should be added to the generated call to llvm.matrix.multiply and to the fadd instruction.

typedef float fx2x2_t __attribute__((matrix_type(2, 2)));

void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) {
  #pragma clang fp contract(fast)
  C = A*B + C;
}
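As a rough illustration of what the contract flag permits for C = A*B + C, the sketch below computes the same 2x2 result two ways in plain C++: once as separate multiplies and adds, and once as a chain of fused multiply-adds that folds the addition of C into the accumulation. The names Mat2, at, mulThenAdd and fusedMulAdd are hypothetical and are not part of Clang or of this patch; the actual lowering of llvm.matrix.multiply may choose a different evaluation order.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 2x2 float matrix, stored column-major.
using Mat2 = std::array<float, 4>;

// Element access with a bounds check; index = column * 2 + row.
inline float at(const Mat2 &M, int r, int c) {
  assert(r >= 0 && r < 2 && c >= 0 && c < 2);
  return M[c * 2 + r];
}

// Written as separate multiplies and adds. Whether the compiler fuses any of
// them depends on its own contraction setting, not on this source.
Mat2 mulThenAdd(const Mat2 &A, const Mat2 &B, const Mat2 &C) {
  Mat2 R{};
  for (int c = 0; c < 2; ++c)
    for (int r = 0; r < 2; ++r) {
      float prod = at(A, r, 0) * at(B, 0, c) + at(A, r, 1) * at(B, 1, c);
      R[c * 2 + r] = prod + at(C, r, c);
    }
  return R;
}

// Contracted form: start from C and accumulate with std::fma, so each
// a * b + acc step is rounded once instead of twice.
Mat2 fusedMulAdd(const Mat2 &A, const Mat2 &B, const Mat2 &C) {
  Mat2 R{};
  for (int c = 0; c < 2; ++c)
    for (int r = 0; r < 2; ++r) {
      float acc = at(C, r, c);
      acc = std::fma(at(A, r, 0), at(B, 0, c), acc);
      acc = std::fma(at(A, r, 1), at(B, 1, c), acc);
      R[c * 2 + r] = acc;
    }
  return R;
}
```

For small integer-valued inputs every intermediate is exactly representable, so both functions return the same matrix; the two forms can differ in the last bits only when an intermediate product or sum needs rounding.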

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

effective-light created this revision. Apr 20 2021, 2:43 AM

Herald added a subscriber: tschuett. Apr 20 2021, 2:43 AM

effective-light requested review of this revision. Apr 20 2021, 2:43 AM

Herald added a project: Restricted Project. Apr 20 2021, 2:43 AM

Herald added a subscriber: cfe-commits.

effective-light added reviewers: fhahn, bkramer. Apr 20 2021, 2:44 AM

Awesome, thanks for the patch! I'm also adding @mibintc who I think was working on adding support for the pragma & co.

Could you add a brief description to the patch and also reference the bug report on bugs.llvm.org?

clang/test/CodeGen/fp-matrix-pragma.c
Line 2: I don't think we need `--target=arm64-unknown-iphoneos` or `-g0`?
Line 24: Could you also add a float test that has both `contract(fast) reassociate(on)`?

effective-light edited the summary of this revision. Apr 20 2021, 3:10 AM

Remove the unnecessary flags and add another test

effective-light marked 2 inline comments as done. Apr 20 2021, 4:04 AM

Harbormaster completed remote builds in B99662: Diff 338785. Apr 20 2021, 4:17 AM

Harbormaster completed remote builds in B99683: Diff 338812. Apr 20 2021, 4:21 AM

The diff appears to be 2 separate commits, so at first glance this only patches the test files. Usually, when I am working on a patch and have responded to comments, I squash the patch and its updates into a single commit (git rebase -i) before creating a diff to upload to Phabricator.

Are there Matrix builtins that you wish to have decorated with the FMF? e.g. look at CGBuiltin.cpp

mibintc added a reviewer: kpn. Apr 20 2021, 7:07 AM

Whoops

I don't know the matrix implementation so I can't swear this hits every place needed, but the uses of CGFPOptionsRAII in this patch look correct at least.
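The scope-guard pattern behind CGFPOptionsRAII can be sketched in standalone C++ as follows. This is a toy model under stated assumptions, not Clang's implementation: ToyBuilder, ScopedFlags, emitMatrixAdd and the flag constants are hypothetical names. It only shows why the guard has to be constructed before the builder call, which is the shape each patched matrix branch follows.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flag bits; these are not LLVM's FastMathFlags.
enum : unsigned { FMFNone = 0, FMFContract = 1u << 0, FMFReassoc = 1u << 1 };

// Hypothetical builder that stamps each emitted op with its current flags.
struct ToyBuilder {
  unsigned flags = FMFNone;
  std::vector<std::string> out;

  void emit(const std::string &op) {
    std::string line = op;
    if (flags & FMFReassoc)
      line += " reassoc";
    if (flags & FMFContract)
      line += " contract";
    out.push_back(line);
  }
};

// Scope guard: install the expression's flags on construction and restore
// the previous flags on destruction, however the scope is left.
class ScopedFlags {
  ToyBuilder &Bld;
  unsigned Saved;

public:
  ScopedFlags(ToyBuilder &B, unsigned NewFlags) : Bld(B), Saved(B.flags) {
    Bld.flags = NewFlags;
  }
  ~ScopedFlags() { Bld.flags = Saved; }
  ScopedFlags(const ScopedFlags &) = delete;
  ScopedFlags &operator=(const ScopedFlags &) = delete;
};

// Same shape as a patched matrix branch: guard first, then emit the op.
void emitMatrixAdd(ToyBuilder &B, unsigned ExprFlags) {
  ScopedFlags Guard(B, ExprFlags);
  B.emit("fadd");
}
```

An op emitted inside the guarded scope carries the flags; one emitted after the scope ends does not, because the destructor has already restored the builder's previous state.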

mibintc accepted this revision. Apr 20 2021, 12:27 PM

This revision is now accepted and ready to land. Apr 20 2021, 12:27 PM

Harbormaster completed remote builds in B99766: Diff 338941. Apr 20 2021, 12:32 PM

Fix build

Harbormaster completed remote builds in B99819: Diff 339012. Apr 20 2021, 4:08 PM

Use the correct clang command.

Harbormaster completed remote builds in B99851: Diff 339048. Apr 20 2021, 6:38 PM

In D100834#2702550, @kpn wrote:

I don't know the matrix implementation so I can't swear this hits every place needed, but the uses of CGFPOptionsRAII in this patch look correct at least.

Other parts of the extension include __builtin_matrix_transpose, indexing into a matrix and casting, but I don't think the FMFs are needed there. One thing that would be good to also test would be the compound operators, (-=, +=, *=). @effective-light it would be great if you could add a test for those, then LGTM from my side. If you need someone to commit the change on your behalf, please let us know and share the name + email to use for the commit authorship.

Add a test for compound operations

In D100834#2704330, @fhahn wrote:

In D100834#2702550, @kpn wrote:

I don't know the matrix implementation so I can't swear this hits every place needed, but the uses of CGFPOptionsRAII in this patch look correct at least.

Other parts of the extension include __builtin_matrix_transpose, indexing into a matrix and casting, but I don't think the FMFs are needed there. One thing that would be good to also test would be the compound operators, (-=, +=, *=). @effective-light it would be great if you could add a test for those, then LGTM from my side. If you need someone to commit the change on your behalf, please let us know and share the name + email to use for the commit authorship.

Ya, you can commit it under name: Hamza Mahfooz, email: someguy@effective-light.com, thanks.

Harbormaster completed remote builds in B100053: Diff 339325. Apr 21 2021, 12:30 PM

LGTM, thanks!

I'll commit the patch shortly.

Closed by commit rGbe2277fbf233: [Matrix] Support #pragma clang fp (authored by effective-light, committed by fhahn). Apr 22 2021, 3:46 AM

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rGbe2277fbf233: [Matrix] Support #pragma clang fp.

@effective-light just FYI, if you are interested in improving fast-math flags handling during matrix lowering on the LLVM side, there's https://bugs.llvm.org/show_bug.cgi?id=49738

Diff 339549

clang/lib/CodeGen/CGExprScalar.cpp

(726 lines not shown) if (Ops.Ty->isConstantMatrixType()) {
      llvm::MatrixBuilder<CGBuilderTy> MB(Builder);
      // We need to check the types of the operands of the operator to get the
      // correct matrix dimensions.
      auto *BO = cast<BinaryOperator>(Ops.E);
      auto *LHSMatTy = dyn_cast<ConstantMatrixType>(
          BO->getLHS()->getType().getCanonicalType());
      auto *RHSMatTy = dyn_cast<ConstantMatrixType>(
          BO->getRHS()->getType().getCanonicalType());
+     CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, Ops.FPFeatures);
      if (LHSMatTy && RHSMatTy)
        return MB.CreateMatrixMultiply(Ops.LHS, Ops.RHS, LHSMatTy->getNumRows(),
                                       LHSMatTy->getNumColumns(),
                                       RHSMatTy->getNumColumns());
      return MB.CreateScalarMultiply(Ops.LHS, Ops.RHS);
    }

    if (Ops.Ty->isUnsignedIntegerType() &&

(2,458 lines not shown) if (Ops.Ty->isConstantMatrixType()) {
      // correct matrix dimensions.
      auto *BO = cast<BinaryOperator>(Ops.E);
      (void)BO;
      assert(
          isa<ConstantMatrixType>(BO->getLHS()->getType().getCanonicalType()) &&
          "first operand must be a matrix");
      assert(BO->getRHS()->getType().getCanonicalType()->isArithmeticType() &&
             "second operand must be an arithmetic type");
+     CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, Ops.FPFeatures);
      return MB.CreateScalarDiv(Ops.LHS, Ops.RHS,
                                Ops.Ty->hasUnsignedIntegerRepresentation());
    }

    if (Ops.LHS->getType()->isFPOrFPVectorTy()) {
      llvm::Value *Val;
      CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, Ops.FPFeatures);
      Val = Builder.CreateFDiv(Ops.LHS, Ops.RHS, "div");

(363 lines not shown) case LangOptions::SOB_Trapping:
        if (CanElideOverflowCheck(CGF.getContext(), op))
          return Builder.CreateNSWAdd(op.LHS, op.RHS, "add");
        return EmitOverflowCheckedBinOp(op);
      }
    }

    if (op.Ty->isConstantMatrixType()) {
      llvm::MatrixBuilder<CGBuilderTy> MB(Builder);
+     CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, op.FPFeatures);
      return MB.CreateAdd(op.LHS, op.RHS);
    }

    if (op.Ty->isUnsignedIntegerType() &&
        CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
        !CanElideOverflowCheck(CGF.getContext(), op))
      return EmitOverflowCheckedBinOp(op);

(133 lines not shown) if (op.Ty->isSignedIntegerOrEnumerationType()) {
        if (CanElideOverflowCheck(CGF.getContext(), op))
          return Builder.CreateNSWSub(op.LHS, op.RHS, "sub");
        return EmitOverflowCheckedBinOp(op);
      }
    }

    if (op.Ty->isConstantMatrixType()) {
      llvm::MatrixBuilder<CGBuilderTy> MB(Builder);
+     CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, op.FPFeatures);
      return MB.CreateSub(op.LHS, op.RHS);
    }

    if (op.Ty->isUnsignedIntegerType() &&
        CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
        !CanElideOverflowCheck(CGF.getContext(), op))
      return EmitOverflowCheckedBinOp(op);

(1,373 lines not shown)

clang/test/CodeGen/fp-matrix-pragma.c

This file was added.

// RUN: %clang -emit-llvm -S -fenable-matrix -mllvm -disable-llvm-optzns %s -o - | FileCheck %s

typedef float fx2x2_t __attribute__((matrix_type(2, 2)));
typedef int ix2x2_t __attribute__((matrix_type(2, 2)));

fx2x2_t fp_matrix_contract(fx2x2_t a, fx2x2_t b, float c, float d) {
  // CHECK: call contract <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32
  // CHECK: fdiv contract <4 x float>
  // CHECK: fmul contract <4 x float>
  #pragma clang fp contract(fast)
  return (a * b / c) * d;
}

fx2x2_t fp_matrix_reassoc(fx2x2_t a, fx2x2_t b, fx2x2_t c) {
  // CHECK: fadd reassoc <4 x float>
  // CHECK: fsub reassoc <4 x float>
  #pragma clang fp reassociate(on)
  return a + b - c;
}

fx2x2_t fp_matrix_ops(fx2x2_t a, fx2x2_t b, fx2x2_t c) {
  // CHECK: call reassoc contract <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32
  // CHECK: fadd reassoc contract <4 x float>
  #pragma clang fp contract(fast) reassociate(on)
  return a * b + c;
}

fx2x2_t fp_matrix_compound_ops(fx2x2_t a, fx2x2_t b, fx2x2_t c, fx2x2_t d,
                               float e, float f) {
  // CHECK: call reassoc contract <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32
  // CHECK: fadd reassoc contract <4 x float>
  // CHECK: fsub reassoc contract <4 x float>
  // CHECK: fmul reassoc contract <4 x float>
  // CHECK: fdiv reassoc contract <4 x float>
  #pragma clang fp contract(fast) reassociate(on)
  a *= b;
  a += c;
  a -= d;
  a *= e;
  a /= f;

  return a;
}

ix2x2_t int_matrix_ops(ix2x2_t a, ix2x2_t b, ix2x2_t c) {
  // CHECK: call <4 x i32> @llvm.matrix.multiply.v4i32.v4i32.v4i32
  // CHECK: add <4 x i32>
  #pragma clang fp contract(fast) reassociate(on)
  return a * b + c;
}

This is an archive of the discontinued LLVM Phabricator instance.

Bug 49739 - [Matrix] Support #pragma clang fp
Closed, Public
