Download Raw Diff

Details

Reviewers

fhahn
arsenm

Commits

rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic.

Summary

This patch extends emitUnaryBuiltin so that we can better emitting IR when
implement builtins specified in D111529.

Also contains some NFC, applying it to existing code.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

junaire requested review of this revision.Dec 22 2021, 4:56 AM

junaire created this revision.

Herald added a project: Restricted Project. · View Herald TranscriptDec 22 2021, 4:56 AM

Herald added subscribers: cfe-commits, wdng. · View Herald Transcript

junaire mentioned this in D115429: [Clang] Implement the rest of __builtin_elementwise_* functions..Dec 22 2021, 5:00 AM

Can you also update the existing places that could use it?

Harbormaster completed remote builds in B140377: Diff 395844.Dec 22 2021, 5:27 AM

Update the existing place that can use emitUnaryBuiltin.

Sorry, It seems that the base branch is wrong, reupdate it.

In D116161#3206442, @junaire wrote:

Update the existing place that can use emitUnaryBuiltin.

I meant just update the existing uses *without* adding floor, roundeven, trunc, so this change should be NFC (non-functional change)

In D116161#3206447, @fhahn wrote:

In D116161#3206442, @junaire wrote:

Update the existing place that can use emitUnaryBuiltin.

I meant just update the existing uses *without* adding floor, roundeven, trunc, so this change should be NFC (non-functional change)

I'm sorry about it, I'm just too careless. :-(

Harbormaster completed remote builds in B140383: Diff 395853.Dec 22 2021, 6:07 AM

Update the existing place that can use emitUnaryBuiltin.

Harbormaster completed remote builds in B140392: Diff 395864.Dec 22 2021, 7:27 AM

Fix wrong usage.

fhahn added inline comments.Dec 24 2021, 12:44 AM

clang/lib/CodeGen/CGBuiltin.cpp
3134–3135	Should also be used here?
3187–3188	Should also be used here?
3202–3203	Should also be used here?

Harbormaster completed remote builds in B140592: Diff 396140.Dec 24 2021, 12:45 AM

Should also be used here?

I don't know why but these will cause tests to fail.
Maybe it is caused by my local cache? I'm gonna make a clean build to test it again.
BTW, thanks for taking look at this. Merry Christmas! ;D

I confirmed that we can use emitUnaryBuiltin in the cases you pointed out. Please see the logs below:

$ path/to/llvm-project/build/bin/clang -cc1 -internal-isystem /path/to/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -triple x86_64-apple-darwin /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c -emit-llvm -disable-llvm-passes -o - | /path/to/llvm-project/build/bin/FileCheck /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c
/path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c:16:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.fabs.f32(float [[F1]])
                ^
<stdin>:35:43: note: scanning from here
 %0 = load float, float* %f1.addr, align 4
                                          ^
<stdin>:35:43: note: with "F1" equal to "%0"
 %0 = load float, float* %f1.addr, align 4
                                          ^
<stdin>:37:13: note: possible intended match here
 %elt.abs = call float @llvm.fabs.f32(float %1)
            ^

Input file: <stdin>
Check file: /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          30:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          31:  store <8 x i16> %vi2, <8 x i16>* %vi2.addr, align 16 
          32:  store i64 %i1, i64* %i1.addr, align 8 
          33:  store i64 %i2, i64* %i2.addr, align 8 
          34:  store i16 %si, i16* %si.addr, align 2 
          35:  %0 = load float, float* %f1.addr, align 4 
next:16'0                                               X error: no match found
next:16'1                                                 with "F1" equal to "%0"
          36:  %1 = load float, float* %f1.addr, align 4 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          37:  %elt.abs = call float @llvm.fabs.f32(float %1) 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:16'2                 ?                                    possible intended match
          38:  store float %elt.abs, float* %f2.addr, align 4 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          39:  %2 = load double, double* %d1.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          40:  %3 = load double, double* %d1.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          41:  %elt.abs1 = call double @llvm.fabs.f64(double %3) 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          42:  store double %elt.abs1, double* %d2.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

$ /path/to/llvm-project/build/bin/clang -cc1 -internal-isystem /path/to/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -triple x86_64-apple-darwin ~/dev/cpp-projects/llvm-project/clang/test/CodeGen/builtins-reduction-math.c -emit-llvm -disable-llvm-passes -o - | ~/dev/cpp-projects/llvm-project/build/bin/FileCheck ~/dev/cpp-projects/llvm-project/clang/test/CodeGen/builtins-reduction-math.c
/path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c:12:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.vector.reduce.fmax.v4f32(<4 x float> [[VF1]])
                ^
<stdin>:23:57: note: scanning from here
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:23:57: note: with "VF1" equal to "%0"
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:25:13: note: possible intended match here
 %rdx.min = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %1)
            ^
/path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c:38:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.vector.reduce.fmin.v4f32(<4 x float> [[VF1]])
                ^
<stdin>:74:57: note: scanning from here
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:74:57: note: with "VF1" equal to "%0"
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:76:13: note: possible intended match here
 %rdx.min = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %1)
            ^

Input file: <stdin>
Check file: /path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          18:  %cvi1 = alloca <8 x i16>, align 16 
          19:  %r5 = alloca i64, align 8 
          20:  store <4 x float> %vf1, <4 x float>* %vf1.addr, align 16 
          21:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          22:  store <4 x i32> %vu1, <4 x i32>* %vu1.addr, align 16 
          23:  %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:12'0                                                             X error: no match found
next:12'1                                                               with "VF1" equal to "%0"
          24:  %1 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          25:  %rdx.min = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %1) 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:12'2                 ?                                                          possible intended match
          26:  store float %rdx.min, float* %r1, align 4 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          27:  %2 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          28:  %3 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          29:  %rdx.min1 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> %3) 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          30:  store i16 %rdx.min1, i16* %r2, align 2 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          69:  %cvi1 = alloca <8 x i16>, align 16 
          70:  %r5 = alloca i64, align 8 
          71:  store <4 x float> %vf1, <4 x float>* %vf1.addr, align 16 
          72:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          73:  store <4 x i32> %vu1, <4 x i32>* %vu1.addr, align 16 
          74:  %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:38'0                                                             X error: no match found
next:38'1                                                               with "VF1" equal to "%0"
          75:  %1 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          76:  %rdx.min = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %1) 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:38'2                 ?                                                          possible intended match
          77:  store float %rdx.min, float* %r1, align 4 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          78:  %2 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          79:  %3 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          80:  %rdx.min1 = call i16 @llvm.vector.reduce.smin.v8i16(<8 x i16> %3) 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          81:  store i16 %rdx.min1, i16* %r2, align 2 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

Do you know why these will happen? @fhahn

In D116161#3209286, @junaire wrote:

35:  %0 = load float, float* %f1.addr, align 4 
36:  %1 = load float, float* %f1.addr, align 4 
37:  %elt.abs = call float @llvm.fabs.f32(float %1)

It looks like the argument expression is evaluated twice. Did you remove the Value *Op0 = EmitScalarExpr(E->getArg(0)); calls?

In D116161#3209292, @fhahn wrote:
In D116161#3209286, @junaire wrote:
35:  %0 = load float, float* %f1.addr, align 4 
36:  %1 = load float, float* %f1.addr, align 4 
37:  %elt.abs = call float @llvm.fabs.f32(float %1)
It looks like the argument expression is evaluated twice. Did you remove the Value *Op0 = EmitScalarExpr(E->getArg(0)); calls?

Well, for example:
For __builtin_elementwise_abs, we have code like:

case Builtin::BI__builtin_elementwise_abs: {
  Value *Op0 = EmitScalarExpr(E->getArg(0));
  Value *Result;
  if (Op0->getType()->isIntOrIntVectorTy())
    Result = Builder.CreateBinaryIntrinsic(
        llvm::Intrinsic::abs, Op0, Builder.getFalse(), nullptr, "elt.abs");
  else
    Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::fabs, Op0, nullptr,
                                          "elt.abs");

  return RValue::get(Result);
}

If we use emitUnaryBuiltin, we are supposed to do something like:

Result = emitUnaryBuiltin(*this, E, llvm::Intrinsic::fabs, "elt.abs");

We need to pass CallExpr* E to the function to meet its interface, and it calls CGF.EmitScalarExpr(E->getArg(0)); inside. Maybe this is what you said about the argument expression being evaluated twice? I think it is unavoidable if we don't change the function's interface.

Maybe we can have something like:

static Value *emitUnaryBuiltin(CodeGenFunction &CGF, Value* Op0,
                               unsigned IntrinsicID, llvm::StringRef Name) {
  return CGF.Builder.CreateUnaryIntrinsic(IntrinsicID, Op0, nullptr, Name);
}

Then for __builtin_elementwise_abs we can have:

Result = emitUnaryBuiltin(*this, Op0, llvm::Intrinsic::fabs, "elt.abs");

and for __builtin_elementwise_ceil have:

return RValue::get(
    emitUnaryBuiltin(*this, EmitScalarExpr(E->getArg(0)), llvm::Intrinsic::ceil, "elt.ceil"));

WDYT? Well, franking speaking I think this one-line function is ugly but I can't come up with a more elegant solution, I would appreciate it if you can offer some suggestions. :)

In order to use emitUnaryBuiltin in other cases, I changed the function interface.
This allows us to use it in all Builder.CreateUnaryIntrinsic() cases, but will make
the function body very small.

Harbormaster completed remote builds in B140779: Diff 396380.Dec 28 2021, 12:59 AM

In D116161#3211178, @junaire wrote:

In order to use emitUnaryBuiltin in other cases, I changed the function interface.
This allows us to use it in all Builder.CreateUnaryIntrinsic() cases, but will make
the function body very small.

I think we should extend & use the existing emitUnaryBuiltin. I think in most cases where it cannot be used straight away you should be able to slightly rewrite the existing code to not rely on llvm::Type. Then there should be no need to call EmitScalarExpr early (an example is in the inline comments)

clang/lib/CodeGen/CGBuiltin.cpp

534–535

I think we should extend this emitUnaryBuiltin function, rather than having a second one.

e.g.

 static Value *emitUnaryBuiltin(CodeGenFunction &CGF,
                                const CallExpr *E,
-                               unsigned IntrinsicID) {
+                               unsigned IntrinsicID, llvm::StringRef Name = "") {
   llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));

   Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());
-  return CGF.Builder.CreateCall(F, Src0);
+  return CGF.Builder.CreateCall(F, Src0, Name);
+}

3187–3188

IICU the only reason to call EmitScalarExpr early is the use in GetIntrinsicID, but it could solely rely on QualType:

-    auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) {
-      if (IrTy->isIntOrIntVectorTy()) {
-        if (auto *VecTy = QT->getAs<VectorType>())
-          QT = VecTy->getElementType();
-        if (QT->isSignedIntegerType())
-          return llvm::Intrinsic::vector_reduce_smax;
-        else
-          return llvm::Intrinsic::vector_reduce_umax;
-      }
+    auto GetIntrinsicID = [](QualType QT) {
+      if (auto *VecTy = QT->getAs<VectorType>())
+        QT = VecTy->getElementType();
+      if (QT->isSignedIntegerType())
+        return llvm::Intrinsic::vector_reduce_smax;
+      if (QT->isUnsignedIntegerType())
+        return llvm::Intrinsic::vector_reduce_umax;
+      assert(QT->isFloatingType() && "must have a float here");
       return llvm::Intrinsic::vector_reduce_fmax;
     };

Extend emitUnaryBuiltin instead of adding a new overload, also apply it to __builtin_elementwise_abs.

junaire retitled this revision from [Clang] Add an overload for emitUnaryBuiltin. to [Clang] Extend emitUnaryBuiltin to avoid duplicate logic..Dec 28 2021, 11:18 PM

junaire edited the summary of this revision. (Show Details)

Refactor code a little bit and fix wrong names.

Harbormaster completed remote builds in B140865: Diff 396492.Dec 29 2021, 12:27 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jan 2 2022, 1:08 PM

@junaire please let me know if you want me to land this on your behalf.

In D116161#3219080, @fhahn wrote:

@junaire please let me know if you want me to land this on your behalf.

Yeah, thanks a lot!

You can use:
Jun Zhang
jun@junz.org

This revision was landed with ongoing or failed builds.Jan 4 2022, 3:48 AM

Closed by commit rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic. (authored by junaire, committed by fhahn). · Explain Why

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic..

In D116161#3219149, @junaire wrote:

In D116161#3219080, @fhahn wrote:

@junaire please let me know if you want me to land this on your behalf.

Yeah, thanks a lot!

You can use:
Jun Zhang
jun@junz.org

Committed! After D115429, I'd recommend you obtain commit access https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

In D116161#3219343, @fhahn wrote:

In D116161#3219149, @junaire wrote:

In D116161#3219080, @fhahn wrote:

@junaire please let me know if you want me to land this on your behalf.

Yeah, thanks a lot!

You can use:
Jun Zhang
jun@junz.org

Committed! After D115429, I'd recommend you obtain commit access https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

That's so great! Thank you for your continued patient guidance and the review work, I really appreciate!

fhahn added a reverting change: rGf552ba6e8405: Revert "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.".Jan 4 2022, 5:46 AM

Diff 397254

clang/lib/CodeGen/CGBuiltin.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 525 Lines • ▼ Show 20 Lines	static Value *emitCallMaybeConstrainedFPBuiltin(CodeGenFunction &CGF,

if (CGF.Builder.getIsFPConstrained())		if (CGF.Builder.getIsFPConstrained())
return CGF.Builder.CreateConstrainedFPCall(F, Args);		return CGF.Builder.CreateConstrainedFPCall(F, Args);
else		else
return CGF.Builder.CreateCall(F, Args);		return CGF.Builder.CreateCall(F, Args);
}		}

// Emit a simple mangled intrinsic that has 1 argument and a return type		// Emit a simple mangled intrinsic that has 1 argument and a return type
// matching the argument type.		// matching the argument type.
static Value *emitUnaryBuiltin(CodeGenFunction &CGF,		static Value emitUnaryBuiltin(CodeGenFunction &CGF, const CallExpr E,
		fhahnUnsubmitted Not Done Reply Inline Actions I think we should extend this `emitUnaryBuiltin` function, rather than having a second one. e.g. static Value emitUnaryBuiltin(CodeGenFunction &CGF, const CallExpr E, - unsigned IntrinsicID) { + unsigned IntrinsicID, llvm::StringRef Name = "") { llvm::Value Src0 = CGF.EmitScalarExpr(E->getArg(0)); Function F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType()); - return CGF.Builder.CreateCall(F, Src0); + return CGF.Builder.CreateCall(F, Src0, Name); +} fhahn: I think we should extend this `emitUnaryBuiltin` function, rather than having a second one. e.
const CallExpr *E,		unsigned IntrinsicID,
unsigned IntrinsicID) {		llvm::StringRef Name = "") {
llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));		llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));

Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());		Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());
return CGF.Builder.CreateCall(F, Src0);		return CGF.Builder.CreateCall(F, Src0, Name);
}		}

// Emit an intrinsic that has 2 operands of the same type as its result.		// Emit an intrinsic that has 2 operands of the same type as its result.
static Value *emitBinaryBuiltin(CodeGenFunction &CGF,		static Value *emitBinaryBuiltin(CodeGenFunction &CGF,
const CallExpr *E,		const CallExpr *E,
unsigned IntrinsicID) {		unsigned IntrinsicID) {
llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));		llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));
llvm::Value *Src1 = CGF.EmitScalarExpr(E->getArg(1));		llvm::Value *Src1 = CGF.EmitScalarExpr(E->getArg(1));
▲ Show 20 Lines • Show All 2,567 Lines • ▼ Show 20 Lines	case Builtin::BI__builtin_isnan: {
// V = sign bit (Sub) <=> V = (Sub < 0)		// V = sign bit (Sub) <=> V = (Sub < 0)
V = Builder.CreateLShr(Sub, llvm::ConstantInt::get(IntTy, bitsize - 1));		V = Builder.CreateLShr(Sub, llvm::ConstantInt::get(IntTy, bitsize - 1));
if (bitsize > 32)		if (bitsize > 32)
V = Builder.CreateTrunc(V, ConvertType(E->getType()));		V = Builder.CreateTrunc(V, ConvertType(E->getType()));
return RValue::get(V);		return RValue::get(V);
}		}

case Builtin::BI__builtin_elementwise_abs: {		case Builtin::BI__builtin_elementwise_abs: {
Value *Op0 = EmitScalarExpr(E->getArg(0));
Value *Result;		Value *Result;
if (Op0->getType()->isIntOrIntVectorTy())		QualType QT = E->getArg(0)->getType();

		if (auto *VecTy = QT->getAs<VectorType>())
		QT = VecTy->getElementType();
		if (QT->isIntegerType())
Result = Builder.CreateBinaryIntrinsic(		Result = Builder.CreateBinaryIntrinsic(
llvm::Intrinsic::abs, Op0, Builder.getFalse(), nullptr, "elt.abs");		llvm::Intrinsic::abs, EmitScalarExpr(E->getArg(0)),
		Builder.getFalse(), nullptr, "elt.abs");
else		else
Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::fabs, Op0, nullptr,		Result = emitUnaryBuiltin(*this, E, llvm::Intrinsic::fabs, "elt.abs");
		fhahnUnsubmitted Not Done Reply Inline Actions Should also be used here? fhahn: Should also be used here?
"elt.abs");
return RValue::get(Result);
}

case Builtin::BI__builtin_elementwise_ceil: {
Value *Op0 = EmitScalarExpr(E->getArg(0));
Value *Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::ceil, Op0,
nullptr, "elt.ceil");
return RValue::get(Result);		return RValue::get(Result);
}		}

		case Builtin::BI__builtin_elementwise_ceil:
		return RValue::get(
		emitUnaryBuiltin(*this, E, llvm::Intrinsic::ceil, "elt.ceil"));

case Builtin::BI__builtin_elementwise_max: {		case Builtin::BI__builtin_elementwise_max: {
Value *Op0 = EmitScalarExpr(E->getArg(0));		Value *Op0 = EmitScalarExpr(E->getArg(0));
Value *Op1 = EmitScalarExpr(E->getArg(1));		Value *Op1 = EmitScalarExpr(E->getArg(1));
Value *Result;		Value *Result;
if (Op0->getType()->isIntOrIntVectorTy()) {		if (Op0->getType()->isIntOrIntVectorTy()) {
QualType Ty = E->getArg(0)->getType();		QualType Ty = E->getArg(0)->getType();
if (auto *VecTy = Ty->getAs<VectorType>())		if (auto *VecTy = Ty->getAs<VectorType>())
Ty = VecTy->getElementType();		Ty = VecTy->getElementType();
Show All 18 Lines	if (Op0->getType()->isIntOrIntVectorTy()) {
: llvm::Intrinsic::umin,		: llvm::Intrinsic::umin,
Op0, Op1, nullptr, "elt.min");		Op0, Op1, nullptr, "elt.min");
} else		} else
Result = Builder.CreateMinNum(Op0, Op1, "elt.min");		Result = Builder.CreateMinNum(Op0, Op1, "elt.min");
return RValue::get(Result);		return RValue::get(Result);
}		}

case Builtin::BI__builtin_reduce_max: {		case Builtin::BI__builtin_reduce_max: {
auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) {		auto GetIntrinsicID = [](QualType QT) {
if (IrTy->isIntOrIntVectorTy()) {
if (auto *VecTy = QT->getAs<VectorType>())		if (auto *VecTy = QT->getAs<VectorType>())
QT = VecTy->getElementType();		QT = VecTy->getElementType();
if (QT->isSignedIntegerType())		if (QT->isSignedIntegerType())
return llvm::Intrinsic::vector_reduce_smax;		return llvm::Intrinsic::vector_reduce_smax;
else		if (QT->isUnsignedIntegerType())
return llvm::Intrinsic::vector_reduce_umax;		return llvm::Intrinsic::vector_reduce_umax;
}		assert(QT->isFloatingType() && "must have a float here");
return llvm::Intrinsic::vector_reduce_fmax;		return llvm::Intrinsic::vector_reduce_fmax;
};		};
Value *Op0 = EmitScalarExpr(E->getArg(0));		return RValue::get(emitUnaryBuiltin(
		fhahnUnsubmitted Not Done Reply Inline Actions Should also be used here? fhahn: Should also be used here?
		fhahnUnsubmitted Not Done Reply Inline Actions IICU the only reason to call `EmitScalarExpr` early is the use in `GetIntrinsicID`, but it could solely rely on `QualType`: - auto GetIntrinsicID = [](QualType QT, llvm::Type IrTy) { - if (IrTy->isIntOrIntVectorTy()) { - if (auto VecTy = QT->getAs<VectorType>()) - QT = VecTy->getElementType(); - if (QT->isSignedIntegerType()) - return llvm::Intrinsic::vector_reduce_smax; - else - return llvm::Intrinsic::vector_reduce_umax; - } + auto GetIntrinsicID = [](QualType QT) { + if (auto VecTy = QT->getAs<VectorType>()) + QT = VecTy->getElementType(); + if (QT->isSignedIntegerType()) + return llvm::Intrinsic::vector_reduce_smax; + if (QT->isUnsignedIntegerType()) + return llvm::Intrinsic::vector_reduce_umax; + assert(QT->isFloatingType() && "must have a float here"); return llvm::Intrinsic::vector_reduce_fmax; }; fhahn:* IICU the only reason to call `EmitScalarExpr` early is the use in `GetIntrinsicID`, but it…
Value *Result = Builder.CreateUnaryIntrinsic(		*this, E, GetIntrinsicID(E->getArg(0)->getType()), "rdx.min"));
GetIntrinsicID(E->getArg(0)->getType(), Op0->getType()), Op0, nullptr,
"rdx.min");
return RValue::get(Result);
}		}

case Builtin::BI__builtin_reduce_min: {		case Builtin::BI__builtin_reduce_min: {
auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) {		auto GetIntrinsicID = [](QualType QT) {
if (IrTy->isIntOrIntVectorTy()) {
if (auto *VecTy = QT->getAs<VectorType>())		if (auto *VecTy = QT->getAs<VectorType>())
QT = VecTy->getElementType();		QT = VecTy->getElementType();
if (QT->isSignedIntegerType())		if (QT->isSignedIntegerType())
return llvm::Intrinsic::vector_reduce_smin;		return llvm::Intrinsic::vector_reduce_smin;
else		if (QT->isUnsignedIntegerType())
return llvm::Intrinsic::vector_reduce_umin;		return llvm::Intrinsic::vector_reduce_umin;
}		assert(QT->isFloatingType() && "must have a float here");
return llvm::Intrinsic::vector_reduce_fmin;		return llvm::Intrinsic::vector_reduce_fmin;
};		};
Value *Op0 = EmitScalarExpr(E->getArg(0));
Value *Result = Builder.CreateUnaryIntrinsic(
GetIntrinsicID(E->getArg(0)->getType(), Op0->getType()), Op0, nullptr,
"rdx.min");
return RValue::get(Result);
}

		fhahnUnsubmitted Not Done Reply Inline Actions Should also be used here? fhahn: Should also be used here?
case Builtin::BI__builtin_reduce_xor: {		return RValue::get(emitUnaryBuiltin(
Value *Op0 = EmitScalarExpr(E->getArg(0));		*this, E, GetIntrinsicID(E->getArg(0)->getType()), "rdx.min"));
Value *Result = Builder.CreateUnaryIntrinsic(
llvm::Intrinsic::vector_reduce_xor, Op0, nullptr, "rdx.xor");
return RValue::get(Result);
}		}

		case Builtin::BI__builtin_reduce_xor:
		return RValue::get(emitUnaryBuiltin(
		*this, E, llvm::Intrinsic::vector_reduce_xor, "rdx.xor"));

case Builtin::BI__builtin_matrix_transpose: {		case Builtin::BI__builtin_matrix_transpose: {
const auto *MatrixTy = E->getArg(0)->getType()->getAs<ConstantMatrixType>();		const auto *MatrixTy = E->getArg(0)->getType()->getAs<ConstantMatrixType>();
Value *MatValue = EmitScalarExpr(E->getArg(0));		Value *MatValue = EmitScalarExpr(E->getArg(0));
MatrixBuilder<CGBuilderTy> MB(Builder);		MatrixBuilder<CGBuilderTy> MB(Builder);
Value *Result = MB.CreateMatrixTranspose(MatValue, MatrixTy->getNumRows(),		Value *Result = MB.CreateMatrixTranspose(MatValue, MatrixTy->getNumRows(),
MatrixTy->getNumColumns());		MatrixTy->getNumColumns());
return RValue::get(Result);		return RValue::get(Result);
}		}
▲ Show 20 Lines • Show All 15,714 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 397254

clang/lib/CodeGen/CGBuiltin.cpp

This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 397254

clang/lib/CodeGen/CGBuiltin.cpp

[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.
ClosedPublic