This is an archive of the discontinued LLVM Phabricator instance.

Differential D116161

[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.
Closed · Public

Authored by junaire on Dec 22 2021, 4:56 AM.

Details

Reviewers

fhahn
arsenm

Commits

rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic.

Summary

This patch extends emitUnaryBuiltin so that we can better emit IR when
implementing the builtins specified in D111529.

It also contains some NFC changes that apply it to existing code.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

junaire requested review of this revision. Dec 22 2021, 4:56 AM

junaire created this revision.

Herald added a project: Restricted Project. Dec 22 2021, 4:56 AM

Herald added subscribers: cfe-commits, wdng.

junaire mentioned this in D115429: [Clang] Implement the rest of __builtin_elementwise_* functions. Dec 22 2021, 5:00 AM

Can you also update the existing places that could use it?

Harbormaster completed remote builds in B140377: Diff 395844. Dec 22 2021, 5:27 AM

Update the existing place that can use emitUnaryBuiltin.

Sorry, it seems the base branch was wrong; updating again.

In D116161#3206442, @junaire wrote:

Update the existing place that can use emitUnaryBuiltin.

I meant just update the existing uses *without* adding floor, roundeven, trunc, so this change should be NFC (non-functional change)

In D116161#3206447, @fhahn wrote:

I meant just update the existing uses *without* adding floor, roundeven, trunc, so this change should be NFC (non-functional change)

I'm sorry about it, I'm just too careless. :-(

Harbormaster completed remote builds in B140383: Diff 395853. Dec 22 2021, 6:07 AM

Update the existing place that can use emitUnaryBuiltin.

Harbormaster completed remote builds in B140392: Diff 395864. Dec 22 2021, 7:27 AM

Fix wrong usage.

fhahn added inline comments. Dec 24 2021, 12:44 AM

clang/lib/CodeGen/CGBuiltin.cpp
3137–3138	Should also be used here?
3202	Should also be used here?
3221	Should also be used here?

Harbormaster completed remote builds in B140592: Diff 396140. Dec 24 2021, 12:45 AM

Should also be used here?

I don't know why, but these cause tests to fail.
Maybe it is caused by my local cache? I'm going to make a clean build and test it again.
BTW, thanks for taking a look at this. Merry Christmas! ;D

I confirmed that we can't use emitUnaryBuiltin in the cases you pointed out without the tests failing. Please see the logs below:

$ path/to/llvm-project/build/bin/clang -cc1 -internal-isystem /path/to/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -triple x86_64-apple-darwin /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c -emit-llvm -disable-llvm-passes -o - | /path/to/llvm-project/build/bin/FileCheck /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c
/path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c:16:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.fabs.f32(float [[F1]])
                ^
<stdin>:35:43: note: scanning from here
 %0 = load float, float* %f1.addr, align 4
                                          ^
<stdin>:35:43: note: with "F1" equal to "%0"
 %0 = load float, float* %f1.addr, align 4
                                          ^
<stdin>:37:13: note: possible intended match here
 %elt.abs = call float @llvm.fabs.f32(float %1)
            ^

Input file: <stdin>
Check file: /path/to/llvm-project/clang/test/CodeGen/builtins-elementwise-math.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          30:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          31:  store <8 x i16> %vi2, <8 x i16>* %vi2.addr, align 16 
          32:  store i64 %i1, i64* %i1.addr, align 8 
          33:  store i64 %i2, i64* %i2.addr, align 8 
          34:  store i16 %si, i16* %si.addr, align 2 
          35:  %0 = load float, float* %f1.addr, align 4 
next:16'0                                               X error: no match found
next:16'1                                                 with "F1" equal to "%0"
          36:  %1 = load float, float* %f1.addr, align 4 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          37:  %elt.abs = call float @llvm.fabs.f32(float %1) 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:16'2                 ?                                    possible intended match
          38:  store float %elt.abs, float* %f2.addr, align 4 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          39:  %2 = load double, double* %d1.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          40:  %3 = load double, double* %d1.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          41:  %elt.abs1 = call double @llvm.fabs.f64(double %3) 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          42:  store double %elt.abs1, double* %d2.addr, align 8 
next:16'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

$ /path/to/llvm-project/build/bin/clang -cc1 -internal-isystem /path/to/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -triple x86_64-apple-darwin ~/dev/cpp-projects/llvm-project/clang/test/CodeGen/builtins-reduction-math.c -emit-llvm -disable-llvm-passes -o - | ~/dev/cpp-projects/llvm-project/build/bin/FileCheck ~/dev/cpp-projects/llvm-project/clang/test/CodeGen/builtins-reduction-math.c
/path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c:12:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.vector.reduce.fmax.v4f32(<4 x float> [[VF1]])
                ^
<stdin>:23:57: note: scanning from here
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:23:57: note: with "VF1" equal to "%0"
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:25:13: note: possible intended match here
 %rdx.min = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %1)
            ^
/path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c:38:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: call float @llvm.vector.reduce.fmin.v4f32(<4 x float> [[VF1]])
                ^
<stdin>:74:57: note: scanning from here
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:74:57: note: with "VF1" equal to "%0"
 %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16
                                                        ^
<stdin>:76:13: note: possible intended match here
 %rdx.min = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %1)
            ^

Input file: <stdin>
Check file: /path/to/llvm-project/clang/test/CodeGen/builtins-reduction-math.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          18:  %cvi1 = alloca <8 x i16>, align 16 
          19:  %r5 = alloca i64, align 8 
          20:  store <4 x float> %vf1, <4 x float>* %vf1.addr, align 16 
          21:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          22:  store <4 x i32> %vu1, <4 x i32>* %vu1.addr, align 16 
          23:  %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:12'0                                                             X error: no match found
next:12'1                                                               with "VF1" equal to "%0"
          24:  %1 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          25:  %rdx.min = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %1) 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:12'2                 ?                                                          possible intended match
          26:  store float %rdx.min, float* %r1, align 4 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          27:  %2 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          28:  %3 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          29:  %rdx.min1 = call i16 @llvm.vector.reduce.smax.v8i16(<8 x i16> %3) 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          30:  store i16 %rdx.min1, i16* %r2, align 2 
next:12'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          69:  %cvi1 = alloca <8 x i16>, align 16 
          70:  %r5 = alloca i64, align 8 
          71:  store <4 x float> %vf1, <4 x float>* %vf1.addr, align 16 
          72:  store <8 x i16> %vi1, <8 x i16>* %vi1.addr, align 16 
          73:  store <4 x i32> %vu1, <4 x i32>* %vu1.addr, align 16 
          74:  %0 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:38'0                                                             X error: no match found
next:38'1                                                               with "VF1" equal to "%0"
          75:  %1 = load <4 x float>, <4 x float>* %vf1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          76:  %rdx.min = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %1) 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:38'2                 ?                                                          possible intended match
          77:  store float %rdx.min, float* %r1, align 4 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          78:  %2 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          79:  %3 = load <8 x i16>, <8 x i16>* %vi1.addr, align 16 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          80:  %rdx.min1 = call i16 @llvm.vector.reduce.smin.v8i16(<8 x i16> %3) 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          81:  store i16 %rdx.min1, i16* %r2, align 2 
next:38'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

Do you know why this happens, @fhahn?

In D116161#3209286, @junaire wrote:

35:  %0 = load float, float* %f1.addr, align 4 
36:  %1 = load float, float* %f1.addr, align 4 
37:  %elt.abs = call float @llvm.fabs.f32(float %1)

It looks like the argument expression is evaluated twice. Did you remove the Value *Op0 = EmitScalarExpr(E->getArg(0)); calls?

In D116161#3209292, @fhahn wrote:
In D116161#3209286, @junaire wrote:
35:  %0 = load float, float* %f1.addr, align 4 
36:  %1 = load float, float* %f1.addr, align 4 
37:  %elt.abs = call float @llvm.fabs.f32(float %1)
It looks like the argument expression is evaluated twice. Did you remove the Value *Op0 = EmitScalarExpr(E->getArg(0)); calls?

Well, for example:
For __builtin_elementwise_abs, we have code like:

case Builtin::BI__builtin_elementwise_abs: {
  Value *Op0 = EmitScalarExpr(E->getArg(0));
  Value *Result;
  if (Op0->getType()->isIntOrIntVectorTy())
    Result = Builder.CreateBinaryIntrinsic(
        llvm::Intrinsic::abs, Op0, Builder.getFalse(), nullptr, "elt.abs");
  else
    Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::fabs, Op0, nullptr,
                                          "elt.abs");

  return RValue::get(Result);
}

If we use emitUnaryBuiltin, we are supposed to do something like:

Result = emitUnaryBuiltin(*this, E, llvm::Intrinsic::fabs, "elt.abs");

We need to pass CallExpr* E to the function to meet its interface, and it calls CGF.EmitScalarExpr(E->getArg(0)); inside. Maybe this is what you said about the argument expression being evaluated twice? I think it is unavoidable if we don't change the function's interface.
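The double evaluation discussed here can be reproduced with a small toy model. This is only a sketch: ToyCGF, toyEmitUnaryBuiltin and the two emitAbs* functions are hypothetical stand-ins, not Clang or LLVM APIs, and the "IR" is plain strings. It assumes, as in the log above, that every evaluation of the argument emits a fresh load.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for CodeGenFunction (hypothetical, not the real class).
// Every call to emitScalarExpr appends a fresh "load" line, the way
// evaluating an argument expression emits IR.
struct ToyCGF {
  std::vector<std::string> IR;
  unsigned NextTmp = 0;

  std::string emitScalarExpr(const std::string &Addr) {
    std::string Tmp = "%" + std::to_string(NextTmp++);
    IR.push_back(Tmp + " = load float, float* " + Addr);
    return Tmp;
  }

  std::string createUnaryIntrinsic(const std::string &Intrinsic,
                                   const std::string &Op,
                                   const std::string &Name) {
    IR.push_back("%" + Name + " = call float @" + Intrinsic + "(float " +
                 Op + ")");
    return "%" + Name;
  }
};

// Helper that evaluates the argument itself, like emitUnaryBuiltin does.
std::string toyEmitUnaryBuiltin(ToyCGF &CGF, const std::string &ArgAddr,
                                const std::string &Intrinsic,
                                const std::string &Name) {
  std::string Src0 = CGF.emitScalarExpr(ArgAddr);
  return CGF.createUnaryIntrinsic(Intrinsic, Src0, Name);
}

// Caller that evaluates the argument early and then also calls the helper,
// so the argument is evaluated twice.
std::vector<std::string> emitAbsEvaluatingTwice() {
  ToyCGF CGF;
  std::string Op0 = CGF.emitScalarExpr("%f1.addr"); // early evaluation
  (void)Op0; // the early value is not passed on to the helper
  toyEmitUnaryBuiltin(CGF, "%f1.addr", "llvm.fabs.f32", "elt.abs");
  return CGF.IR;
}

// Caller that leaves evaluation entirely to the helper.
std::vector<std::string> emitAbsEvaluatingOnce() {
  ToyCGF CGF;
  toyEmitUnaryBuiltin(CGF, "%f1.addr", "llvm.fabs.f32", "elt.abs");
  return CGF.IR;
}
```

In this model, emitAbsEvaluatingTwice produces two loads followed by a call on %1, which is the shape that makes the CHECK-NEXT line fail: [[F1]] binds to %0, but the next line is a second load rather than the call.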

Maybe we can have something like:

static Value *emitUnaryBuiltin(CodeGenFunction &CGF, Value* Op0,
                               unsigned IntrinsicID, llvm::StringRef Name) {
  return CGF.Builder.CreateUnaryIntrinsic(IntrinsicID, Op0, nullptr, Name);
}

Then for __builtin_elementwise_abs we can have:

Result = emitUnaryBuiltin(*this, Op0, llvm::Intrinsic::fabs, "elt.abs");

and for __builtin_elementwise_ceil have:

return RValue::get(
    emitUnaryBuiltin(*this, EmitScalarExpr(E->getArg(0)), llvm::Intrinsic::ceil, "elt.ceil"));

WDYT? Frankly speaking, I think this one-line function is ugly, but I can't come up with a more elegant solution. I would appreciate it if you could offer some suggestions. :)

In order to use emitUnaryBuiltin in other cases, I changed the function interface.
This allows us to use it in all Builder.CreateUnaryIntrinsic() cases, but will make
the function body very small.

Harbormaster completed remote builds in B140779: Diff 396380. Dec 28 2021, 12:59 AM

In D116161#3211178, @junaire wrote:

In order to use emitUnaryBuiltin in other cases, I changed the function interface.
This allows us to use it in all Builder.CreateUnaryIntrinsic() cases, but will make
the function body very small.

I think we should extend & use the existing emitUnaryBuiltin. I think in most cases where it cannot be used straight away you should be able to slightly rewrite the existing code to not rely on llvm::Type. Then there should be no need to call EmitScalarExpr early (an example is in the inline comments)

clang/lib/CodeGen/CGBuiltin.cpp

535

I think we should extend this emitUnaryBuiltin function, rather than having a second one.

e.g.

 static Value *emitUnaryBuiltin(CodeGenFunction &CGF,
                                const CallExpr *E,
-                               unsigned IntrinsicID) {
+                               unsigned IntrinsicID, llvm::StringRef Name = "") {
   llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));

   Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());
-  return CGF.Builder.CreateCall(F, Src0);
+  return CGF.Builder.CreateCall(F, Src0, Name);
+}
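The point of defaulting Name in the suggestion above is that existing three-argument call sites keep compiling unchanged while new call sites can name the result. A minimal sketch of that idea follows; toyEmitUnaryCall is a hypothetical helper that returns text instead of building IR, and it is not part of Clang.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy helper: returns the textual call it would emit.
// The defaulted Name parameter mirrors the defaulted llvm::StringRef Name
// in the suggested signature.
std::string toyEmitUnaryCall(const std::string &Intrinsic,
                             const std::string &Operand,
                             const std::string &Name = "") {
  std::string Call = "call @" + Intrinsic + "(" + Operand + ")";
  // Unnamed results are left as a bare call in this toy; named results
  // get a "%<Name> = " prefix.
  return Name.empty() ? Call : "%" + Name + " = " + Call;
}
```

A call such as toyEmitUnaryCall("llvm.ceil.f32", "%0") still works without a name, and passing "elt.ceil" as the third argument yields a named result. In real IR an unnamed result would get a numbered temporary; the toy simply omits the name.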

3201–3204

IIUC the only reason to call EmitScalarExpr early is the use in GetIntrinsicID, but it could solely rely on QualType:

-    auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) {
-      if (IrTy->isIntOrIntVectorTy()) {
-        if (auto *VecTy = QT->getAs<VectorType>())
-          QT = VecTy->getElementType();
-        if (QT->isSignedIntegerType())
-          return llvm::Intrinsic::vector_reduce_smax;
-        else
-          return llvm::Intrinsic::vector_reduce_umax;
-      }
+    auto GetIntrinsicID = [](QualType QT) {
+      if (auto *VecTy = QT->getAs<VectorType>())
+        QT = VecTy->getElementType();
+      if (QT->isSignedIntegerType())
+        return llvm::Intrinsic::vector_reduce_smax;
+      if (QT->isUnsignedIntegerType())
+        return llvm::Intrinsic::vector_reduce_umax;
+      assert(QT->isFloatingType() && "must have a float here");
       return llvm::Intrinsic::vector_reduce_fmax;
     };
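The selection logic in the rewritten lambda depends only on the source-level element type, so it can be modelled without any IR value. The sketch below assumes a hypothetical ToyQualType whose element kind has already been unwrapped from a vector type; the enum names are illustrative and are not LLVM intrinsic IDs.

```cpp
#include <cassert>

// Hypothetical stand-ins for what a QualType can report about its element.
enum class ToyElemKind { SignedInt, UnsignedInt, Floating };
enum class ToyReduceMax { SMax, UMax, FMax };

struct ToyQualType {
  ToyElemKind Elem; // element kind after unwrapping a vector type
  bool isSignedIntegerType() const { return Elem == ToyElemKind::SignedInt; }
  bool isUnsignedIntegerType() const {
    return Elem == ToyElemKind::UnsignedInt;
  }
  bool isFloatingType() const { return Elem == ToyElemKind::Floating; }
};

// Picks the reduction flavour from the source type alone, following the
// signed / unsigned / floating-point order of the suggested lambda.
ToyReduceMax toyGetReduceMaxID(const ToyQualType &QT) {
  if (QT.isSignedIntegerType())
    return ToyReduceMax::SMax;
  if (QT.isUnsignedIntegerType())
    return ToyReduceMax::UMax;
  assert(QT.isFloatingType() && "must have a float here");
  return ToyReduceMax::FMax;
}
```

Because the function never looks at an emitted value, the caller has no reason to evaluate the argument before handing the call expression to a helper that evaluates it once.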

Extend emitUnaryBuiltin instead of adding a new overload, also apply it to __builtin_elementwise_abs.

junaire retitled this revision from "[Clang] Add an overload for emitUnaryBuiltin." to "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic." Dec 28 2021, 11:18 PM

junaire edited the summary of this revision.

Refactor code a little bit and fix wrong names.

Harbormaster completed remote builds in B140865: Diff 396492. Dec 29 2021, 12:27 AM

LGTM, thanks!

This revision is now accepted and ready to land. Jan 2 2022, 1:08 PM

@junaire please let me know if you want me to land this on your behalf.

In D116161#3219080, @fhahn wrote:

@junaire please let me know if you want me to land this on your behalf.

Yeah, thanks a lot!

You can use:
Jun Zhang
jun@junz.org

This revision was landed with ongoing or failed builds. Jan 4 2022, 3:48 AM

Closed by commit rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic. (authored by junaire, committed by fhahn).

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG5c57e6aa5777: [Clang] Extend emitUnaryBuiltin to avoid duplicate logic.

In D116161#3219149, @junaire wrote:

Yeah, thanks a lot!

You can use:
Jun Zhang
jun@junz.org

Committed! After D115429, I'd recommend you obtain commit access https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

In D116161#3219343, @fhahn wrote:

Committed! After D115429, I'd recommend you obtain commit access https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

That's great! Thank you for your patient guidance and for the review work. I really appreciate it!

fhahn added a reverting change: rGf552ba6e8405: Revert "[Clang] Extend emitUnaryBuiltin to avoid duplicate logic.". Jan 4 2022, 5:46 AM

Revision Contents

Path                                              Size
clang/include/clang/Basic/Builtins.def            3 lines
clang/lib/CodeGen/CGBuiltin.cpp                   28 lines
clang/lib/Sema/SemaChecking.cpp                   7 lines
clang/test/CodeGen/builtins-elementwise-math.c    48 lines
clang/test/Sema/builtins-elementwise-math.c       63 lines

Diff 395852

clang/include/clang/Basic/Builtins.def

 BUILTIN(__builtin_alloca, "v*z" , "Fn")
 BUILTIN(__builtin_alloca_with_align, "v*zIz", "Fn")
 BUILTIN(__builtin_call_with_static_chain, "v.", "nt")

 BUILTIN(__builtin_elementwise_abs, "v.", "nct")
 BUILTIN(__builtin_elementwise_max, "v.", "nct")
 BUILTIN(__builtin_elementwise_min, "v.", "nct")
 BUILTIN(__builtin_elementwise_ceil, "v.", "nct")
+BUILTIN(__builtin_elementwise_floor, "v.", "nct")
+BUILTIN(__builtin_elementwise_roundeven, "v.", "nct")
+BUILTIN(__builtin_elementwise_trunc, "v.", "nct")
 BUILTIN(__builtin_reduce_max, "v.", "nct")
 BUILTIN(__builtin_reduce_min, "v.", "nct")

 BUILTIN(__builtin_matrix_transpose, "v.", "nFt")
 BUILTIN(__builtin_matrix_column_major_load, "v.", "nFt")
 BUILTIN(__builtin_matrix_column_major_store, "v.", "nFt")

 // "Overloaded" Atomic operator builtins. These are overloaded to support data

clang/lib/CodeGen/CGBuiltin.cpp

@@ static Value *emitCallMaybeConstrainedFPBuiltin(CodeGenFunction &CGF, @@
   if (CGF.Builder.getIsFPConstrained())
     return CGF.Builder.CreateConstrainedFPCall(F, Args);
   else
     return CGF.Builder.CreateCall(F, Args);
 }

 // Emit a simple mangled intrinsic that has 1 argument and a return type
 // matching the argument type.
 static Value *emitUnaryBuiltin(CodeGenFunction &CGF,
                                const CallExpr *E,
                                unsigned IntrinsicID) {
   llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));

   Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());
   return CGF.Builder.CreateCall(F, Src0);
 }

+static Value *emitUnaryBuiltin(CodeGenFunction &CGF, const CallExpr *E,
+                               unsigned IntrinsicID, llvm::StringRef Name) {
+  llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));
+  return CGF.Builder.CreateUnaryIntrinsic(IntrinsicID, Src0, nullptr, Name);
+}
+
 // Emit an intrinsic that has 2 operands of the same type as its result.
 static Value *emitBinaryBuiltin(CodeGenFunction &CGF,
                                 const CallExpr *E,
                                 unsigned IntrinsicID) {
   llvm::Value *Src0 = CGF.EmitScalarExpr(E->getArg(0));
   llvm::Value *Src1 = CGF.EmitScalarExpr(E->getArg(1));

   Function *F = CGF.CGM.getIntrinsic(IntrinsicID, Src0->getType());
@@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, @@
   }

   case Builtin::BI__builtin_elementwise_abs: {
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     Value *Result;
     if (Op0->getType()->isIntOrIntVectorTy())
       Result = Builder.CreateBinaryIntrinsic(
           llvm::Intrinsic::abs, Op0, Builder.getFalse(), nullptr, "elt.abs");
     else
-      Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::fabs, Op0, nullptr,
-                                            "elt.abs");
+      Result = emitUnaryBuiltin(*this, E, llvm::Intrinsic::fabs, "elt.abs");
     return RValue::get(Result);
   }

-  case Builtin::BI__builtin_elementwise_ceil: {
-    Value *Op0 = EmitScalarExpr(E->getArg(0));
-    Value *Result = Builder.CreateUnaryIntrinsic(llvm::Intrinsic::ceil, Op0,
-                                                 nullptr, "elt.ceil");
-    return RValue::get(Result);
-  }
+  case Builtin::BI__builtin_elementwise_ceil:
+    return RValue::get(
+        emitUnaryBuiltin(*this, E, llvm::Intrinsic::ceil, "elt.ceil"));
+  case Builtin::BI__builtin_elementwise_floor:
+    return RValue::get(
+        emitUnaryBuiltin(*this, E, llvm::Intrinsic::floor, "elt.floor"));
+  case Builtin::BI__builtin_elementwise_roundeven:
+    return RValue::get(emitUnaryBuiltin(*this, E, llvm::Intrinsic::roundeven,
+                                        "elt.roundeven"));
+  case Builtin::BI__builtin_elementwise_trunc:
+    return RValue::get(
+        emitUnaryBuiltin(*this, E, llvm::Intrinsic::trunc, "elt.trunc"));

   case Builtin::BI__builtin_elementwise_max: {
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     Value *Op1 = EmitScalarExpr(E->getArg(1));
     Value *Result;
     if (Op0->getType()->isIntOrIntVectorTy()) {
       QualType Ty = E->getArg(0)->getType();
       if (auto *VecTy = Ty->getAs<VectorType>())
         Ty = VecTy->getElementType();
@@ auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) { @@
           QT = VecTy->getElementType();
         if (QT->isSignedIntegerType())
           return llvm::Intrinsic::vector_reduce_smax;
         else
           return llvm::Intrinsic::vector_reduce_umax;
       }
       return llvm::Intrinsic::vector_reduce_fmax;
     };
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     Value *Result = Builder.CreateUnaryIntrinsic(
         GetIntrinsicID(E->getArg(0)->getType(), Op0->getType()), Op0, nullptr,
         "rdx.min");
     return RValue::get(Result);
   }

   case Builtin::BI__builtin_reduce_min: {
     auto GetIntrinsicID = [](QualType QT, llvm::Type *IrTy) {
       if (IrTy->isIntOrIntVectorTy()) {
         if (auto *VecTy = QT->getAs<VectorType>())
           QT = VecTy->getElementType();
         if (QT->isSignedIntegerType())
           return llvm::Intrinsic::vector_reduce_smin;
         else
           return llvm::Intrinsic::vector_reduce_umin;
       }
       return llvm::Intrinsic::vector_reduce_fmin;
     };
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     Value *Result = Builder.CreateUnaryIntrinsic(
         GetIntrinsicID(E->getArg(0)->getType(), Op0->getType()), Op0, nullptr,
         "rdx.min");
     return RValue::get(Result);
   }

   case Builtin::BI__builtin_matrix_transpose: {
     const auto *MatrixTy = E->getArg(0)->getType()->getAs<ConstantMatrixType>();
     Value *MatValue = EmitScalarExpr(E->getArg(0));

clang/lib/Sema/SemaChecking.cpp

@@ if (EltTy->isUnsignedIntegerType()) { @@
       Diag(TheCall->getArg(0)->getBeginLoc(),
            diag::err_builtin_invalid_arg_type)
           << 1 << /* signed integer or float ty*/ 3 << ArgTy;
       return ExprError();
     }
     break;
   }

-  // __builtin_elementwise_ceil restricts the element type to floating point
+  // These builtins restrict the element type to floating point
   // types only.
-  case Builtin::BI__builtin_elementwise_ceil: {
+  case Builtin::BI__builtin_elementwise_ceil:
+  case Builtin::BI__builtin_elementwise_floor:
+  case Builtin::BI__builtin_elementwise_roundeven:
+  case Builtin::BI__builtin_elementwise_trunc: {
     if (PrepareBuiltinElementwiseMathOneArgCall(TheCall))
       return ExprError();

     QualType ArgTy = TheCall->getArg(0)->getType();
     QualType EltTy = ArgTy;

     if (auto *VecTy = EltTy->getAs<VectorType>())
       EltTy = VecTy->getElementType();

clang/test/CodeGen/builtins-elementwise-math.c

Show First 20 Lines • Show All 199 Lines • ▼ Show 20 Lines	void test_builtin_elementwise_ceil(float f1, float f2, double d1, double d2,
// CHECK: [[D1:%.+]] = load double, double* %d1.addr, align 8		// CHECK: [[D1:%.+]] = load double, double* %d1.addr, align 8
// CHECK-NEXT: call double @llvm.ceil.f64(double [[D1]])		// CHECK-NEXT: call double @llvm.ceil.f64(double [[D1]])
d2 = __builtin_elementwise_ceil(d1);		d2 = __builtin_elementwise_ceil(d1);

// CHECK: [[VF1:%.+]] = load <4 x float>, <4 x float>* %vf1.addr, align 16		// CHECK: [[VF1:%.+]] = load <4 x float>, <4 x float>* %vf1.addr, align 16
// CHECK-NEXT: call <4 x float> @llvm.ceil.v4f32(<4 x float> [[VF1]])		// CHECK-NEXT: call <4 x float> @llvm.ceil.v4f32(<4 x float> [[VF1]])
vf2 = __builtin_elementwise_ceil(vf1);		vf2 = __builtin_elementwise_ceil(vf1);
}		}

		void test_builtin_elementwise_floor(float f1, float f2, double d1, double d2,
		float4 vf1, float4 vf2) {
		// CHECK-LABEL: define void @test_builtin_elementwise_floor(
		// CHECK: [[F1:%.+]] = load float, float* %f1.addr, align 4
		// CHECK-NEXT: call float @llvm.floor.f32(float [[F1]])
		f2 = __builtin_elementwise_floor(f1);

		// CHECK: [[D1:%.+]] = load double, double* %d1.addr, align 8
		// CHECK-NEXT: call double @llvm.floor.f64(double [[D1]])
		d2 = __builtin_elementwise_floor(d1);

		// CHECK: [[VF1:%.+]] = load <4 x float>, <4 x float>* %vf1.addr, align 16
		// CHECK-NEXT: call <4 x float> @llvm.floor.v4f32(<4 x float> [[VF1]])
		vf2 = __builtin_elementwise_floor(vf1);
		}

		void test_builtin_elementwise_roundeven(float f1, float f2, double d1, double d2,
		float4 vf1, float4 vf2) {
		// CHECK-LABEL: define void @test_builtin_elementwise_roundeven(
		// CHECK: [[F1:%.+]] = load float, float* %f1.addr, align 4
		// CHECK-NEXT: call float @llvm.roundeven.f32(float [[F1]])
		f2 = __builtin_elementwise_roundeven(f1);

		// CHECK: [[D1:%.+]] = load double, double* %d1.addr, align 8
		// CHECK-NEXT: call double @llvm.roundeven.f64(double [[D1]])
		d2 = __builtin_elementwise_roundeven(d1);

		// CHECK: [[VF1:%.+]] = load <4 x float>, <4 x float>* %vf1.addr, align 16
		// CHECK-NEXT: call <4 x float> @llvm.roundeven.v4f32(<4 x float> [[VF1]])
		vf2 = __builtin_elementwise_roundeven(vf1);
		}

		void test_builtin_elementwise_trunc(float f1, float f2, double d1, double d2,
		float4 vf1, float4 vf2) {
		// CHECK-LABEL: define void @test_builtin_elementwise_trunc(
		// CHECK: [[F1:%.+]] = load float, float* %f1.addr, align 4
		// CHECK-NEXT: call float @llvm.trunc.f32(float [[F1]])
		f2 = __builtin_elementwise_trunc(f1);

		// CHECK: [[D1:%.+]] = load double, double* %d1.addr, align 8
		// CHECK-NEXT: call double @llvm.trunc.f64(double [[D1]])
		d2 = __builtin_elementwise_trunc(d1);

		// CHECK: [[VF1:%.+]] = load <4 x float>, <4 x float>* %vf1.addr, align 16
		// CHECK-NEXT: call <4 x float> @llvm.trunc.v4f32(<4 x float> [[VF1]])
		vf2 = __builtin_elementwise_trunc(vf1);
		}
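The CodeGen tests above only check that each builtin lowers to the matching `llvm.floor`, `llvm.roundeven`, or `llvm.trunc` intrinsic; they do not exercise the numeric results. As a rough illustration of what those three rounding modes compute per lane, here is a minimal sketch in plain C. The helper names (`ref_trunc`, `ref_floor`, `ref_roundeven`, `map_lanes`, `self_check`) are hypothetical and not part of the patch, and the arithmetic is assumed valid only for finite inputs whose magnitude fits in `long long`; the sign of zero is not preserved.

```c
#include <assert.h>

/* Hypothetical reference helpers, not taken from the patch.
   Assumption: inputs are finite and fit in long long. */

/* Round toward zero: the C float-to-integer conversion drops the fraction. */
static float ref_trunc(float x) {
  return (float)(long long)x;
}

/* Round toward negative infinity: step down when truncation went up. */
static float ref_floor(float x) {
  float t = ref_trunc(x);
  return (t > x) ? t - 1.0f : t;
}

/* Round to nearest, with exact ties going to the even neighbour. */
static float ref_roundeven(float x) {
  float lo = ref_floor(x);
  float frac = x - lo; /* lies in [0, 1) */
  if (frac < 0.5f)
    return lo;
  if (frac > 0.5f)
    return lo + 1.0f;
  return ((long long)lo % 2 == 0) ? lo : lo + 1.0f;
}

/* Apply a scalar helper to each of n lanes, mimicking elementwise behaviour. */
static void map_lanes(float (*op)(float), const float *in, float *out, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = op(in[i]);
}

/* Small check over a 4-lane input, analogous in shape to float4. */
static void self_check(void) {
  const float in[4] = {-1.5f, -0.5f, 0.5f, 2.5f};
  float out[4];

  map_lanes(ref_floor, in, out, 4);
  assert(out[0] == -2.0f && out[1] == -1.0f && out[2] == 0.0f && out[3] == 2.0f);

  map_lanes(ref_trunc, in, out, 4);
  assert(out[0] == -1.0f && out[1] == 0.0f && out[2] == 0.0f && out[3] == 2.0f);

  map_lanes(ref_roundeven, in, out, 4);
  assert(out[0] == -2.0f && out[1] == 0.0f && out[2] == 0.0f && out[3] == 2.0f);
}
```

Calling `self_check()` from a `main` function runs the three comparisons. The inputs are chosen so that the modes disagree: `-1.5f` gives `-2` under floor and roundeven but `-1` under trunc, and `-0.5f` gives `-1` under floor but `0` under the other two.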

clang/test/Sema/builtins-elementwise-math.c

void test_builtin_elementwise_ceil(int i, float f, double d, float4 v, int3 iv, unsigned u, unsigned4 uv) {
[...]
// expected-error@-1 {{too many arguments to function call, expected 1, have 2}}

u = __builtin_elementwise_ceil(u);
// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned int')}}

uv = __builtin_elementwise_ceil(uv);
// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned4' (vector of 4 'unsigned int' values))}}
}

		void test_builtin_elementwise_floor(int i, float f, double d, float4 v, int3 iv, unsigned u, unsigned4 uv) {

		struct Foo s = __builtin_elementwise_floor(f);
		// expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'float'}}

		i = __builtin_elementwise_floor();
		// expected-error@-1 {{too few arguments to function call, expected 1, have 0}}

		i = __builtin_elementwise_floor(i);
		// expected-error@-1 {{1st argument must be a floating point type (was 'int')}}

		i = __builtin_elementwise_floor(f, f);
		// expected-error@-1 {{too many arguments to function call, expected 1, have 2}}

		u = __builtin_elementwise_floor(u);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned int')}}

		uv = __builtin_elementwise_floor(uv);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned4' (vector of 4 'unsigned int' values))}}
		}

		void test_builtin_elementwise_roundeven(int i, float f, double d, float4 v, int3 iv, unsigned u, unsigned4 uv) {

		struct Foo s = __builtin_elementwise_roundeven(f);
		// expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'float'}}

		i = __builtin_elementwise_roundeven();
		// expected-error@-1 {{too few arguments to function call, expected 1, have 0}}

		i = __builtin_elementwise_roundeven(i);
		// expected-error@-1 {{1st argument must be a floating point type (was 'int')}}

		i = __builtin_elementwise_roundeven(f, f);
		// expected-error@-1 {{too many arguments to function call, expected 1, have 2}}

		u = __builtin_elementwise_roundeven(u);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned int')}}

		uv = __builtin_elementwise_roundeven(uv);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned4' (vector of 4 'unsigned int' values))}}
		}

		void test_builtin_elementwise_trunc(int i, float f, double d, float4 v, int3 iv, unsigned u, unsigned4 uv) {

		struct Foo s = __builtin_elementwise_trunc(f);
		// expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'float'}}

		i = __builtin_elementwise_trunc();
		// expected-error@-1 {{too few arguments to function call, expected 1, have 0}}

		i = __builtin_elementwise_trunc(i);
		// expected-error@-1 {{1st argument must be a floating point type (was 'int')}}

		i = __builtin_elementwise_trunc(f, f);
		// expected-error@-1 {{too many arguments to function call, expected 1, have 2}}

		u = __builtin_elementwise_trunc(u);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned int')}}

		uv = __builtin_elementwise_trunc(uv);
		// expected-error@-1 {{1st argument must be a floating point type (was 'unsigned4' (vector of 4 'unsigned int' values))}}
		}

Revision Contents

Diff 395852

clang/include/clang/Basic/Builtins.def

clang/lib/CodeGen/CGBuiltin.cpp

clang/lib/Sema/SemaChecking.cpp

clang/test/CodeGen/builtins-elementwise-math.c

clang/test/Sema/builtins-elementwise-math.c
