This is an archive of the discontinued LLVM Phabricator instance.

[Builtins] Implement __builtin_clrsb to be compatible with gcc
ClosedPublic

Authored by craig.topper on Aug 1 2018, 6:25 PM.


Details

Reviewers

bkramer
efriedma
spatel
javed.absar

Commits

rG0a4f6be4434c: [Builtins] Implement __builtin_clrsb to be compatible with gcc
rL339282: [Builtins] Implement __builtin_clrsb to be compatible with gcc
rC339282: [Builtins] Implement __builtin_clrsb to be compatible with gcc

Summary

gcc defines an intrinsic called __builtin_clrsb which counts the number of extra sign bits on a number. This is equivalent to counting the number of leading zeros on a positive number or the number of leading ones on a negative number and subtracting one from the result. Since we can't count leading ones we need to invert negative numbers to count zeros.
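As a rough illustration of that definition, here is a minimal C sketch for 32-bit values. The helper names `clz32`, `clrsb32`, and `clrsb32_check` are hypothetical and are not part of the patch; `clz32` is assumed to return 32 for a zero input, mirroring ctlz with zero_undef=false.

```c
#include <assert.h>
#include <stdint.h>

/* Leading-zero count that is defined for zero (returns 32).
   Hypothetical helper, written as a plain loop for clarity. */
static int clz32(uint32_t v) {
    int n = 0;
    for (uint32_t mask = UINT32_C(1) << 31; mask != 0 && (v & mask) == 0;
         mask >>= 1)
        ++n;
    return n;
}

/* clrsb(x) = clz(x < 0 ? ~x : x) - 1
   Negative inputs are inverted so that leading ones become leading zeros. */
static int clrsb32(int32_t x) {
    uint32_t u = (uint32_t)x;
    if (x < 0)
        u = ~u;
    return clz32(u) - 1;
}

/* A few spot checks of the definition. */
static void clrsb32_check(void) {
    assert(clrsb32(0) == 31);          /* every bit below the sign bit matches it */
    assert(clrsb32(-1) == 31);
    assert(clrsb32(1) == 30);
    assert(clrsb32(INT32_MIN) == 0);   /* bit 30 already differs from the sign */
}
```

Both 0 and -1 fold to a zero operand, which is why the leading-zero count has to be well defined for zero in this formulation.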

The emitted sequence contains a bit of trickery stolen from an LLVM AArch64 test arm64-clrsb.ll to prevent passing a value of 0 to ctlz. I used an icmp slt and a select to conditionally invert, but InstCombine will turn that into an ashr+xor. I can emit that directly if that's preferred. I know @spatel has been trying to remove some of the bit tricks from InstCombine so I'm not sure if the ashr+xor form will be canonical going forward.

This patch will cause the builtin to be expanded inline while gcc uses a call to a function like __clrsbdi2 that is implemented in libgcc. But this is similar to what we already do for popcnt. And I don't think compiler-rt supports __clrsbdi2.

Diff Detail

Repository: rC Clang

Event Timeline

craig.topper created this revision. Aug 1 2018, 6:25 PM

Herald added a reviewer: javed.absar. Aug 1 2018, 6:25 PM

Herald added a subscriber: kristof.beyls.

Ping

Test case?

lib/CodeGen/CGBuiltin.cpp
1563	CreateIntCast just does nothing if the types match, so this check isn't needed.

Add the test case that I failed to pick up in the original diff.

About the bit hacking: I don't think clang should be in the optimization business. We should be able to take the most obvious/simple representation for this builtin and reduce it as needed (either in instcombine or the backend). So it would be better to use the version with the subtract rather than shift/or. That version corresponds more directly to the code in APInt::getMinSignedBits()?
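For comparison, the two forms under discussion can be sketched in C as follows. This is a reading of the thread rather than code from the patch: the shift/or variant is assumed to be the arm64-clrsb.ll-style sequence that shifts the folded value left by one and ORs in a 1 so that ctlz never sees zero. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Leading-zero count defined for zero (returns 32); hypothetical helper. */
static int ctlz32(uint32_t v) {
    int n = 0;
    for (uint32_t mask = UINT32_C(1) << 31; mask != 0 && (v & mask) == 0;
         mask >>= 1)
        ++n;
    return n;
}

/* x < 0 ? ~x : x, written as x ^ (x >> 31). The arithmetic shift is
   emulated with unsigned arithmetic to stay within defined behavior. */
static uint32_t fold_sign(int32_t x) {
    uint32_t u = (uint32_t)x;
    uint32_t mask = (uint32_t)(0u - (u >> 31)); /* all ones if x < 0, else 0 */
    return u ^ mask;
}

/* Subtract form: needs ctlz to be defined for a zero operand. */
static int clrsb32_sub(int32_t x) {
    return ctlz32(fold_sign(x)) - 1;
}

/* Shift/or form: the operand is never zero, so no subtract is needed. */
static int clrsb32_shift_or(int32_t x) {
    return ctlz32((uint32_t)((fold_sign(x) << 1) | 1u));
}

/* Check that both forms agree on a handful of edge cases. */
static void clrsb32_forms_check(void) {
    const int32_t samples[] = {0, 1, -1, 2, -2, 0x00010000,
                               INT32_MAX, INT32_MIN};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(clrsb32_sub(samples[i]) == clrsb32_shift_or(samples[i]));
}
```

The folded value always has a clear top bit, so the left shift discards nothing. The subtract form is the more literal transcription of the definition, which is the property the comment above argues for.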

include/clang/Basic/Builtins.def
416	Is it intentional that clang doesn't document the behavior of the builtins that it copies from gcc? I'd think logic descriptions for all of these would be handy, but especially for less common ones like 'clrsb'.

Use ctlz(zero_undef=false) and sub

This revision is now accepted and ready to land. Aug 8 2018, 11:18 AM

Closed by commit rC339282: [Builtins] Implement __builtin_clrsb to be compatible with gcc (authored by ctopper). Aug 8 2018, 12:56 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: kristina. Aug 8 2018, 12:56 PM

tianqing added a subscriber: tianqing. Jan 22 2019, 6:46 PM

Revision Contents

Path	Size
include/clang/Basic/Builtins.def	3 lines
lib/CodeGen/CGBuiltin.cpp	20 lines
test/CodeGen/builtin_clrsb.c	22 lines

Diff 159769

include/clang/Basic/Builtins.def

 BUILTIN(__builtin_ffsl , "iLi" , "Fnc")
 BUILTIN(__builtin_ffsll, "iLLi", "Fnc")
 BUILTIN(__builtin_parity , "iUi" , "nc")
 BUILTIN(__builtin_parityl , "iULi" , "nc")
 BUILTIN(__builtin_parityll, "iULLi", "nc")
 BUILTIN(__builtin_popcount , "iUi" , "nc")
 BUILTIN(__builtin_popcountl , "iULi" , "nc")
 BUILTIN(__builtin_popcountll, "iULLi", "nc")
+BUILTIN(__builtin_clrsb , "ii" , "nc")
+BUILTIN(__builtin_clrsbl , "iLi" , "nc")
+BUILTIN(__builtin_clrsbll, "iLLi", "nc")

 // FIXME: These type signatures are not correct for targets with int != 32-bits
 // or with ULL != 64-bits.
 BUILTIN(__builtin_bswap16, "UsUs", "nc")
 BUILTIN(__builtin_bswap32, "UiUi", "nc")
 BUILTIN(__builtin_bswap64, "ULLiULLi", "nc")

 BUILTIN(__builtin_bitreverse8, "UcUc", "nc")

lib/CodeGen/CGBuiltin.cpp


@@ RValue CodeGenFunction::EmitBuiltinExpr(const FunctionDecl *FD,
   case Builtin::BI__builtin_cimagl:
   case Builtin::BIcimag:
   case Builtin::BIcimagf:
   case Builtin::BIcimagl: {
     ComplexPairTy ComplexVal = EmitComplexExpr(E->getArg(0));
     return RValue::get(ComplexVal.second);
   }

+  case Builtin::BI__builtin_clrsb:
+  case Builtin::BI__builtin_clrsbl:
+  case Builtin::BI__builtin_clrsbll: {
+    // clrsb(x) -> clz(x < 0 ? ~x : x) - 1 or
+    Value *ArgValue = EmitScalarExpr(E->getArg(0));
+
+    llvm::Type *ArgType = ArgValue->getType();
+    Value *F = CGM.getIntrinsic(Intrinsic::ctlz, ArgType);
+
+    llvm::Type *ResultType = ConvertType(E->getType());
+    Value *Zero = llvm::Constant::getNullValue(ArgType);
+    Value *IsNeg = Builder.CreateICmpSLT(ArgValue, Zero, "isneg");
+    Value *Inverse = Builder.CreateNot(ArgValue, "not");
+    Value *Tmp = Builder.CreateSelect(IsNeg, Inverse, ArgValue);
+    Value *Ctlz = Builder.CreateCall(F, {Tmp, Builder.getFalse()});
+    Value *Result = Builder.CreateSub(Ctlz, llvm::ConstantInt::get(ArgType, 1));
+    Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
+                                   "cast");
+    return RValue::get(Result);
+  }
   case Builtin::BI__builtin_ctzs:
   case Builtin::BI__builtin_ctz:
   case Builtin::BI__builtin_ctzl:
   case Builtin::BI__builtin_ctzll: {
     Value *ArgValue = EmitCheckedArgForBuiltin(E->getArg(0), BCK_CTZPassedZero);

     llvm::Type *ArgType = ArgValue->getType();
     Value *F = CGM.getIntrinsic(Intrinsic::cttz, ArgType);

     llvm::Type *ResultType = ConvertType(E->getType());
     Value *ZeroUndef = Builder.getInt1(getTarget().isCLZForZeroUndef());
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});

test/CodeGen/builtin_clrsb.c

Property	Old Value	New Value
svn:eol-style	null	native \ No newline at end of property
svn:keywords	null	"Author Date Id Rev URL" \ No newline at end of property
svn:mime-type	null	text/plain \ No newline at end of property

+// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s
+
+int test__builtin_clrsb(int x) {
+// CHECK-LABEL: test__builtin_clrsb
+// CHECK: [[C:%.*]] = icmp slt i32 [[X:%.*]], 0
+// CHECK-NEXT: [[INV:%.*]] = xor i32 [[X]], -1
+// CHECK-NEXT: [[SEL:%.*]] = select i1 [[C]], i32 [[INV]], i32 [[X]]
+// CHECK-NEXT: [[CTLZ:%.*]] = call i32 @llvm.ctlz.i32(i32 [[SEL]], i1 false)
+// CHECK-NEXT: [[SUB:%.*]] = sub i32 [[CTLZ]], 1
+  return __builtin_clrsb(x);
+}
+
+int test__builtin_clrsbll(long long x) {
+// CHECK-LABEL: test__builtin_clrsbll
+// CHECK: [[C:%.*]] = icmp slt i64 [[X:%.*]], 0
+// CHECK-NEXT: [[INV:%.*]] = xor i64 [[X]], -1
+// CHECK-NEXT: [[SEL:%.*]] = select i1 [[C]], i64 [[INV]], i64 [[X]]
+// CHECK-NEXT: [[CTLZ:%.*]] = call i64 @llvm.ctlz.i64(i64 [[SEL]], i1 false)
+// CHECK-NEXT: [[SUB:%.*]] = sub i64 [[CTLZ]], 1
+// CHECK-NEXT: trunc i64 [[SUB]] to i32
+  return __builtin_clrsbll(x);
+}
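
The second test checks a trunc to i32 because the builtin returns int even for a long long argument. A minimal C sketch of that 64-bit behavior follows, assuming a 64-bit long long; `clrsb64_portable` and `clrsb64_check` are hypothetical names, and the comparison against `__builtin_clrsbll` is compiled only when the compiler reports the builtin through `__has_builtin`.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how many bits directly below the sign bit are equal to it.
   The result fits in an int regardless of the operand width. */
static int clrsb64_portable(int64_t x) {
    uint64_t u = (uint64_t)x;
    uint64_t sign = u >> 63; /* 1 for negative inputs, 0 otherwise */
    int count = 0;
    /* Walk down from bit 62 while each bit still matches the sign bit. */
    for (int bit = 62; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
        ++count;
    return count;
}

static void clrsb64_check(void) {
    assert(clrsb64_portable(0) == 63);
    assert(clrsb64_portable(-1) == 63);
    assert(clrsb64_portable(1) == 62);
    assert(clrsb64_portable(INT64_MIN) == 0);
#if defined(__has_builtin)
#if __has_builtin(__builtin_clrsbll)
    /* Cross-check against the compiler builtin when it is available. */
    assert(__builtin_clrsbll(1LL) == clrsb64_portable(1));
    assert(__builtin_clrsbll(-1LL) == clrsb64_portable(-1));
#endif
#endif
}
```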