This is an archive of the discontinued LLVM Phabricator instance.


Differential D149077

[InstCombine] Add !noundef if is guaranteed do not violate !range
Needs Revision · Public

Authored by StephenFan on Apr 24 2023, 9:54 AM.


Details

Reviewers

nikic

Summary

Since D141386, violating !range returns poison. However, folding select
to and/or i1 isn't poison-safe in general. If we can prove that the
!range can never be violated, we can add !noundef so that the result is
guaranteed not to be a poison value.

Violating !noundef causes immediate undefined behavior (IUB), so in
principle the instruction would no longer be speculatable. However,
metadata that may raise IUB is not considered by
isSafeToSpeculativelyExecute, since such metadata can be dropped.

Therefore, adding !noundef can improve folding of select, with no
foreseeable side effects.
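
To make the poison-safety concern in the summary concrete, here is a minimal sketch in plain C++ (not LLVM code). It models an i1 value as std::optional<bool>, with std::nullopt standing in for poison. The names I1, selectI1 and andI1 are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <optional>

// Toy model of an i1 value: std::nullopt stands for poison.
using I1 = std::optional<bool>;

// Models `select i1 %c, i1 %t, i1 %f`: only the chosen arm is observed,
// so poison in the arm that is not chosen does not reach the result.
I1 selectI1(I1 C, I1 T, I1 F) {
  if (!C)
    return std::nullopt; // a poison condition gives a poison result
  return *C ? T : F;
}

// Models `and i1 %a, %b`: poison in either operand poisons the result.
I1 andI1(I1 A, I1 B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}
```

In this model, with %b = false and %cmp = poison, selectI1(false, poison, false) yields false while andI1(false, poison) yields poison. The and form is therefore only a valid replacement when %cmp being poison implies that %b is poison too, for example when %cmp can never be poison.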

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

StephenFan created this revision. Apr 24 2023, 9:54 AM

Herald added a project: Restricted Project. Apr 24 2023, 9:54 AM

Herald added a subscriber: hiraditya.

StephenFan requested review of this revision. Apr 24 2023, 9:54 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B227760: Diff 516447. Apr 24 2023, 10:31 AM

If the instruction itself can create poison or undef, it is not guaranteed not to violate the !range metadata.

Harbormaster completed remote builds in B227887: Diff 516625. Apr 24 2023, 11:02 PM

Can you please explain what the larger motivation for this is? Can you show the whole pattern you're trying to optimize? I really don't want to do this without strong motivation.

Also, the !noundef metadata should be a noundef attribute. Actually, this should be enforced by the verifier the same way it is for !nonnull and !align; it looks like we're missing a check.

In D149077#4294864, @nikic wrote:

Can you please explain what the larger motivation for this is? Can you show the whole pattern you're trying to optimize? I really don't want to do this without strong motivation.

Also, the !noundef metadata should be a noundef attribute. Actually, this should be enforced by the verifier the same way it is for !nonnull and !align; it looks like we're missing a check.

The larger motivation is that we can't instcombine %r = select i1 %b, i1 %cmp, i1 false to %r = and i1 %b, %cmp in the following example, which caused code size regressions on lzbench: https://github.com/dtcxzyw/llvm-ci/issues/135

define i1 @poison_metadata(i32 noundef %a, i1 %b) {
  %c = call i32 @llvm.ctpop.i32(i32 %a), !range !0
  %cmp = icmp ugt i32 %c, 2
  %r = select i1 %b, i1 %cmp, i1 false
  ret i1 %r
}

!0 = !{i32 0, i32 4}
!1 = !{}
declare i32 @llvm.ctpop.i32(i32)

I think the right way to fix that would be to support poison flag/metadata drop in impliesPoison() transform. Even if we can't prove that the metadata does not produce poison, we can still drop it.

In D149077#4295353, @nikic wrote:

I think the right way to fix that would be to support poison flag/metadata drop in impliesPoison() transform. Even if we can't prove that the metadata does not produce poison, we can still drop it.

To expand on this: The general policy is that the presence of poison flags/metadata should not block transforms; instead, we should drop the flags/metadata to allow the transform. For example, when we push freeze into operands, we call canCreateUndefOrPoison with ConsiderFlagsAndMetadata=false and drop those flags. The impliesPoison()-based select -> and/or transform currently doesn't do this, so we may fail the transform in cases where it would succeed if we dropped flags. Supporting this will need some API changes; in particular, impliesPoison() will have to collect the instructions on which flags need to be dropped.

I believe this is the proper way to fix this, which will also address other uses of flags/metadata.
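
The "collect, then drop" idea described above can be sketched on a toy value graph. This is an illustration only and not LLVM's API: ToyValue, neverPoisonAfterDrop and tryMakePoisonFree are hypothetical names, and the real impliesPoison() has a different signature and a more general condition. The sketch covers just the simplest sufficient case, where a value can never be poison once its flags/metadata are removed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for an IR value.
struct ToyValue {
  bool IsNoundefLeaf = false;               // e.g. a noundef argument
  bool HasPoisonMetadata = false;           // e.g. carries !range
  bool CanCreatePoisonWithoutFlags = false; // poison source that cannot be dropped
  std::vector<ToyValue *> Operands;
};

// Returns true if V cannot be poison once the flags/metadata on the values
// appended to DropFrom are removed. On failure DropFrom may hold partial
// results, so callers must only act on it when this returns true.
bool neverPoisonAfterDrop(ToyValue *V, std::vector<ToyValue *> &DropFrom) {
  if (V->IsNoundefLeaf)
    return true;
  if (V->Operands.empty())
    return false; // unknown leaf: may be poison
  if (V->CanCreatePoisonWithoutFlags)
    return false;
  for (ToyValue *Op : V->Operands)
    if (!neverPoisonAfterDrop(Op, DropFrom))
      return false;
  if (V->HasPoisonMetadata)
    DropFrom.push_back(V); // remember where metadata must be dropped
  return true;
}

// Drops the collected metadata only if the whole check succeeds.
bool tryMakePoisonFree(ToyValue *V) {
  std::vector<ToyValue *> DropFrom;
  if (!neverPoisonAfterDrop(V, DropFrom))
    return false;
  for (ToyValue *I : DropFrom)
    I->HasPoisonMetadata = false;
  return true;
}
```

Applied to the earlier example, %a is a noundef leaf, the ctpop call carries !range, and %cmp uses the call. The check succeeds and records the ctpop call, so the transform could proceed after dropping !range rather than being blocked by it.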

This revision now requires changes to proceed. Apr 26 2023, 1:09 AM

In D149077#4298207, @nikic wrote:

In D149077#4295353, @nikic wrote:

I think the right way to fix that would be to support poison flag/metadata drop in impliesPoison() transform. Even if we can't prove that the metadata does not produce poison, we can still drop it.

To expand on this: The general policy is that the presence of poison flags/metadata should not block transforms; instead, we should drop the flags/metadata to allow the transform. For example, when we push freeze into operands, we call canCreateUndefOrPoison with ConsiderFlagsAndMetadata=false and drop those flags. The impliesPoison()-based select -> and/or transform currently doesn't do this, so we may fail the transform in cases where it would succeed if we dropped flags. Supporting this will need some API changes; in particular, impliesPoison() will have to collect the instructions on which flags need to be dropped.

I believe this is the proper way to fix this, which will also address other uses of flags/metadata.

Thanks for the detailed explanation! @nikic

Revision Contents

Path                                                    Size
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp    26 lines
llvm/test/Transforms/InstCombine/ctpop.ll               9 lines
llvm/test/Transforms/InstCombine/cttz.ll                22 lines

Diff 516625

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

...
   if (Result->getType()->getPointerAddressSpace() !=
       II.getType()->getPointerAddressSpace())
     Result = IC.Builder.CreateAddrSpaceCast(Result, II.getType());
   if (Result->getType() != II.getType())
     Result = IC.Builder.CreateBitCast(Result, II.getType());

   return cast<Instruction>(Result);
 }

+static bool isGuaranteedDoNotViolateRangeMetadata(Instruction *I,
+                                                  unsigned KnownMin,
+                                                  unsigned KnownMax) {
+  if (MDNode *Node = I->getMetadata(LLVMContext::MD_range))
+    if (mdconst::extract<ConstantInt>(Node->getOperand(0))
+            ->equalsInt(KnownMin) &&
+        mdconst::extract<ConstantInt>(Node->getOperand(1))->equalsInt(KnownMax))
+      if (!canCreateUndefOrPoison(cast<Operator>(I),
+                                  /*ConsiderFlagsAndMetadata*/ false) &&
+          all_of(I->operands(),
+                 [&](Value *V) { return isGuaranteedNotToBeUndefOrPoison(V); }))
+        return true;
+  return false;
+}
+
 static Instruction *foldCttzCtlz(IntrinsicInst &II, InstCombinerImpl &IC) {
   assert((II.getIntrinsicID() == Intrinsic::cttz ||
           II.getIntrinsicID() == Intrinsic::ctlz) &&
          "Expected cttz or ctlz intrinsic");
   bool IsTZ = II.getIntrinsicID() == Intrinsic::cttz;
   Value *Op0 = II.getArgOperand(0);
   Value *Op1 = II.getArgOperand(1);
   Value *X;
...
   if (IT && IT->getBitWidth() != 1 && !II.getMetadata(LLVMContext::MD_range)) {
     Metadata *LowAndHigh[] = {
         ConstantAsMetadata::get(ConstantInt::get(IT, DefiniteZeros)),
         ConstantAsMetadata::get(ConstantInt::get(IT, PossibleZeros + 1))};
     II.setMetadata(LLVMContext::MD_range,
                    MDNode::get(II.getContext(), LowAndHigh));
     return &II;
   }

+  if (!II.getMetadata(LLVMContext::MD_noundef) &&
+      isGuaranteedDoNotViolateRangeMetadata(&II, DefiniteZeros,
+                                            PossibleZeros + 1))
+    II.setMetadata(LLVMContext::MD_noundef,
+                   MDNode::get(II.getContext(), nullptr));
+
   return nullptr;
 }

 static Instruction *foldCtpop(IntrinsicInst &II, InstCombinerImpl &IC) {
   assert(II.getIntrinsicID() == Intrinsic::ctpop &&
          "Expected ctpop intrinsic");
   Type *Ty = II.getType();
   unsigned BitWidth = Ty->getScalarSizeInBits();
...
   if (IT->getBitWidth() != 1 && !II.getMetadata(LLVMContext::MD_range)) {
     Metadata *LowAndHigh[] = {
         ConstantAsMetadata::get(ConstantInt::get(IT, MinCount)),
         ConstantAsMetadata::get(ConstantInt::get(IT, MaxCount + 1))};
     II.setMetadata(LLVMContext::MD_range,
                    MDNode::get(II.getContext(), LowAndHigh));
     return &II;
   }

+  if (!II.getMetadata(LLVMContext::MD_noundef) &&
+      isGuaranteedDoNotViolateRangeMetadata(&II, MinCount, MaxCount + 1))
+    II.setMetadata(LLVMContext::MD_noundef,
+                   MDNode::get(II.getContext(), nullptr));
+
   return nullptr;
 }

 /// Convert a table lookup to shufflevector if the mask is constant.
 /// This could benefit tbl1 if the mask is { 7,6,5,4,3,2,1,0 }, in
 /// which case we could lower the shufflevector with rev64 instructions
 /// as it's actually a byte reverse.
 static Value *simplifyNeonTbl1(const IntrinsicInst &II,
...

llvm/test/Transforms/InstCombine/ctpop.ll

...
 ;
   %i = tail call i32 @llvm.ctpop.i32(i32 %arg1)
   %i2 = and i32 %i, 1
   tail call void @use(i32 %i2)
   %i3 = tail call i32 @llvm.ctpop.i32(i32 %arg)
   %i4 = and i32 %i3, 1
   %i5 = xor i32 %i2, %i4
   ret i32 %i5
 }

+define i8 @arg_noundef(i8 noundef %arg) {
+; CHECK-LABEL: @arg_noundef(
+; CHECK-NEXT:    [[CNT:%.*]] = call i8 @llvm.ctpop.i8(i8 [[ARG:%.*]]), !range [[RNG0]], !noundef !6
+; CHECK-NEXT:    ret i8 [[CNT]]
+;
+  %cnt = call i8 @llvm.ctpop.i8(i8 %arg)
+  ret i8 %cnt
+}

llvm/test/Transforms/InstCombine/cttz.ll

...
 ; CHECK-NEXT:    [[TMP1:%.*]] = zext <2 x i32> [[X:%.*]] to <2 x i64>
 ; CHECK-NEXT:    [[TZ:%.*]] = call <2 x i64> @llvm.cttz.v2i64(<2 x i64> [[TMP1]], i1 false), !range [[RNG2]]
 ; CHECK-NEXT:    ret <2 x i64> [[TZ]]
 ;
   %s = sext <2 x i32> %x to <2 x i64>
   %tz = tail call <2 x i64> @llvm.cttz.v2i64(<2 x i64> %s, i1 false)
   ret <2 x i64> %tz
 }

+define i32 @cttz_arg_noundef_zero_poison(i16 noundef %x) {
+; CHECK-LABEL: @cttz_arg_noundef_zero_poison(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i16 @llvm.cttz.i16(i16 [[X:%.*]], i1 true), !range [[RNG0]]
+; CHECK-NEXT:    [[TZ:%.*]] = zext i16 [[TMP1]] to i32
+; CHECK-NEXT:    ret i32 [[TZ]]
+;
+  %z = zext i16 %x to i32
+  %tz = call i32 @llvm.cttz.i32(i32 %z, i1 true)
+  ret i32 %tz
+}
+
+define i32 @cttz_arg_noundef(i16 noundef %x) {
+; CHECK-LABEL: @cttz_arg_noundef(
+; CHECK-NEXT:    [[Z:%.*]] = zext i16 [[X:%.*]] to i32
+; CHECK-NEXT:    [[TZ:%.*]] = call i32 @llvm.cttz.i32(i32 [[Z]], i1 false), !range [[RNG1]], !noundef !3
+; CHECK-NEXT:    ret i32 [[TZ]]
+;
+  %z = zext i16 %x to i32
+  %tz = call i32 @llvm.cttz.i32(i32 %z, i1 false)
+  ret i32 %tz
+}