Details

Reviewers

spatel
RKSimon
craig.topper
kpn
efriedma

Commits

rGb1b7fb6f20b0: [InstCombine] trunc (fptoui|fptosi)

Summary

Attempt to fold the trunc into the fp-to-int conversion.

https://alive2.llvm.org/ce/z/8RCNou

Diff Detail

Event Timeline

samparker created this revision. · Jan 19 2023, 2:03 AM

Herald added a project: Restricted Project. · Jan 19 2023, 2:03 AM

Herald added subscribers: StephenFan, hiraditya.

samparker requested review of this revision. · Jan 19 2023, 2:03 AM

Herald added a project: Restricted Project. · Jan 19 2023, 2:03 AM

Now only checking for poison/undef for the signed case.

samparker added inline comments. · Jan 19 2023, 3:33 AM

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
506	And I'm still not sure whether this is needed? Alive seems to want it to be happy, but I'm pretty sure integer transforms are performed elsewhere without considering fp-to-int conversions as inputs.

Harbormaster completed remote builds in B208704: Diff 490438. · Jan 19 2023, 4:07 AM

I realize it's unlikely in practice, but is there a reason not to support any FP type? For fptoui, the integer type just needs one more bit than the max exponent for a given FP semantic?

For the tests, it should be sufficient to have the intermediate integer width be one more than the minimum required type width, so "%i = fptoui half %x to i17".

Please pre-commit baseline tests (either locally or push to main) and label tests that should not change as negative tests (either in the function name or with a code comment).

it should be sufficient to have the intermediate integer width be one more than the minimum required type width, so "%i = fptoui half %x to i17".

IIUC, for half fptoui we don't need an i17, as an i16 can hold the max normal value (65504). I can add support in for float conversions though, as this logic is only triggering for simple types, I assume the only conversion that will work is float -> i128 -> i64.

I would really appreciate if someone could help me understand the complication with fptosi w.r.t checking for poison/undef too.

nikic added a subscriber: nikic. · Jan 20 2023, 3:31 AM

nikic added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
506	Can you share the problematic proof? It shouldn't be needed.

samparker added inline comments. · Jan 20 2023, 4:14 AM

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
506	It doesn't make sense to me, but I'm hopeless with FP, so just alive... https://alive2.llvm.org/ce/z/fr5kdx. It will only compile with `--disable-undef-input` or a noundef operand.

In D142093#4068247, @samparker wrote:

it should be sufficient to have the intermediate integer width be one more than the minimum required type width, so "%i = fptoui half %x to i17".

IIUC, for half fptoui we don't need an i17, as an i16 can hold the max normal value (65504). I can add support in for float conversions though, as this logic is only triggering for simple types, I assume the only conversion that will work is float -> i128 -> i64.

i16 is the smallest final type for fptoui; i17 is one bit bigger because we need to truncate at least one bit. We don't really want a "simple type" limit here in IR either unless there's some codegen concern. I'd add tests with float and bfloat. There's a current discussion about adding various other small format FP types to IR, so we should try to future-proof this transform for those types in case they make it into IR.
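To make the i16 bound for half concrete, here is a minimal standalone C++ sketch (not LLVM code). It computes the largest finite value of a binary FP format from its maximum exponent and precision and compares it with an unsigned integer range. The helper names `maxFiniteValue` and `fitsUnsigned` are hypothetical and exist only for illustration; the parameters used in the comments (15 and 11 for half, 127 and 24 for float) are the IEEE-754 values.

```cpp
#include <cassert>
#include <cmath>

// Largest finite value of a binary FP format with the given maximum unbiased
// exponent and precision (significand bits, counting the implicit leading 1):
//   (2 - 2^-(Precision - 1)) * 2^MaxExponent
// For half (MaxExponent = 15, Precision = 11) this is 65504.
double maxFiniteValue(int MaxExponent, int Precision) {
  assert(Precision >= 1 && "precision counts the implicit leading bit");
  double MaxSignificand = 2.0 - std::ldexp(1.0, -(Precision - 1));
  return std::ldexp(MaxSignificand, MaxExponent);
}

// True if every finite non-negative value of the format is below 2^Bits,
// i.e. fits in an unsigned integer that is Bits wide.
bool fitsUnsigned(int MaxExponent, int Precision, int Bits) {
  return maxFiniteValue(MaxExponent, Precision) < std::ldexp(1.0, Bits);
}
```

Under these assumptions, `fitsUnsigned(15, 11, 16)` holds while `fitsUnsigned(15, 11, 15)` does not, which matches i16 being the smallest final type for fptoui from half.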

I would really appreciate if someone could help me understand the complication with fptosi w.r.t checking for poison/undef too.

There is no undef problem - I think it's just that the online instance times out with larger widths:
https://alive2.llvm.org/ce/z/6EZXLQ

Okay, great. Thanks for clarification on both fronts. I'm just about to commit some tests.

If you want to add some more, the initial set is in: https://github.com/llvm/llvm-project/commit/714286f9e641209411609deaf80dd865aa2198c5

Removed non-poison input restriction for fptosi.

Harbormaster completed remote builds in B208986: Diff 490840. · Jan 20 2023, 8:31 AM

efriedma added inline comments. · Jan 20 2023, 11:00 AM

llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll
357	From alive2:

  define i33 @src(float noundef %x) {
  %0:
    %conv = fptosi float noundef %x to i64
    %conv.1 = trunc i64 %conv to i33
    ret i33 %conv.1
  }
  =>
  define i33 @tgt(float noundef %x) {
  %0:
    %conv = fptosi float noundef %x to i33
    ret i33 %conv
  }
  Transformation doesn't verify!
  ERROR: Target is more poisonous than source

The final result type must be able to hold the largest finite number representable in the floating-point type; otherwise, the transform isn't legal. For float, that's an i128 or i129, I think? AArch64 does have an instruction FJCVTZS you could theoretically use for this kind of thing, but that seems unlikely to be worthwhile.
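A concrete counterexample may help explain why Alive2 reports the target as more poisonous. The standalone C++ sketch below (not LLVM code; `fitsSigned` is a hypothetical helper) checks whether a float, truncated toward zero, lies inside the range of a signed integer of a given width. A value such as 2^40 is in range for i64, so the original fptosi plus trunc is well defined, but it is out of range for i33, so a direct fptosi to i33 would yield poison.

```cpp
#include <cassert>
#include <cmath>

// True if X, rounded toward zero, lies in [-2^(Bits-1), 2^(Bits-1)), the
// range of a signed integer that is Bits wide. NaN and infinities return
// false. Assumes 1 <= Bits <= 64 for this illustration.
bool fitsSigned(float X, int Bits) {
  assert(Bits >= 1 && Bits <= 64 && "width outside the illustrated range");
  double Truncated = std::trunc(static_cast<double>(X));
  double Limit = std::ldexp(1.0, Bits - 1); // 2^(Bits-1)
  return Truncated >= -Limit && Truncated < Limit;
}
```

With this helper, `fitsSigned(std::ldexp(1.0f, 40), 64)` is true and `fitsSigned(std::ldexp(1.0f, 40), 33)` is false, so narrowing the conversion to i33 introduces poison for an input that the original sequence handled.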

spatel mentioned this in rGcb29ba9c0f87: [InstCombine] adjust tests for fptoui + trunc; NFC. · Jan 20 2023, 11:25 AM

I updated the test file; see if this covers everything for fptoui:
cb29ba9c0f87
(if yes, then we duplicate each test for fptoui with one extra bit for the integer types)

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
503–506	This isn't correct - we want to use semanticsMaxExponent() or something like that to determine the minimum bitwidth. Signed cast needs one extra bit to not truncate the sign bit in integer form.
llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll
160	This and the following tests are miscompiles (noundef is used here only to prevent the timeout): https://alive2.llvm.org/ce/z/UCo_Py

Using semanticsMaxExponent, so hopefully correct now...

A scalar trunc, to a non-simple type, is still only explored if the input is also a non-simple type but I presume that would be better changed, if at all, in a separate patch.

Avoiding integer comparison warning.

In D142093#4073138, @samparker wrote:

Using semanticsMaxExponent, so hopefully correct now...

A scalar trunc, to a non-simple type, is still only explored if the input is also a non-simple type but I presume that would be better changed, if at all, in a separate patch.

Yes, presumably that's a rarer possibility (and covered by the "wider final type" tests), but we'd need to ease the type check leading into canEvaluateTruncated().

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
502–503	This looks correct, but it's non-obvious, so it could use some explanatory code comments. Also, the code as written had a signed compare warning. I'd rewrite it as something like:

  // If the integer type can hold the max FP value, it is safe to cast
  // directly to that type. Otherwise, we may create poison via overflow
  // that did not exist in the original code.
  //
  // The max FP value is pow(2, MaxExponent) * (1 + MaxFraction), so we need
  // at least one more bit than the MaxExponent to hold the max FP value.
  Type *InputTy = I->getOperand(0)->getType()->getScalarType();
  unsigned MinBitWidth =
      APFloat::semanticsMaxExponent(InputTy->getFltSemantics());
  // We need one more bit to preserve the signbit through truncation.
  if (I->getOpcode() == Instruction::FPToSI)
    ++MinBitWidth;
  return Ty->getScalarSizeInBits() > MinBitWidth;
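The rule in the suggested snippet can be tried out without LLVM. The following is a minimal sketch under stated assumptions: `FPKind`, `maxExponent`, and `canFoldTruncIntoFPToInt` are hypothetical names, and `maxExponent` is a hand-written stand-in for `APFloat::semanticsMaxExponent()` using the maximum unbiased exponents of half, bfloat, float, and double.

```cpp
#include <cassert>

// Hypothetical stand-in for the FP semantics query; not an LLVM type.
enum class FPKind { Half, BFloat, Float, Double };

// Maximum unbiased exponent of each format (assumed values: half 15,
// bfloat 127, float 127, double 1023).
unsigned maxExponent(FPKind Kind) {
  switch (Kind) {
  case FPKind::Half:
    return 15;
  case FPKind::BFloat:
  case FPKind::Float:
    return 127;
  case FPKind::Double:
    return 1023;
  }
  return 0; // Not reached for valid enumerators.
}

// Mirrors the suggested check: the destination integer needs more bits than
// the maximum exponent, plus one extra bit for the sign in the signed case.
bool canFoldTruncIntoFPToInt(FPKind Kind, bool IsSigned, unsigned DestBits) {
  assert(DestBits > 0 && "integer types have at least one bit");
  unsigned MinBitWidth = maxExponent(Kind);
  if (IsSigned)
    ++MinBitWidth;
  return DestBits > MinBitWidth;
}
```

Under these assumptions the smallest acceptable destination types are i16 (fptoui) and i17 (fptosi) for half, and i128 and i129 for float, which is consistent with the widths discussed in the comments above.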
llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll
43	Put a TODO comment on this since we don't do it yet.
119	Oops - yes, I typo'd the test names for doubles.
137	These look good - please pre-commit the tests with baseline results in a preliminary NFC patch. We should add one more test with an extra use like this:

  declare void @use(i129)
  define i128 @float_fptoui_i129_i128_use(float %x) {
    %i = fptoui float %x to i129
    call void @use(i129 %i)
    %r = trunc i129 %i to i128
    ret i128 %r
  }

We won't transform that currently, but we could allow that.

Rebased and added comment.

Harbormaster completed remote builds in B209354: Diff 491340. · Jan 23 2023, 7:13 AM

LGTM

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
508	Formatting nit: variable should have a capitalized name.

This revision is now accepted and ready to land. · Jan 23 2023, 8:22 AM

This revision was landed with ongoing or failed builds. · Jan 24 2023, 1:16 AM

Closed by commit rGb1b7fb6f20b0: [InstCombine] trunc (fptoui|fptosi) (authored by samparker).

This revision was automatically updated to reflect the committed changes.

samparker added a commit: rGb1b7fb6f20b0: [InstCombine] trunc (fptoui|fptosi).

samparker mentioned this in D141926: [WebAssembly] Add passes for GEP lowering. · Feb 3 2023, 7:01 AM

Diff 490422

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp

  case Instruction::PHI: {
    ...
    for (unsigned i = 0, e = OPN->getNumIncomingValues(); i != e; ++i) {
      Value *V =
          EvaluateInDifferentType(OPN->getIncomingValue(i), Ty, isSigned);
      NPN->addIncoming(V, OPN->getIncomingBlock(i));
    }
    Res = NPN;
    break;
  }
+ case Instruction::FPToUI:
+ case Instruction::FPToSI:
+   Res = CastInst::Create(
+       static_cast<Instruction::CastOps>(Opc), I->getOperand(0), Ty);
+   break;
  default:
    // TODO: Can handle more cases here.
    llvm_unreachable("Unreachable!");
  }

  Res->takeName(I);
  return InsertNewInstWith(Res, *I);
}
  case Instruction::PHI: {
    ...
    // get into trouble with cyclic PHIs here because we only consider
    // instructions with a single use.
    PHINode *PN = cast<PHINode>(I);
    for (Value *IncValue : PN->incoming_values())
      if (!canEvaluateTruncated(IncValue, Ty, IC, CxtI))
        return false;
    return true;
  }
+ case Instruction::FPToUI:
+ case Instruction::FPToSI:
+   if (isGuaranteedNotToBeUndefOrPoison(I->getOperand(0))) {
+     if (I->getOperand(0)->getType()->getScalarType()->isHalfTy()) {
+       // TODO: For FPToSI, the minimum trunc type is i17 but we only handle
+       // simple types while evaluating truncations.
+       uint32_t MinBitWidth = I->getOpcode() == Instruction::FPToUI
+                                  ? 16 : 32;
+       return Ty->getScalarSizeInBits() >= MinBitWidth;
+     }
+   }
+   break;
  default:
    // TODO: Can handle more cases here.
    break;
  }

  return false;
}


llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -passes=instcombine -S -o - %s | FileCheck %s

				define i15 @half_fptoui_i32_i15_noundef(half noundef %x) {
				; CHECK-LABEL: @half_fptoui_i32_i15_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui half [[X:%.*]] to i32
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i32 [[CONV]] to i15
				; CHECK-NEXT: ret i15 [[CONV_1]]
				;
				%conv = fptoui half %x to i32
				%conv.1 = trunc i32 %conv to i15
				ret i15 %conv.1
				}

				define i16 @half_fptoui_i32_i16_noundef(half noundef %x) {
				; CHECK-LABEL: @half_fptoui_i32_i16_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui half [[X:%.*]] to i16
				; CHECK-NEXT: ret i16 [[CONV]]
				;
				%conv = fptoui half %x to i32
				%conv.1 = trunc i32 %conv to i16
				ret i16 %conv.1
				}

				define <4 x i16> @half_fptoui_4xi32_4xi16_noundef(<4 x half> noundef %x) {
				; CHECK-LABEL: @half_fptoui_4xi32_4xi16_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui <4 x half> [[X:%.*]] to <4 x i16>
				; CHECK-NEXT: ret <4 x i16> [[CONV]]
				;
				%conv = fptoui <4 x half> %x to <4 x i32>
				%conv.1 = trunc <4 x i32> %conv to <4 x i16>
				ret <4 x i16> %conv.1
				}

				define i16 @half_fptoui_i32_i16(half %x) {
				; CHECK-LABEL: @half_fptoui_i32_i16(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui half [[X:%.*]] to i32
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i32 [[CONV]] to i16
				; CHECK-NEXT: ret i16 [[CONV_1]]
				;
				%conv = fptoui half %x to i32
				%conv.1 = trunc i32 %conv to i16
				ret i16 %conv.1
				}

				define i16 @half_fptosi_i32_i16_noundef(half noundef %x) {
				; CHECK-LABEL: @half_fptosi_i32_i16_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi half [[X:%.*]] to i32
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i32 [[CONV]] to i16
				; CHECK-NEXT: ret i16 [[CONV_1]]
				;
				%conv = fptosi half %x to i32
				%conv.1 = trunc i32 %conv to i16
				ret i16 %conv.1
				}

				define i17 @half_fptosi_i32_i17_noundef(half noundef %x) {
				; CHECK-LABEL: @half_fptosi_i32_i17_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi half [[X:%.*]] to i32
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i32 [[CONV]] to i17
				; CHECK-NEXT: ret i17 [[CONV_1]]
				;
				%conv = fptosi half %x to i32
				%conv.1 = trunc i32 %conv to i17
				ret i17 %conv.1
				}

				define i17 @half_fptosi_i32_i17(half %x) {
				; CHECK-LABEL: @half_fptosi_i32_i17(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi half [[X:%.*]] to i32
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i32 [[CONV]] to i17
				; CHECK-NEXT: ret i17 [[CONV_1]]
				;
				%conv = fptosi half %x to i32
				%conv.1 = trunc i32 %conv to i17
				ret i17 %conv.1
				}

				define i32 @half_fptosi_i64_i32_noundef(half noundef %x) {
				; CHECK-LABEL: @half_fptosi_i64_i32_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi half [[X:%.*]] to i32
				; CHECK-NEXT: ret i32 [[CONV]]
				;
				%conv = fptosi half %x to i64
				%conv.1 = trunc i64 %conv to i32
				ret i32 %conv.1
				}

				define <4 x i32> @half_fptosi_4xi64_4xi32_noundef(<4 x half> noundef %x) {
				; CHECK-LABEL: @half_fptosi_4xi64_4xi32_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui <4 x half> [[X:%.*]] to <4 x i32>
				; CHECK-NEXT: ret <4 x i32> [[CONV]]
				;
				%conv = fptoui <4 x half> %x to <4 x i64>
				%conv.1 = trunc <4 x i64> %conv to <4 x i32>
				ret <4 x i32> %conv.1
				}

				define i32 @half_fptosi_i64_i32(half %x) {
				; CHECK-LABEL: @half_fptosi_i64_i32(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi half [[X:%.*]] to i64
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i64 [[CONV]] to i32
				; CHECK-NEXT: ret i32 [[CONV_1]]
				;
				%conv = fptosi half %x to i64
				%conv.1 = trunc i64 %conv to i32
				ret i32 %conv.1
				}

				define i32 @float_fptoui_i64_i32_noundef(float noundef %x) {
				; CHECK-LABEL: @float_fptoui_i64_i32_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui float [[X:%.*]] to i64
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i64 [[CONV]] to i32
				; CHECK-NEXT: ret i32 [[CONV_1]]
				;
				%conv = fptoui float %x to i64
				%conv.1 = trunc i64 %conv to i32
				ret i32 %conv.1
				}

				define i32 @float_fptosi_i64_i32_noundef(float noundef %x) {
				; CHECK-LABEL: @float_fptosi_i64_i32_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i64
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i64 [[CONV]] to i32
				; CHECK-NEXT: ret i32 [[CONV_1]]
				;
				%conv = fptosi float %x to i64
				%conv.1 = trunc i64 %conv to i32
				ret i32 %conv.1
				}

				define i64 @float_fptoui_i128_i64_noundef(float noundef %x) {
				; CHECK-LABEL: @float_fptoui_i128_i64_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptoui float [[X:%.*]] to i128
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i128 [[CONV]] to i64
				; CHECK-NEXT: ret i64 [[CONV_1]]
				;
				%conv = fptoui float %x to i128
				%conv.1 = trunc i128 %conv to i64
				ret i64 %conv.1
				}

				define i64 @float_fptosi_i128_i64_noundef(float noundef %x) {
				; CHECK-LABEL: @float_fptosi_i128_i64_noundef(
				; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i128
				; CHECK-NEXT: [[CONV_1:%.*]] = trunc i128 [[CONV]] to i64
				; CHECK-NEXT: ret i64 [[CONV_1]]
				;
				%conv = fptosi float %x to i128
				%conv.1 = trunc i128 %conv to i64
				ret i64 %conv.1
				}

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] trunc (fptoui|fptosi)
ClosedPublic
