Attempt to fold the trunc into the fp-to-int conversion.
Details
Diff Detail
Unit Tests
Event Timeline
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
506 | And I'm still not sure whether this is needed? Alive seems to want it to be happy, but I pretty sure integer transforms are performed elsewhere without considering fp-to-int conversions as inputs. |
I realize it's unlikely in practice, but is there a reason not to support any FP type? For fptoui, the integer type just needs one more bit than the max exponent for a given FP semantic?
For the tests, it should be sufficient to have the intermediate integer width be one more than the minimum required type width, so "%i = fptoui half %x to i17".
Please pre-commit baseline tests (either locally or push to main) and label tests that should not change as negative tests (either in the function name or with a code comment).
it should be sufficient to have the intermediate integer width be one more than the minimum required type width, so "%i = fptoui half %x to i17".
IIUC, for half fptoui we don't need an i17, as an i16 can hold the max normal value (65504). I can add support in for float conversions though, as this logic is only triggering for simple types, I assume the only conversion that will work is float -> i128 -> i64.
I would really appreciate if someone could help me understand the complication with fptosi w.r.t checking for poison/undef too.
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
506 | Can you share the problematic proof? It shouldn't be needed. |
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
506 | It doesn't make sense to me, but I'm hopeless with FP, so just alive... https://alive2.llvm.org/ce/z/fr5kdx. It will only compile with --disable-undef-input or a noundef operand. |
i16 is the smallest final type for fptoui; i17 is one bit bigger because we need to truncate at least one bit. We don't really want a "simple type" limit here in IR either unless there's some codegen concern. I'd add tests with float and bfloat. There's a current discussion about adding various other small format FP types to IR, so we should try to future-proof this transform for those types in case they make it into IR.
I would really appreciate if someone could help me understand the complication with fptosi w.r.t checking for poison/undef too.
There is no undef problem - I think it's just that the online instance times out with larger widths:
https://alive2.llvm.org/ce/z/6EZXLQ
Okay, great. Thanks for clarification on both fronts. I'm just about to commit some tests.
If you want to add some more, the initial set is in: https://github.com/llvm/llvm-project/commit/714286f9e641209411609deaf80dd865aa2198c5
llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll | ||
---|---|---|
357 | From alive2: define i33 @src(float noundef %x) { %0: %conv = fptosi float noundef %x to i64 %conv.1 = trunc i64 %conv to i33 ret i33 %conv.1 } => define i33 @tgt(float noundef %x) { %0: %conv = fptosi float noundef %x to i33 ret i33 %conv } Transformation doesn't verify! ERROR: Target is more poisonous than source The final result type must be able to hold the largest finite number representable in the floating-point type; otherwise, the transform isn't legal. For float, that's an i128 or i129, I think? AArch64 does have an instruction FJCVTZS you could theoretically use for this kind of thing, but that seems unlikely to be worthwhile. |
I updated the test file; see if this covers everything for fptoui:
cb29ba9c0f87
(if yes, then we duplicate each test for fptoui with one extra bit for the integer types)
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
503–506 | This isn't correct - we want to use semanticsMaxExponent() or something like that to determine the minimum bitwidth. Signed cast needs one extra bit to not truncate the sign bit in integer form. | |
llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll | ||
160 | This and the following tests are miscompiles (noundef is used here only to prevent the timeout): |
Using semanticsMaxExponent, so hopefully correct now...
A scalar trunc, to a non-simple type, is still only explored if the input is also a non-simple type but I presume that would be better changed, if at all, in a separate patch.
Yes, presumably that's a rarer possibility (and covered by the "wider final type" tests), but we'd need to ease the type check leading into canEvaluateTruncated().
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
502–503 | This looks correct, but it's non-obvious, so it could use some explanatory code comments. Also, the code as written had a signed compare warning. I'd rewrite it as something like: // If the integer type can hold the max FP value, it is safe to cast // directly to that type. Otherwise, we may create poison via overflow // that did not exist in the original code. // // The max FP value is pow(2, MaxExponent) * (1 + MaxFraction), so we need // at least one more bit than the MaxExponent to hold the max FP value. Type *InputTy = I->getOperand(0)->getType()->getScalarType(); unsigned MinBitWidth = APFloat::semanticsMaxExponent(InputTy->getFltSemantics()); // We need one more bit to preserve the signbit through truncation. if (I->getOpcode() == Instruction::FPToSI) ++MinBitWidth; return Ty->getScalarSizeInBits() > MinBitWidth; | |
llvm/test/Transforms/InstCombine/trunc-fp-to-int.ll | ||
124–134 | Put a TODO comment on this since we don't do it yet. | |
375–376 | Oops - yes, I typo'd the test names for doubles. | |
394 | These look good - please pre-commit the tests with baseline results in a preliminary NFC patch. We should add one more test with an extra use like this: declare void @use(i129) define i128 @float_fptoui_i129_i128_use(float %x) { %i = fptoui float %x to i129 call void @use(i129 %i) %r = trunc i129 %i to i128 ret i128 %r } We won't transform that currently, but we could allow that. |
LGTM
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
508 | Formatting nit: variable should have a capitalized name. |
This looks correct, but it's non-obvious, so it could use some explanatory code comments. Also, the code as written had a signed compare warning.
I'd rewrite it as something like: