This is an archive of the discontinued LLVM Phabricator instance.

[FPEnv][InstSimplify] Constrained FP support for NaN
ClosedPublic

Authored by kpn on May 26 2021, 7:50 AM.

Details

Summary

Currently InstructionSimplify.cpp knows how to simplify floating point instructions that have a NaN operand. It does not know how to handle the matching constrained FP intrinsic.

This patch teaches it how to simplify so long as the exception handling is not "fpexcept.strict".

Diff Detail

Event Timeline

kpn created this revision.May 26 2021, 7:50 AM
kpn requested review of this revision.May 26 2021, 7:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2021, 7:50 AM

There's a lot going on here. I think we need to break this up to make sure we're testing the corner cases.

What if we start with plain constant folding? I don't see any existing tests for these intrinsics in llvm/test/Transforms/InstSimplify/ConstProp. See fp-undef.ll in that dir for test ideas.
For example, what can we do with this set of tests:

define float @fadd_undef_strict(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.dynamic", metadata !"fpexcept.strict") #0
  ret float %r
}

define float @fadd_undef_maytrap(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.dynamic", metadata !"fpexcept.maytrap") #0
  ret float %r
}

define float @fadd_undef_upward(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.upward", metadata !"fpexcept.ignore") #0
  ret float %r
}

define float @fadd_undef_defaultfp(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.tonearest", metadata !"fpexcept.ignore") #0
  ret float %r
}

What should happen if we replace each of those undef values with a constant number, NaN, etc?

kpn added a comment.May 26 2021, 2:23 PM

There's a lot going on here. I think we need to break this up to make sure we're testing the corner cases.

What if we start with plain constant folding? I don't see any existing tests for these intrinsics in llvm/test/Transforms/InstSimplify/ConstProp. See fp-undef.ll in that dir for test ideas.
For example, what can we do with this set of tests:

define float @fadd_undef_strict(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.dynamic", metadata !"fpexcept.strict") #0
  ret float %r
}

define float @fadd_undef_maytrap(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.dynamic", metadata !"fpexcept.maytrap") #0
  ret float %r
}

define float @fadd_undef_upward(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.upward", metadata !"fpexcept.ignore") #0
  ret float %r
}

define float @fadd_undef_defaultfp(float %x) #0 {
  %r = call float @llvm.experimental.constrained.fadd.f32(float undef, float undef, metadata !"round.tonearest", metadata !"fpexcept.ignore") #0
  ret float %r
}

What should happen if we replace each of those undef values with a constant number, NaN, etc?

Most of the changes are to thread exception behavior through functions that normally work with the normal FP instructions like fadd, fsub, etc.

The constrained intrinsics are then using the same code paths as those normal instructions. Which means that if the exception behavior isn't "fpexcept.strict" then with regards to undef, poison, and NaN you'll get identical results to using the regular FP instructions with identical first two operands. Some optimizations are disabled for non-default FP modes for now at least.

The constrained intrinsics with the _default_ FP mode will give you identical results to the regular instructions _including_ constant folding. At least through InstructionSimplify.cpp, anyway, with this patch.

I started where I did because InstructionSimplify.cpp's foldOrCommuteConstant() function calls into Analysis/ConstantFolding.cpp, but it doesn't yet pass through the rounding mode or exception behavior. So this actually seemed like it would be a smaller, more contained patch than jumping straight to InstSimplify's use of constant folding.

Serge Pavlov has a constant folding patch up as D102673.

I'm off the rest of the week and will be back on Tuesday, June 1.

sepavloff added inline comments.Jun 1 2021, 10:28 AM
llvm/include/llvm/Analysis/InstructionSimplify.h
160–164

Does it make sense to pass rounding mode instead of IsDefaultFPEnv and deduce if environment is default inside the function? Could it be useful for other foldings?

llvm/lib/Analysis/InstructionSimplify.cpp
4827–4828

Even if the exception behavior is ebStrict NaN propagation is not allowed only for signaling NaN, quiet NaN propagates quietly, without setting FP exceptions. Does it make sense to distinguish SNaNs and drop NaN propagation only for them?

4853–4854

With the exception of fadd X, -0, which may produce different results for different rounding modes, other simplifications seem to be useful in non-default environment. Which of the cases cannot be made in such environment?

llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
22

Should this call produce undef value instead?

46

According to documentation (https://llvm.org/docs/LangRef.html#poison-values) "An instruction that depends on a poison value, produces a poison value itself". So making call here is incorrect, it would result in lost of poison.

kpn added a comment.Jun 3 2021, 12:50 PM

There's a lot going on here. I think we need to break this up to make sure we're testing the corner cases.

The meat of this is in InstructionSimplify.cpp's simplifyFPOp() and propagateNaN(). I don't see how we can split this up on the undef/poison/NaN axis without complicating those functions.

I could split this up on the fadd/fsub/fmul/etc axis and only handle constrained fadd, for example. Would that be helpful? Your current thoughts, @spatel?

llvm/include/llvm/Analysis/InstructionSimplify.h
160–164

I believe you are correct. I'm passing the bool IsDefaultFPEnv here because the current easy way to check for the default environment is to ask the ConstrainedIntrinsic about the environment. One can just look at the rounding mode and exception handling oneself, but that check is a little too complicated for my taste to have littered all over the place.

How about a llvm::fp::IsDefaultFPEnv() function that takes the rounding mode and exception behavior and does the check itself? It's more readable to someone who hasn't been following along for the past three years. It's more concise. And it would let us replace this bool with an actual rounding mode argument.

llvm/lib/Analysis/InstructionSimplify.cpp
4827–4828

For ebStrict we leave the instruction alone and let the run-time environment handle it.

A check of IEEE 754 seems to say that, when given an SNaN, an operation that produces a result will produce a QNaN. But I'm not a 754 expert. Anyone else? Please?

If you look above at the propagateNaN() function you'll see that someone thought of this and didn't have an answer. They left a comment asking the question. Was that @spatel?

I don't think a conversion to QNaN should be confined to the constrained intrinsics. It would be surprising if, for example, someone changed to a different rounding mode and suddenly NaN handling changed. So I think we should keep the codepath here the same for constrained and regular floating point. And changing propagateNaN() should be left for a later patch. Does that sound reasonable?

4853–4854

I don't know. I didn't want to put too much into one patch so I just took the conservative approach and bailed out here. If I made an error with these optimizations then a straight revert of the patch would make a mess.

I figure we can revisit later?

llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
22

This comes from propagateNaN(), which spins up a NaN if handed a value that is not a NaN. That's where the undef is changed into a NaN.

Since an undef is the set of all possible bit patterns then that set includes NaN. If I understand undef correctly we're allowed to pick a value from that set and use it. So the conversion of undef into NaN is allowed.

Plus, the current code does this already. So I'm inclined to leave it alone and have this conversion produce NaN.

46

Yes, but the very next sentence is "A poison value may be relaxed into an undef value, ...".

Using the regular instructions we have this difference:
add x, poison -> poison
fadd x, poison -> NaN

I don't know exactly why, but that's the existing practice. The constrained intrinsics should be the same as the regular FP instructions.

spatel added a comment.Jun 4 2021, 7:22 AM

I could split this up on the fadd/fsub/fmul/etc axis and only handle constrained fadd, for example. Would that be helpful? Your current thoughts, @spatel?

Yes, that would help me understand more of the patterns at least. But if other reviewers are comfortable with the all-in-one patch, then I won't hold this up.

Could you please split this patch so that folding Nans would be separated from treatment of poison and undefs?

In contrast to poison and undef, Nan is a real value, which has definite representation in runtime, it is clear how to process it. Poison value is a way to propagate errors such as undefined behavior. It exists only in compile time. Of course, lowering it to NaN in instruction selector is natural, but conversion of poison into NaN in IR transformation looks like incorrect decision except some cases (like NaN*poison -> NaN). Yes, the documentation mentions that "A poison value may be relaxed into an undef value", but it is unclear when it is allowed. Undefs are even more obscure. The use case of input data, which should not influence result of an operation is clear but undefs are used in much more cases. I feel review of such transformation could be more complex.

As for Nans, such transformation is useful and anticipated.

llvm/include/llvm/Analysis/InstructionSimplify.h
160–164

How about a llvm::fp::IsDefaultFPEnv() function that takes the rounding mode and exception behavior and does the check itself?

I like this idea. In general it is profitable to have interface that can be extended.

llvm/lib/Analysis/InstructionSimplify.cpp
4853–4854

It is Ok to postpone this issue.

kpn updated this revision to Diff 351235.Jun 10 2021, 12:18 PM
kpn retitled this revision from [FPEnv][InstSimplify] Constrained FP support for undef, poison, and NaN to [FPEnv][InstSimplify] Constrained FP support for NaN.
kpn edited the summary of this revision. (Show Details)

Update for review comments and reduce to just operating on NaN.

sepavloff added inline comments.Jun 14 2021, 10:21 AM
llvm/lib/Analysis/InstructionSimplify.cpp
4900

Is default environment is necessary to make constant folding? "fpexcept.maytrap" also allows constant evaluation and any rounding except dynamic does not prevent the evaluation. Maybe check only for ExBehavior != fp::ebStrict?

4951

In contrast to other Simplify* functions there is no call to foldOrCommuteConstant. May it be put here?

llvm/test/Transforms/InstSimplify/X86/fp-nan-strictfp.ll
305

Calculation of remainder does not depend on rounding mode, so this test is identical to the next.

spatel added inline comments.Jun 16 2021, 6:32 AM
llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
46

Sorry for not seeing this and possibly other inline comments sooner.

The lack of poison propagation here is a TODO item (inconsistency bug), so let's try to get this out of the way before it gets more complicated. I'll post a patch for review shortly.

The history is that poison did not exist when I added the fold for undef. Undef behavior is more complicated than poison, (and that's why we're phasing it out).

We can't do "fadd x, undef --> undef" because if we deduce something about the variable input x, the result may not take on *any* possible FP value. This is similar to why "and x, undef --> 0" (not undef).

Please see D104383 for an update to regular FP instruction simplification.

kpn added inline comments.Jun 16 2021, 8:29 AM
llvm/lib/Analysis/InstructionSimplify.cpp
4900

Well, we'd also need to pass down the exception behavior because it'd be a shame to not fold 2.0+2.0, but we do want to avoid folding to (for example) infinity in the ebStrict case. So constant folding is something I'm hoping to put off for a later patch. This patch is intended to be restricted to NaN.

The codepath for the normal fsub and other normal FP instructions don't have any metadata to pass in. Thus they will pick up the default arguments in this function which is for the default FP environment. So the call to IsDefaultFPEnvironment() is a check for either the normal FP instruction or a constrained instruction that is supposed to behave the same as the normal FP instruction. Put another way, this check for the default FP environment preserves the existing behavior and avoids risking another source of bugs that could trigger a revert.

4951

This function only gets called from SimplifyFMulInst(), and that function does have the call to foldOrCommuteConstant(). I'm guessing that's why this code doesn't already have the call in it. Since adding it wouldn't change anything I think we should leave it alone, at least in this patch.

llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
46

I'll change this patch when D104383 lands.

kpn added a comment.Jun 16 2021, 8:33 AM

Phab just lost a comment of mine. Weird.

llvm/test/Transforms/InstSimplify/X86/fp-nan-strictfp.ll
305

Changing the rounding mode means we're no longer in the default FP environment. This then changes the codepath through SimplifyFRemInst(). That's why I put this test here.

spatel added inline comments.Jun 16 2021, 8:37 AM
llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
46

Thanks - ce95200b7942
Can you also pre-commit the tests with the baseline CHECK lines? That would make it easier to see what is (and what is not) changing with this patch.

spatel added inline comments.Jun 16 2021, 8:47 AM
llvm/lib/Analysis/InstructionSimplify.cpp
4827–4828

Yes, I left that comment. That was partly discussed in D44521.
I still don't know what we want to do about SNaN. If we can make a stand-alone patch to answer just that question for the constrained intrinsics, that would be nice.

kpn added inline comments.Jun 16 2021, 9:01 AM
llvm/lib/Analysis/InstructionSimplify.cpp
4827–4828

You want the regular FP instructions to be different from the constrained intrinsics? What about the rounding mode example above? Perhaps have NaN be handled differently for maytrap or strict exception handling? Still, I'm pretty sure we should keep the regular and constrained FP operations behaving the same whenever possible.

I think I can have a new patch up for discussion not too long after this patch lands.

spatel added inline comments.Jun 16 2021, 11:50 AM
llvm/lib/Analysis/InstructionSimplify.cpp
4827–4828

If we can make the behavior consistent across regular and intrinsics, that would be good. Start an llvm-dev thread to discuss different examples? I don't know if anything has changed in the more recent IEEE-754 specs.

kpn updated this revision to Diff 352785.Jun 17 2021, 10:40 AM

Rebase after a precommit of the tests.

spatel added inline comments.Jun 18 2021, 1:01 PM
llvm/lib/Analysis/InstructionSimplify.cpp
5043–5045

This and similar changes are opening the door to every existing FP fold. That seems safe enough given that we're in the default environment, but there are no tests for this unless I've missed it?

Can we make that a small follow-on patch with some test coverage that includes FMF, so we know it's all getting piped through as expected.

kpn added inline comments.Jun 18 2021, 2:37 PM
llvm/lib/Analysis/InstructionSimplify.cpp
5043–5045

You want some tests to verify that the below folds work correctly when given constrained intrinsics with FMF? I think that's what you are asking. Since we have no tests for that currently, yes, we can do that. Unless I misunderstood?

spatel added inline comments.Jun 18 2021, 2:46 PM
llvm/lib/Analysis/InstructionSimplify.cpp
5043–5045

Yes, I'd add at least 1 test for each intrinsic, and sprinkle some FMF-required folds for more better coverage. For example, I think we want these to fold?

define float @fdiv_by 1(float %x) {
  %r = call float @llvm.experimental.constrained.fdiv.f32(float %x, float 1.0, metadata !"round.tonearest", metadata !"fpexcept.ignore")
  ret float %r
}

define float @fdiv_same_op_nnan(float %x) {
  %r = call nnan float @llvm.experimental.constrained.fdiv.f32(float %x, float %x, metadata !"round.tonearest", metadata !"fpexcept.ignore")
  ret float %r
}
kpn updated this revision to Diff 355300.Jun 29 2021, 11:05 AM

Add tests showing the folds that we are now using, and show that many of the matchers are not firing yet. Also test that the fast math flags are propagated correctly.

spatel added inline comments.Jul 6 2021, 11:21 AM
llvm/test/Transforms/InstSimplify/fast-math-strictfp.ll
65 ↗(On Diff #355300)

Not part of this patch, but is the plan to convert the constrained op(s) into regular ops in this situation?

llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
74

There was a bug in poison propagation -- hopefully fixed with 4ec7c021970d -- but that doesn't explain why this and similar tests below are not returning poison. Update/regenerate the CHECK lines?

kpn added inline comments.Jul 6 2021, 11:32 AM
llvm/test/Transforms/InstSimplify/fast-math-strictfp.ll
65 ↗(On Diff #355300)

No. Mixing of regular FP ops and constrained ops in a single function is not allowed since we can't prevent harmful reordering. So the only case where it might be legal is where an entire function uses the constrained intrinsics in the default FP environment. In that case ... why are the constrained intrinsics being used, anyway?

llvm/test/Transforms/InstSimplify/fp-undef-poison-strictfp.ll
74

Will do, tomorrow.

kpn updated this revision to Diff 356980.Jul 7 2021, 9:07 AM

Update a test for changes nearby in poison handling. Now when we get poison we produce poison, so the test needed to be changed.

spatel accepted this revision.Jul 7 2021, 11:52 AM

LGTM - see inline for a couple of minor points. Thanks for working on this!

llvm/include/llvm/IR/FPEnv.h
58

Nit - capitalization is opposite of current guidelines:

isDefaultFPEnvironment(fp::ExceptionBehavior EB, RoundingMode RM)
llvm/lib/Transforms/Utils/Local.cpp
478–482

Reduces to:

return FPI->getExceptionBehavior() != fp::ebStrict;

?

This revision is now accepted and ready to land.Jul 7 2021, 11:52 AM
This revision was landed with ongoing or failed builds.Jul 9 2021, 8:27 AM
This revision was automatically updated to reflect the committed changes.
qiucf added a subscriber: qiucf.Jul 15 2021, 12:04 AM

Hi, I notice poison is generated for maytrap constrained 0 * Inf operation, but NaN for non-constrained ones. Is that expected?

define float @foo() {
entry:
  %x = tail call ninf contract afn float @llvm.experimental.constrained.fmul.f32(float 0.000000e+00, float 0x7FF0000000000000, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
  ret float %x
}

# =>

define float @foo() {
entry:
  ret float poison
}
define float @foo() {
entry:
  %x = fmul ninf contract afn float 0.000000e+00, 0x7FF0000000000000
  ret float %x
}

# =>

define float @foo() {
entry:
  ret float 0x7FF8000000000000
}

Because constrained ones are handled by simplifyFPOp while non-constrained ones are handled by ConstantFoldBinaryInstruction. Should we make this consistent?

kpn added a comment.Jul 15 2021, 9:14 AM

Hi, I notice poison is generated for maytrap constrained 0 * Inf operation, but NaN for non-constrained ones. Is that expected?

define float @foo() {
entry:
  %x = tail call ninf contract afn float @llvm.experimental.constrained.fmul.f32(float 0.000000e+00, float 0x7FF0000000000000, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
  ret float %x
}

# =>

define float @foo() {
entry:
  ret float poison
}
define float @foo() {
entry:
  %x = fmul ninf contract afn float 0.000000e+00, 0x7FF0000000000000
  ret float %x
}

# =>

define float @foo() {
entry:
  ret float 0x7FF8000000000000
}

Because constrained ones are handled by simplifyFPOp while non-constrained ones are handled by ConstantFoldBinaryInstruction. Should we make this consistent?

Yes, we should be consistent. The right, immediate fix for this is to have simplifyFPOp treat undef the same in the default and non-default FP environments. Right now the patch specifically excludes undef folding. This was done on purpose in this patch because of questions about how we're supposed to handle undef.

Is undef supposed to behave the same in all FP environments, default or not? I asked on llvm-dev and got no response: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151727.html

kpn added a comment.Jul 15 2021, 9:17 AM

Well, OK, partially wrong. The undef question needs to be answered, yes. But what we really need is to hook into the constant folding that we're reaching via foldOrCommuteConstant() but are currently avoiding except in the default FP environment. That'll take longer to fix, but D102673 is a step in the right direction.

Well, OK, partially wrong. The undef question needs to be answered, yes. But what we really need is to hook into the constant folding that we're reaching via foldOrCommuteConstant() but are currently avoiding except in the default FP environment. That'll take longer to fix, but D102673 is a step in the right direction.

The problem for this example is that we are not passing FMF through to constant folding. The fmul has ninf with an INF operand, so that's poison:
https://alive2.llvm.org/ce/z/DTvAX4

We get that result accidentally at the moment for constrained ops because we don't have constant folding for constrained ops. Once we land D102673, the result for constrained ops will change to NaN. So we'll be consistent, but not ideal for both cases. :)

Returning NaN is not a miscompile because we can soften poison to whatever we want, but it's not as strong an optimization as folding to poison.