This is an archive of the discontinued LLVM Phabricator instance.

[ConstantFolding] Fold constrained compare intrinsics
ClosedPublic

Authored by sepavloff on Sep 23 2021, 5:03 AM.

Details

Summary

The change implements constant folding of ‘llvm.experimental.constrained.fcmp’
and ‘llvm.experimental.constrained.fcmps’ intrinsics.

Diff Detail

Event Timeline

sepavloff created this revision.Sep 23 2021, 5:03 AM
sepavloff requested review of this revision.Sep 23 2021, 5:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 23 2021, 5:03 AM
bjope resigned from this revision.Sep 23 2021, 5:53 AM

Sorry, I don't know much about fcmp.

sepavloff updated this revision to Diff 374813.Sep 24 2021, 5:36 AM

Add check lines to the test

kpn added inline comments.Sep 27 2021, 11:00 AM
llvm/lib/Analysis/ConstantFolding.cpp
1798

This still worries me. Are _analysis_ passes allowed to change the input IR? What if the caller decides to not do a fold after calling this function to see if a fold is possible? And why is the ebIgnore case different from the other two?

Constant folding won't add this ReadNone attribute when the constant folding code doesn't have an Instruction to alter. I don't know what to do about that.

llvm/lib/IR/Instructions.cpp
4134 ↗(On Diff #374511)

Besides moving this code to a new function, are there any changes to it? It's hard to tell.

llvm/test/Transforms/InstSimplify/constfold-constrained.ll
420

There are no "maytrap" tests here. For the sake of future readers it would probably be useful to show that "maytrap" was taken into account and is working correctly.

Added tests with "maytrap"

nikic added a subscriber: nikic.Sep 30 2021, 12:20 PM
nikic added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1798

Analysis passes are indeed not allowed to change the IR. And this isn't a harmless change either if it gets left behind -- e.g. it invalidates MemorySSA.

sepavloff added inline comments.Sep 30 2021, 11:31 PM
llvm/lib/Analysis/ConstantFolding.cpp
1798

Analysis passes are indeed not allowed to change the IR.

Constant folding is not an analysis, it changes IR. I don't know why this file is in Analysis directory.

What if the caller decides to not do a fold after calling this function to see if a fold is possible?

Nothing bad would happen. Only side effect would be removed, but we know for sure that it is absent, we just evaluated the operation. The function call would not be removed if its result is used.

And why is the ebIgnore case different from the other two?

If exception behavior is ebIgnore, such calls will get attribute SDNodeFlags::NoFPExcept in DAG and such instructions do not have side effects. But setting ReadNone for instructions with ebIgnore allows removal of these instructions earlier, at IR level, which could have positive effect.

And this isn't a harmless change either if it gets left behind -- e.g. it invalidates MemorySSA.

Comparison of two floating point numbers do not use memory access. But it can change bits in the floating point state register (only Invalid bit can be set). This change is emulated as memory access so that instruction be ordered correctly. This is why constrained intrinsics declared with attribute IntrInaccessibleMemOnly. As no actual memory access occurs, it is harmless to set ReadNone in this case.

llvm/lib/IR/Instructions.cpp
4134 ↗(On Diff #374511)

No, this is only a code moving.

llvm/test/Transforms/InstSimplify/constfold-constrained.ll
420

Added tests with "maytrap".

Set ReadNone for all intrinsics if they can be evaluated without raising signals

nikic added a subscriber: lebedev.ri.
nikic added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1798

Constant folding is not an analysis, it changes IR. I don't know why this file is in Analysis directory.

The constant folding analysis does not change IR. Users of the constant folding analysis change IR based on the analysis result.

Comparison of two floating point numbers do not use memory access. But it can change bits in the floating point state register (only Invalid bit can be set). This change is emulated as memory access so that instruction be ordered correctly. This is why constrained intrinsics declared with attribute IntrInaccessibleMemOnly. As no actual memory access occurs, it is harmless to set ReadNone in this case.

My point here was that adding a readnone attribute invalidates MemorySSA, because it means that the instruction should no longer have a MemoryAccess -- it would result in a verification failure. Just calling ConstFold/InstSimplify should never have this kind of effect.

Unless @spatel or @lebedev.ri tell me I'm wrong here, I believe this should be considered a blocker for further work in this area.

lebedev.ri requested changes to this revision.Oct 2 2021, 9:36 AM

Unless @spatel or @lebedev.ri tell me I'm wrong here, I believe this should be considered a blocker for further work in this area.

This revision now requires changes to proceed.Oct 2 2021, 9:36 AM
sepavloff updated this revision to Diff 388475.Nov 19 2021, 5:22 AM

Move attribute modification to InstSimplify pass

Any feedback?

craig.topper added a comment.EditedNov 29 2021, 1:47 PM

Should the change that effects fadd, fma, etc. tests be moved to a different patch that is a pre-requisite of the compare change?

llvm/lib/Analysis/ConstantFolding.cpp
2312

Does ConstrainedFPCmpIntrinsic have any method for determining fcmp vs fcmps? If not should it?

llvm/lib/IR/Instructions.cpp
4243 ↗(On Diff #388475)

This feels a little like it shouldn't be part of FCmpInst, but I don't have a concrete suggestion of where to put it instead. ConstantFold.cpp feels like the right home, but unfortunatley the header for that isn't visible to ConstantFolding.cpp.

sepavloff updated this revision to Diff 391591.Dec 3 2021, 2:33 AM

Addressed reviewer's notes

Should the change that effects fadd, fma, etc. tests be moved to a different patch that is a pre-requisite of the compare change?

Yes, it is a good idea to move this debatable change into a separate patch, it is here: D114766. This patch does not depend on it.

llvm/lib/Analysis/ConstantFolding.cpp
2312

Probably, but it does not help here. The interface is designed to evaluate constant value without construction of a node. ConstrainedFPCmpIntrinsic can be extracted from Call but checking intrinsic ID seems simpler.

llvm/lib/IR/Instructions.cpp
4243 ↗(On Diff #388475)

I moved ConstantFold to include directory in dependency patch and put evaluatePredicate there.

nikic added inline comments.Dec 3 2021, 8:26 AM
llvm/lib/Analysis/ConstantFolding.cpp
2312

FYI recently ICmpInst::compare() was added for the corresponding operation on ICmpInsts (in https://github.com/llvm/llvm-project/commit/25043c8276644e684f8d14cd4cadaa87a7e99b0e), so it might make sense to have the same method on FCmpInst. (No strong opinion on placement though.)

craig.topper added inline comments.Dec 3 2021, 4:00 PM
llvm/lib/Analysis/ConstantFolding.cpp
2310

auto *FCmp I believe LLVM coding standards prefer to keep the * around with auto so things that are pointers are obvious.

2312

The ConstrainedFPCmpIntrinsic is already being extracted from the call to get the predicate. So I thought it would make sense to get the signaling state from it as well and not pass the intrinsic ID at all.

sepavloff updated this revision to Diff 392020.Dec 6 2021, 4:14 AM

Changed arguments of evaluateCompare and rebased

sepavloff marked an inline comment as done.Dec 6 2021, 6:01 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
2312

Changed signature of evaluateCompare and fixed interface of ConstrainedFPCmpIntrinsic accordingly.

Is there anything I can do to make this patch acceptable?

kpn added a comment.Dec 15 2021, 8:43 AM

My guess:

Removing the addition of the readnone attribute? It's a correctness issue. We don't have a solution to the performance issue in generated code, but correctness beats performance every time.

My guess:

Removing the addition of the readnone attribute? It's a correctness issue. We don't have a solution to the performance issue in generated code, but correctness beats performance every time.

This patch now does not touch this attribute, this code is already removed.

kpn added inline comments.Dec 15 2021, 9:16 AM
llvm/lib/Analysis/ConstantFolding.cpp
1798

Here. These two lines still need to be removed. Along with the comment, of course.

sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1798

Done in the separate patch: https://reviews.llvm.org/D115870

spatel added inline comments.Dec 16 2021, 5:57 AM
llvm/lib/IR/ConstantFold.cpp
2021 ↗(On Diff #392020)

This was suggested in an earlier comment - can we add this next to:
https://github.com/llvm/llvm-project/blob/80b1d472d6f7e10403e37cc74bfda278372ecb97/llvm/lib/IR/Instructions.cpp#L4135

(and adjust the names/parameters for consistency)

That can be a preliminary/NFC commit if I'm seeing it correctly.

sepavloff added inline comments.Dec 16 2021, 6:42 AM
llvm/lib/IR/ConstantFold.cpp
2021 ↗(On Diff #392020)

Previously such method existed but I was recommended to remove it: https://reviews.llvm.org/D110322?id=388475#3159399.

Otherwise there is no problem to restore that implementation.

I have no objections to this patch now. I'd pre-commit the baseline tests though, and let's see if there's consensus on where to house the helper function.

llvm/lib/IR/ConstantFold.cpp
2021 ↗(On Diff #392020)

Ah, sorry I didn't see that earlier comment/revision.
@craig.topper - do you still prefer the current structure?

craig.topper added inline comments.Dec 18 2021, 11:23 PM
llvm/lib/IR/ConstantFold.cpp
2021 ↗(On Diff #392020)

I think the integer change happened while this patch was in review. I'm fine with restoring the previous implementation to make this consistent with integer.

sepavloff updated this revision to Diff 395476.Dec 20 2021, 9:15 AM

Put back method of FCmpInst for constant comparison evaluation.

Please pre-commit the baseline tests and FCmpInst::compare(), so we're left with only real logic changes and tests that show the change in behavior.

sepavloff updated this revision to Diff 395882.Dec 22 2021, 8:48 AM

FCmpInst::compare is moved to separate patch

sepavloff updated this revision to Diff 401959.Jan 21 2022, 5:24 AM

Rebased patch

Any feedback?

kpn accepted this revision.Feb 1 2022, 11:39 AM

LGTM

This revision was not accepted when it landed; it landed in state Needs Review.Feb 3 2022, 1:47 AM
This revision was automatically updated to reflect the committed changes.
bgraur added a subscriber: bgraur.Feb 25 2022, 1:57 PM

Hi,

This commit causes clang to crash when compiling the attached reproducer with the following compilation command:

clang -cc1 \
  -emit-obj \
  -target-feature +sse4.2 \
  -frounding-math \
  -O1 \
  -std=gnu++17 \
  -fsized-deallocation \
  -o /tmp/repro.o -x c++ repro.cc

Please note that the crash only reproduces something like 60% of cases (just re-run a few times).
Please revert.

Hi,

This commit causes clang to crash when compiling the attached reproducer with the following compilation command:

clang -cc1 \
  -emit-obj \
  -target-feature +sse4.2 \
  -frounding-math \
  -O1 \
  -std=gnu++17 \
  -fsized-deallocation \
  -o /tmp/repro.o -x c++ repro.cc

Please note that the crash only reproduces something like 60% of cases (just re-run a few times).
Please revert.

Thank you for the report. There was an issue with treatment of vector constants. Commit https://reviews.llvm.org/rG6982c38cb120 fixes it.