The code incorrectly inferred that the relationship of a constant expression X to itself is FCMP_OEQ (ordered and equal), when it's actually FCMP_UEQ (unordered or equal).
This corrects that, and adds some more limited folds that can be done in this case.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
What i think is missing here, is that
The code incorrectly inferred that the relationship of a constant expression X to itself is FCMP_OEQ (ordered and equal), when it's actually FCMP_UEQ (unordered or equal).
simply states it as the matter of a fact.
Clearly, since this patch is here, the previous choice was wrong (i guess?).
So some more blurb with explanation as to why it is so would be good.
(i'm guessing nan / inf etc?)
Clearly, since this patch is here, the previous choice was wrong (i guess?).
So some more blurb with explanation as to why it is so would be good.
(i'm guessing nan / inf etc?)
First, I'll note that usually when trying to fold a FCmp with constant operands, the operands will be ConstantFP instead of ConstantExpr, so they'll hit the code path at ConstantFold.cpp:1789. This code path is only used when one of the operands is constant, but not a literal floating-point value, or I think this bug would be breaking lots of basic stuff. I couldn't find a way to hit this path with Clang, for example.
If V1 == V2 in evaluateFCmpRelation, you can infer that the value of V1 must be the same as the value of V2. But the value may be a NaN, in which case the relation is "unordered". So you cannot infer that the relation is ordered *and* equal, you can only infer that it must be unordered *or* equal.
The incorrectly inferred relation caused ConstantFoldCompareInstruction to fold predicate(X,X) to a literal i1 for all predicates, but it should only be possible to fold it to a literal i1 for two predicates: FCMP_ONE (ordered and not equal, which must be false) and FCMP_UEQ (unordered or equal, which must be true).
Looking at the test now, I think it might be possible to get rid of the vector->bitcast->extractelement path. When I wrote it, it was the only way I found to get an unfolded ConstantExpr through to CreateFCmp. I think it might work to bitcast a global variable pointer to a float, though.
I made the test a little simpler by bitcasting a GlobalVariable pointer to a float to get a FP typed ConstantExpr, instead of using a bitcast vector.
But the value may be a NaN, in which case the relation is "unordered".
Hm, can we use fast-math flags here? Namely, nnan.
So you cannot infer that the relation is ordered *and* equal, you can only infer that it must be unordered *or* equal.
The incorrectly inferred relation caused ConstantFoldCompareInstruction to fold predicate(X,X) to a literal i1 for all predicates, but it should only be possible to fold it to a literal i1 for two predicates: FCMP_ONE (ordered and not equal, which must be false) and FCMP_UEQ (unordered or equal, which must be true).
Looking at the test now, I think it might be possible to get rid of the vector->bitcast->extractelement path. When I wrote it, it was the only way I found to get an unfolded ConstantExpr through to CreateFCmp. I think it might work to bitcast a global variable pointer to a float, though.
Hm, can we use fast-math flags here? Namely, nnan.
It's possible, but not a trivial change. I added a nullable const FastMathFlags* parameter to:
- evaluateFCmpRelation and ConstantFoldCompareInstruction
- ConstantExpr::getCompare and ConstantExpr::getFCmp
- TargetFolder::CreateFCmp, ConstantFolder::CreateFCmp, and NoFolder
Then, IRBuilder::CreateFCmp can pass its current FastMathFlags through to the constant folder. Most places in LLVM I saw that were calling IRBuilder::CreateFCmp seemed to be setting the FastMathFlags, but there were a few transforms that didn't seem to be propagating them through:
- foldLogicOfFCmps in InstCombineAndOrXor.cpp
- foldVectorCmp in InstCombineCompares.cpp
- foldOpIntoPhi in InstructionCombining.cpp (line 993)
- FCmpSplitter in Scalarizer.cpp
- vectorizeTree(TreeEntry*) in SLPVectorizer.cpp (line 3223)
I stopped trying to propagate the fast-math flags dependency out when I got to ConstantFolding.cpp (i.e. the constant folding pass rather than the IRBuilder constant folder). Since ConstantExpr doesn't have fast-math flags, there isn't a way to transmit the fast-math flags from whatever produced the ConstantExpr.
So I think it's doable, but beyond the scope of this change.
Some IR tests added here:
rL353992
I think that's better than IRBuilder tests. Feel free to add other predicates if you think they are needed, but I think those 4 give us good coverage.
lib/IR/ConstantFold.cpp | ||
---|---|---|
1364 | Can we assert that either V1 or V2 is a constant expression? // We do not know if a constant expression will evaluate to a number or NaN. // Therefore, we can only say that the relation is unordered or equal. |
Added comment, updated the new bitcast.ll tests to reflect the fix, and removed the superfluous IRBuilder tests.
I also moved the diff from rL to rG because the old LLVM Github mirror I was using stopped updating (https://github.com/llvm-mirror/llvm).
Can we assert that either V1 or V2 is a constant expression?
Either way, a bit more code comment (as suggested previously I think) would be helpful.
How about: