Fixes https://github.com/llvm/llvm-project/issues/63896 and https://github.com/rust-lang/rust/issues/113757.
This patch adds facts implied by llvm.smin/smax/umin/umax intrinsics.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | I don't think this is right. It does not correctly represent where the fact will apply. This should be rooted at a branch/assume, just like the normal icmp handling. Likely the fact in the worklist should just be the icmp, and we should only handle the min/max when adding it to the constraint system. | |
llvm/test/Transforms/ConstraintElimination/minmax.ll | ||
28 | Needs more test coverage for different min/max and different predicates. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | Hmm not sure if it is actually possible to show a miscompile with the above. I think one way to handle this would be to inject I <= I->getOperand(0), I <= I->getOperand(1) as facts here. That leaves the question on how to best synthesize such conditions here. The simplest way would be to create temporary ICMP instructions. Not sure what other people think about that though and if we need a more local/lightweight representation for conditions. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | I think the triple (ICmpInst::Predicate Pred, Value* Lhs, Value* Rhs) is better than (ICmpInst* Inst, bool Not) to represent a fact. |
- Rebase
- Add facts with two temporary ICmpInsts for MinMaxIntrinsic
- Add more tests with different predicates and minmax intrinsics.
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | I don't understand why we would want to inject facts at this point at all. We already have a fact for the icmp involving the min/max. Everything else can be handled when inserting that fact into the constraint system. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | I want to simplify min/max intrinsics in ConstraintElim. https://alive2.llvm.org/ce/z/2RQVy5 define i32 @src(i32 noundef %x, i32 noundef %y, i32 noundef %z) { entry: %cmp = icmp slt i32 %x, %y br i1 %cmp, label %if, label %end if: %max1 = call i32 @llvm.smax.i32(i32 %x, i32 %z) %max2 = call i32 @llvm.smax.i32(i32 %y, i32 %max1) ret i32 %max2 end: ret i32 0 } define i32 @tgt(i32 noundef %x, i32 noundef %y, i32 noundef %z) { entry: %cmp = icmp slt i32 %x, %y br i1 %cmp, label %if, label %end if: %max1 = call i32 @llvm.smax.i32(i32 %y, i32 %z) ret i32 %max1 end: ret i32 0 } declare i32 @llvm.smax.i32(i32, i32) This transformation cannot be handled by InstCombine. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 |
Yep, both are possibilities. If we insert them when they are used at a compare, we would probably need to do it driven by the decomposition logic so we catch cases where the umax is used by more complex expressions. This probably will end up slightly more complicated code-wise, but the advantage would be that we only need to add the additional facts when they are actually used. Queuing them here directly is probably simpler overall in terms of code at the cost of adding them unnecessarily in some cases . When we add the facts when handling the compares, we may add the same facts multiple times if the min/max is used in multiple places on the other hand. I think both approaches are fine, it would be good to see if they can be added elegantly directly when simplifying the compares |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 |
This seems to be orthogonal to the current patch. This would require inserting a check for the min/max, not a fact. Simplifying the min/max itself should be pretty straightforward, with the same basic approach as we have for with.overflow intrinsics. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | In the above case, we will miss this optimization when just adding facts implied by min/max iff they are used by icmp. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | Ugh, I completely misunderstood what this patch is doing. Ignore everything I've said above. I assume the reason why we can add the fact at the "wrong" position (start of the block) is that it only becomes meaningful once the value is defined. My remaining question here would be whether we can handle min/max similarly to assume, i.e. just push the min/max instruction as the "fact" and then decompose it into the two conditions as part of eliminateConstraints(). At that point we no longer need actual icmp instructions. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 | That is what this patch used to do. |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 |
Indeed! I think your first version was the right way to do this. Sorry for all the confusion I caused.
It seems like this should be easy to avoid by storing Pred + LHS + RHS in ReproducerEntry instead of the CmpInst. The reproducer generation doesn't need an actual instruction (this would allow us to get rid of the awkward IsNot flag as well). |
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
788 |
Would probably be good to do this cleanup separately first. | |
llvm/test/Transforms/ConstraintElimination/minmax.ll | ||
2 | could you add the tests as a separate patch and then only include the improved check lines in this patch? | |
17 | would be good to also have tests with different second args and conditions also checking the second arg and perhaps some tests with more complicated expressions using the result of the umax (e.g. use it in an add that's then compared) | |
22 | it would probably be good to have tests with signed predicates and other combinations as well |
Implementation looks good to me.
llvm/lib/Transforms/Scalar/ConstraintElimination.cpp | ||
---|---|---|
1380 | Unnecessary braces | |
llvm/test/Transforms/ConstraintElimination/minmax.ll | ||
32–33 | These tests could be more compact by testing both ugt and uge inside the if. The standard pattern in ConstraintElim tests seems to be to combine multiple conditions with xor i1. | |
364 | I think you're current missing a test for mixing signed and unsigned predicates. I'd also suggesting to test something like x pred min(x, y), where there is no branch involved, and you're just directly using the fact implied by the min/max. |
LGTM, thanks! Please make sure there also are tests that use mixed signed & unsigned predicates as per @nikic's comment. I could't spot one in the latest version but maybe I missed it.
I think you're current missing a test for mixing signed and unsigned predicates.
in function smin_ule_mixed at line 255
I'd also suggesting to test something like x pred min(x, y), where there is no branch involved, and you're just directly using the fact implied by the min/max.
in function smin_branchless at line 319
llvm/test/Transforms/ConstraintElimination/minmax.ll | ||
---|---|---|
364 |
|
I don't think this is right. It does not correctly represent where the fact will apply. This should be rooted at a branch/assume, just like the normal icmp handling.
Likely the fact in the worklist should just be the icmp, and we should only handle the min/max when adding it to the constraint system.