IsInductionVar currently looks for nsw properties for the induction variable, and this patch now adds the use of nuw. isKnownNonNegative has also been added as a helper function which uses isLoopEntryGuardedByCond to check that the value is greater or equal than zero. If this is valid, I will extract it out into SCEV's isKnownNonNegative.
Details
Diff Detail
Event Timeline
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
705 | I don't think that these two checks make any sense, because isLoopEntryGuardedByCond makes trivial checks anyways. You are just duplicating efforts. | |
712 | Are different values of IsSigned flag really needed here? :) You could've created zero constant just once. | |
936 | This does not feel right. What if the indvar goes from SINT_MAX - 10 to SINT_MIN + 10 and, thus, has signed wrap? In this case it might have nuw, but if we deal with signed predicates, we might miscompile. I think the correct approach would be to identify whether the latch predicate is signed or unsigned, and if it is unsigned, then we may check nuw instead of nsw. | |
972 | If these two changes (i.e. checking of nuw flag and advanced non-negative) check can be separated into 2 patches, plase do. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
705 | It is strange, and maybe it is because of potential problem @ line 936. That is why I'd prefer these changes go separately. The part about isKnownNonNegative looks pretty safe, the part about nuw may be more fishy. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
928 | Why not (Signed && HasNoSignedWrap(AR)) || (!IsSigned && AR->hasNoUnsignedWrap())? | |
929 | Actually I look into it and the old implementation seems suspicious to me. We only check nsw, but when we don't have nuw and actually do have an unsigned wrap, why don't we have a bug on unsigned latch predicates? Or we do, but not aware about?.. :) | |
929 | If you look at HasNoSignedWrap, it has a bit more complex check than just looking into the immediate flag, it also tries to go through sext to find this flag. Can we do the same for nuw and zext to keep things consistent? | |
964 | It seems to be an independent change, please commit it separately as NFC. | |
test/Transforms/IRCE/variable-loop-bounds.ll | ||
328 | Same, this can go as NFC without approval. | |
361 | Please add CHECK or CHECK-NOT on existence/non-existence of preloop and postloop, so that we can make sure that IRCE does good job generating/not generating them. Also the same applies to the tests above, you can add these checks as NFC separately (not super-urgent, but useful). |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
929 | I'm trying to understand test_02 of decrementing_loop. The test looks like it should work, but I can't figure out how the transform is making the decision. If I try to check nuw on unsigned and nsw on signed values, this test stops being transformed because it figures the indvar is NSW but ULT is being used. |
Hi Sam,
In test_02 in decrementing loop %idx starts from %len.a and then make a range check on %idx < %len.a which always fails on 1st iteration. In practice it means that we should have a preloop (in this particular case it means that we deoptimize on 1st iteration). So IRCE here basically makes no sense, but if it DOES happen (i.e. the compiler is not smart enough to understand its non-profitability), it MUST make a preloop.
It does not have nuw flag because of %len.a is zero then %idx.next = sub i32 %idx, 1 is calculated with passing from 0 to -1 which is an unsigned overflow, so this instruction does not have a nuw, so SCEV might also not set nuw for the addrec.
I think (I didn't check it, though) that if you insert a check %len.a != 0 above the loop and only go to the loop with this condition, it will have nuw. I hope it helps!
- Max
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
929 |
No, I mean something like for (i = -10; i <u 10; i++) This does have nsw but does not have a nuw because traversal from -1 to 0 is an unsigned wrap. However in this case we will fail on 1st iteration, so it sholdn't be a bug in practice, but it's still fishy. | |
929 | See my comment, I hope it will help. I think that either inserting the check len.a != 0 before the loop or not expecting that IRCE will handle this case will be OK. I would prefer the first option of it works. |
Hi Max,
Thanks, the reason why I don't understand why nuw isn't used is because the range data !{i32 0, i32 2147483647} is used for %len.a and the loop is only entered if %len.a != 0. The loop then exits when %idx < 2, so surely we should be able to discern that nuw is applicable? isKnownNonNegativeInLoop is also not helping in this case, but it would seem there's enough information here. Could it be that SCEV not be using the range metadata in conjunction with the entry and exit conditions?
thanks,
sam
Max,
After starting and playing around for the day, I've got to the root question of: why do we do the wrap checks anyway? Aren't the following range checks in the IsIncreasing/Decreasing blocks enough?
Wow, I've overlooked that it already has 1st iteration check. :) Well, we cannot always expect that SCEV is smart enough to do something, maybe in this particular case it simply failed to prove the non-negativity of the indvar. It's not a bug, SCEV is just now as smart as humans are. I will take a look into this case when I have some time, maybe we can improve the situation by teaching SCEV to derive nuw here. Feel free to take a look into it as well.
Just omitting the nsw check is clearly wrong. Look at the logic for ne predicates.
if (Pred == ICmpInst::ICMP_NE && LatchBrExitIdx == 1) // while (++i != len) { while (++i < len) { // ... ---> ... // } } // If both parts are known non-negative, it is profitable to use // unsigned comparison in increasing loop. This allows us to make the // comparison check against "RightSCEV + 1" more optimistic. if (SE.isKnownNonNegative(IndVarStart) && SE.isKnownNonNegative(RightSCEV)) Pred = ICmpInst::ICMP_ULT; else Pred = ICmpInst::ICMP_SLT;
This will not be an equivalent transform if i is greater than len on 1st iteration but then overflows and can still reach len after that. So with ne we will make a lot of iterations and then exit the loop, and with lt we will fail on the 1st iteration.
However in other cases, when predicate was slt/ult from the beginning, it's an interesting question whether we need this condition or not. I will give it some thought and run fuzzer testing to see what happens, it should either expose some problems which are not obvious or give a clue that you might be right and we don't need NoWrap flag check at all.
Interesting... I've made a dummy patch which lifts the restricton on nsw in cases when we are not dealing with eq/ne, and it passed 5000 complex fuzzed tests (which usually fail if there is some obvious bug). I also don't see a place where we would really need the no wrap other than eq/ne.
@samparker do you have a chance to prepare a patch that lifts this restriction? I can do it as well, but I'm a bit loaded atm. If we can get rid of no-wrap requirements at all, it will be a way better than using nuw. :)
Now that we're only looking for nsw with equality predicates, we should now be able to use nuw as well as nsw. HasNoSignedWrap has been renamed and queries hasNoSelfWrap, while still attempting to also search for nsw.
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
894 | This is incorrect. If counter value goes from -1 to (SINT_MAX + 1) (both included), it has both signed and unsigned wrap and has no self-wrap. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
894 | Just to clarify the example: we start with i = -1 and go up with step of 1 and exit the loop when i = SINT_MAX + 1. In this case when we pass from -1 to 0 we are making an unsigned wrap and when we pass from SINT_MAX to SINT_MAX + 1 we are making a signed wrap. There is no self-wrap because this variable will never reach -1 again. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
907 | Do you have a plan to do the same for zext? I'm OK if it's in a follow-up patch. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
894 | It is not quite right. If either of nsw or nuw is set then nw is also present automatically. But nw can also be present of none of those is set, like in my example of iterations from -1 to -2^31 + 1. It crosses both signed and unsigned range borders but does not cross its initial value -1. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
894 | I mean, it iterates from -1 to min_int + 1 with step 1, through overflow. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
894 | Ah, ok, thanks. |
I don't think that these two checks make any sense, because isLoopEntryGuardedByCond makes trivial checks anyways. You are just duplicating efforts.