In case of modulo compared to zero, we need to do signed modulo operation as unsigned can give different results based on whether the dividend is negative or not.
This addresses PR27707.
Differential D20145
Fix modulo compared to zero chrisj on May 10 2016, 5:51 PM. Authored by
Details In case of modulo compared to zero, we need to do signed modulo operation as unsigned can give different results based on whether the dividend is negative or not. This addresses PR27707.
Diff Detail
Event TimelineComment Actions Hi Chris, thank you for directly providing a patch for the issues you find! Regarding this patch, I wonder what exactly is the issue you are seeing. My feeling is this might just be a lack of documentation on the ast expression nodes isl is generating. Specifically, in case isl generates a node isl_ast_op_zdiv_r, it has proven that the node is only ever compared against zero. So we can replace srem with urem as long as for all operands where srem yields zero, urem also yields zero and for operands where srem yields a non-zero result, urem also yields a non-zero result. The code was written under the assumption that this is the case. In http://llvm.org/PR27707 you give the example "For example if (a1 - b1) = -48 (0xFFFFFFFFFFFFFFD0), the modulo with 24 would be 16". This seems to be still correct. Even though the result computed in %pexp.zdiv_r = urem i64 %25, 24 is different, the result of the comparision is again true for both cases: %26 = icmp eq i64 %pexp.zdiv_r, 0. Could you clarify if this problem has ever introduced misbehavior in programs compiled with Polly. Best, Comment Actions Hi Tobi,
For the mentioned example the signed remainder operation would result in 0 and while with unsigned remainder we get 16. Since isl_ast_op_zdiv_r is compared against zero, the polly.cond would take different paths based on the type of remainder operation. We found that this caused some failures in tests where there were pointer iterations and the polly.cond had a check for modulo with a non-power of 2. Below is an example, #include<stdio.h> #include<stdlib.h> typedef struct{ int a; long long b; char c; }A; void foo(A * a1, A * b1) { A *start; for (start = (void*) a1; start != b1; start++) { start->a += 2; } } A *a; int main (){ a = (A*) calloc (100, sizeof (A)); foo (&a[0], &a[48]); if (a[0].a != 2) printf ("FAIL!\n"); else printf ("PASS!\n"); free (a); return 1; } Thanks, Comment Actions Other examples: -1 srem 3 = -1
Comment Actions Hi Michael, I synched up with Zino about the weekly polly discussion regarding this issue and to summarize,
I will modify the patch to revert it back to srem for isl_ast_op_zdiv_r. But wouldnt it be more easier to do the optimization in polly rather than pattern match in llvm, since in polly we know exactly that we are comparing against zero. Thanks, Comment Actions My argument is/was that the transformation is more general than only in that case. If the urem form is faster, we can optimize more code in InstCombine. It already does this transformation, eg. X srem Y -> X urem Y, iff X and Y don't have sign bit set. The only additional work is to match whether the result is compared to zero, which we already know in zdiv_r. That shouldn't be a difficult check, In fact, InstCombine already checks where that result of an argument is used. The file InstCombineSimplifyDemanded.cpp is only about this. There is also a isKnownToBeAPowerOfTwo that does not only work on constants. Please note that this is my subjective idea of separation of concerns and I don't have a strong opinion about it. If you find it too hard to get it into InstCombine, and a worthy transformation, just leave it as it is. Comment Actions Hi Tobi, I pushed the patch. As per Michaels' suggestion I would do the transformation in llvm. In polly, I reverted it back to signed modulo for compare against zero for correctness. Thanks, |