This is an archive of the discontinued LLVM Phabricator instance.

Fix modulo compared to zero
ClosedPublic

Authored by chrisj on May 10 2016, 5:51 PM.

Details

Summary

In case of modulo compared to zero, we need to do signed modulo operation as unsigned can give different results based on whether the dividend is negative or not.

This addresses PR27707.

Diff Detail

Repository
rL LLVM

Event Timeline

chrisj updated this revision to Diff 56843.May 10 2016, 5:51 PM
chrisj retitled this revision from to Fix modulo compared to zero.
chrisj updated this object.
chrisj set the repository for this revision to rL LLVM.
chrisj updated this object.May 10 2016, 5:59 PM
chrisj added a reviewer: grosser.
chrisj added a project: Restricted Project.
chrisj added a subscriber: pollydev.
chrisj added a reviewer: Unknown Object (User).May 10 2016, 6:01 PM
grosser edited edge metadata.May 10 2016, 10:51 PM

Hi Chris,

thank you for directly providing a patch for the issues you find!

Regarding this patch, I wonder what exactly is the issue you are seeing. My feeling is this might just be a lack of documentation on the ast expression nodes isl is generating. Specifically, in case isl generates a node isl_ast_op_zdiv_r, it has proven that the node is only ever compared against zero. So we can replace srem with urem as long as for all operands where srem yields zero, urem also yields zero and for operands where srem yields a non-zero result, urem also yields a non-zero result. The code was written under the assumption that this is the case.

In http://llvm.org/PR27707 you give the example "For example if (a1 - b1) = -48 (0xFFFFFFFFFFFFFFD0), the modulo with 24 would be 16". This seems to be still correct. Even though the result computed in %pexp.zdiv_r = urem i64 %25, 24 is different, the result of the comparision is again true for both cases: %26 = icmp eq i64 %pexp.zdiv_r, 0.

Could you clarify if this problem has ever introduced misbehavior in programs compiled with Polly.

Best,
Tobias

chrisj added a subscriber: chrisj.EditedMay 13 2016, 6:46 PM

Hi Tobi,

In http://llvm.org/PR27707 you give the example "For example if (a1 - b1) = -48 (0xFFFFFFFFFFFFFFD0), the modulo with 24 would be 16". This seems to be still correct. Even though the result computed in %pexp.zdiv_r = urem i64 %25, 24 is different, the result of the comparision is again true for both cases: %26 = icmp eq i64 %pexp.zdiv_r, 0.

For the mentioned example the signed remainder operation would result in 0 and while with unsigned remainder we get 16. Since isl_ast_op_zdiv_r is compared against zero, the polly.cond would take different paths based on the type of remainder operation.

We found that this caused some failures in tests where there were pointer iterations and the polly.cond had a check for modulo with a non-power of 2. Below is an example,

#include<stdio.h>
#include<stdlib.h>
typedef struct{
  int a;
  long long b;
  char c;
}A;
void   foo(A * a1, A * b1) {
  A *start;
  for (start = (void*) a1; start != b1; start++) {
    start->a += 2;
  }
}

A *a;
int main (){
  a = (A*) calloc (100, sizeof (A));
  foo (&a[0], &a[48]);
  if (a[0].a != 2)
    printf ("FAIL!\n");
  else
    printf ("PASS!\n");
  free (a);
  return 1;
}

Thanks,
Chris

Other examples:
-3 srem 3 = 0
-3 urem 3 = 0xFFFFFFFD rem 3 = 1

-1 srem 3 = -1
-1 urem 3 = 0xFFFFFFFF rem 3 = 0

lib/CodeGen/IslExprBuilder.cpp
281–287 ↗(On Diff #56843)

This is usually the domain of InstCombine, although I couldn't find a match for such cases in the current InstCombineMulDivRem.cpp

Hi Michael,

I synched up with Zino about the weekly polly discussion regarding this issue and to summarize,

  • Keep srem for isl_ast_op_zdiv_r
  • Have a llvm peep in InstCombine to optimize srem into urem

I will modify the patch to revert it back to srem for isl_ast_op_zdiv_r.

But wouldnt it be more easier to do the optimization in polly rather than pattern match in llvm, since in polly we know exactly that we are comparing against zero.

Thanks,
Chris

But wouldnt it be more easier to do the optimization in polly rather than pattern match in llvm, since in polly we know exactly that we are comparing against zero.

My argument is/was that the transformation is more general than only in that case. If the urem form is faster, we can optimize more code in InstCombine. It already does this transformation, eg. X srem Y -> X urem Y, iff X and Y don't have sign bit set.

The only additional work is to match whether the result is compared to zero, which we already know in zdiv_r. That shouldn't be a difficult check, In fact, InstCombine already checks where that result of an argument is used. The file InstCombineSimplifyDemanded.cpp is only about this. There is also a isKnownToBeAPowerOfTwo that does not only work on constants.

Please note that this is my subjective idea of separation of concerns and I don't have a strong opinion about it. If you find it too hard to get it into InstCombine, and a worthy transformation, just leave it as it is.

Hi Chris,

any update on this patch?

chrisj updated this revision to Diff 59151.May 31 2016, 4:43 PM
chrisj updated this object.
chrisj edited edge metadata.

Hi Tobi,

I pushed the patch. As per Michaels' suggestion I would do the transformation in llvm. In polly, I reverted it back to signed modulo for compare against zero for correctness.

Thanks,
Chris

Meinersbur accepted this revision.Jun 1 2016, 4:12 PM
Meinersbur added a reviewer: Meinersbur.
Meinersbur edited edge metadata.
This revision is now accepted and ready to land.Jun 1 2016, 4:12 PM
This revision was automatically updated to reflect the committed changes.