This implements an optimization described in Hacker's Delight 10-17: when C is constant, the result of X % C == 0 can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.
Original patch author (Dmytro Shynkevych) Notes:
- In principle, it's possible to also handle the X % C1 == C2 case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the REM can be reduced to just its LHS is included: the X % C == 0 optimization breaks test1 in test/CodeGen/X86/jump_sign.ll otherwise. I hadn't managed to find a better way to not generate worse output in this case.
- I haven't contributed to LLVM before, so I tried to select reviewers based on who I saw in other reviews. In particular, @kparzysz: a Hexagon test is modified and I have no familiarity with the architecture; hopefully my changes are valid.
Please can you not include the Divisor - we are aiming for full vector support for divisions, not just splats - see D50185