as well as `[zs]ext(C + x + ...) -> (D + [zs]ext(C-D + x + ...))<nuw><nsw>`

if the top level addition in `(D + (C-D + x * n))` could be proven to

not wrap, where the choice of `D` also maximizes the number of trailing

zeroes of `(C-D + x * n)`, ensuring homogeneous behaviour of the

transformation and better canonicalization of such `AddRec`'s

(indeed, there are `2^(2w)` different expressions in `B1 + ext(B2 + Y)` form for

the same `Y`, but only `2^(2w - k)` different expressions in the resulting

`B3 + ext((B4 * 2^k) + Y)` form, where `w` is the bit width of the integral type)

The `AddExpr` version of the transformation enables better canonicalization

of expressions like

1 + zext(5 + 20 * %x + 24 * %y) and zext(6 + 20 * %x + 24 * %y)

which get both transformed to

2 + zext(4 + 20 * %x + 24 * %y)

This pattern is common in address arithmetics and the transformation

makes it easier for passes like `LoadStoreVectorizer` to prove that 2 or

more memory accesses are consecutive and optimize (vectorize) them.

I found this change similar to a number of other changes to Scalar Evolution, namely:

commit 63c52aea76b530d155ec6913d5c3bbe1ecd82ad8 Author: Sanjoy Das <sanjoy@playingwithpointers.com> Date: Thu Oct 22 19:57:38 2015 +0000 [SCEV] Commute zero extends through <nuw> additions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251052 91177308-0d34-0410-b5e6-96231b3b80d8

commit 3edd5bf90828613bacfdc2ce047d3776363123e5 Author: Justin Lebar <jlebar@google.com> Date: Thu Jun 14 17:13:48 2018 +0000 [SCEV] Simplify zext/trunc idiom that appears when handling bitmasks. Summary: Specifically, we transform zext(2^K * (trunc X to iN)) to iM -> 2^K * (zext(trunc X to i{N-K}) to iM)<nuw> This is helpful because pulling the 2^K out of the zext allows further optimizations. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits, timshen Differential Revision: https://reviews.llvm.org/D48158

and the most relevant

commit 45788be6e2603ecfc149f43df1a6d5e04c5734d8 Author: Michael Zolotukhin <mzolotukhin@apple.com> Date: Sat May 24 08:09:57 2014 +0000 Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209568 91177308-0d34-0410-b5e6-96231b3b80d8

This patch generalizes the latter one by relaxing the requirements the following way:

`C2`doesn't have to be a power of 2, it enough if it's divisible by 2 a sufficient number of times;`C1`doesn't have to be less than`C2`, instead of extracting the entire C1 we can split it into 2 terms: (`00...0XXX`+`YY...Y000`), keep the second one that may cause wrapping within the extension operator, and move the first one that doesn't affect wrapping out of the extension operator, enabling further simplifications;`C1`and`C2`don't have to be positive, splitting`C1`like shown above produces a sum that is guaranteed to not wrap, signed or unsigned;- in
`AddExpr`case there could be more than 2 terms, and in case of`AddExpr`the 2nd and following terms and in case of`AddRecExpr`the`Step`component don't have to be in the`C2*X`form or constant (respectively), they just need to have enough trailing zeros, which in turn could be guaranteed by means other than arithmetics, e.g. by a pointer alignment; - the extension operator doesn't have to be a sext, the same transformation works and profitable for zext's as well.

Apparently, optimizations like `SLPVectorizer` currently fail to

vectorize even rather trivial cases like the following:

double bar(double *a, unsigned n) {

double x = 0.0; double y = 0.0; for (unsigned i = 0; i < n; i += 2) { x += a[i]; y += a[i + 1]; } return x * y;

}

If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm`

(`!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"}`)

it produces scalar code with the loop not unrolled with the unsigned `n` and

`i` (like shown above), but vectorized and unrolled loop with signed `n` and

`i` (follow https://godbolt.org/g/nq9xF8 to play with it).

With the changes made in this commit the unsigned version will be

vectorized (though not unrolled for unclear reasons).