This patch is planned to include 2 commits:

The first commit:

Revert "[SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs"

This reverts r319889.

Unfortunately, wrapping flags are not a part of SCEV's identity (they

do not participate in computing a hash value or in equality

comparisons) and in fact they could be assigned after the fact w/o

rebuilding a SCEV.

Grep for const_cast's to see quite a few of examples, apparently all

for AddRec's at the moment.

So, if 2 expressions get built in 2 slightly different ways: one with

flags set in the beginning, the other with the flags attached later

on, we may end up with 2 expressions which are exactly the same but

have their operands swapped in one of the commutative N-ary

expressions, and at least one of them will have "sorted by complexity"

invariant broken.

2 identical SCEV's won't compare equal by pointer comparison as they

are supposed to.

A real-world reproducer is added as a regression test: the issue

described causes 2 identical SCEV expressions to have different order

of operands and therefore compare not equal, which in its turn

prevents LoadStoreVectorizer from vectorizing a pair of consecutive

loads.

On a larger example (the source of the test attached, which is a

bugpoint) I have seen even weirder behavior: adding a constant to an

existing SCEV changes the order of the existing terms, for instance,

getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)).

The second commit:

[SCEV] Compare SCEVs by a complexity rank before lexic-al comparison

after "important" ordering (by type of the SCEV node and by dominance

relation on Loops for AddRec's) is done.

The exact order of operands of associative and commutative SCEV

operators (add, mul, smax, and umax) does not matter but needs to be

consistent. Currently it's resolved by a deep lexicographical

comparision which could be expensive.

This commit adds a small hash value to every SCEV node which is

supposed to have the following properties:

- Be equal for SCEVs of the same lexicographical complexity
- Collide for SCEVs of different complexities as rarely as possible
- Increase in its numerical value for small SCEVs with the increase in their complexity in a sensible way to aid debugging and pretty-printing.

The hash is rougly a polynomial hash with one major difference: it

only adds up the hash values of operands of commutative and

associative operators. This way the complexity of such operators

does not depend on order of the operands, which is rather expected,

and the hash value does not wrap around as quickly as otherwise,

making sure the property (3) holds for larger SCEVs.

CompareSCEVComplexity is also changed so it compares SCEVs by the hash

(called complexity rank) after the "important" checks and resorts to a

full lexicographical comparision only if the ranks are the same.

Expected increase in memory usage is zero, including 32-bit

architectures.

The performance impact is measured on

`large-SCEVs-shallow-getSCEV-stack.ll`

from https://bugs.llvm.org/show_bug.cgi?id=32731

with https://reviews.llvm.org/D50985 patch applied and

ScalarEvolution::checkValidity disabled on an x86 machine (100 runs total, ms):

Before | After by W Flags | After by NC Rank | |
---|---|---|---|

CompareSCEVComplexity | 2,620 | 2,590 | 84 |

GroupByComplexity | 4,280 | 4,310 | 3,100 |

createSCEV | 34,180 | 34,100 | 32,840 |

The major goal of this patch is to make sure we don't have a compile

time regression due to revert of r319889 ("[SCEV][NFC] Check NoWrap

flags before lexicographical comparison of SCEVs")