The default value of MaxLookup is 6, which limits the number of instructions that are stripped off when looking up the underlying object. The 'Loop Unroll' and 'Loop Strength Reduction' passes tend to replace the memory index 'base + i * offset' with 'base, base += offset, base += offset'. This increases the depth of the underlying object, so pointsToConstantMemory may fail to identify that a pointer's underlying object is a NoAlias and ReadOnly Argument. That leads to false memory load scheduling dependencies and prevents the instruction scheduler from pipelining the memory load operations.
The default MaxLookup is too small for this case. In general, even the first memory load has a GEP and a bitcast. The false memory dependency begins at the fifth memory load from global memory (an argument), which results in less efficient assembly code.
Specifying a larger MaxLookup value or even an unlimited MaxLookup (i.e., 0) solves this problem.
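
The effect of the lookup limit can be illustrated with a small toy model. This is only a sketch under stated assumptions: Node, Kind, underlyingObject, reachesArgument, buildChain and demo are hypothetical names made up for illustration and are not LLVM's real classes or API. The model assumes only that each lookup step strips one GEP or bitcast and that a limit of 0 means unlimited, as described above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model of a pointer expression chain (not LLVM's IR classes).
enum class Kind { Argument, GEP, BitCast };

struct Node {
    Kind kind;
    const Node *operand; // nullptr for an Argument
};

// Strip at most maxLookup wrapping instructions; 0 is treated as unlimited.
const Node *underlyingObject(const Node *v, unsigned maxLookup) {
    for (unsigned count = 0; maxLookup == 0 || count < maxLookup; ++count) {
        if (v->kind == Kind::Argument)
            return v;       // reached the base object
        v = v->operand;     // strip one GEP or bitcast
    }
    return v;               // limit hit; v may still be an intermediate node
}

bool reachesArgument(const Node *v, unsigned maxLookup) {
    return underlyingObject(v, maxLookup)->kind == Kind::Argument;
}

// Build: argument <- bitcast <- gep <- gep <- ... (geps GEPs in total),
// mimicking the 'base, base += offset, base += offset' pattern.
std::vector<Node> buildChain(unsigned geps) {
    std::vector<Node> chain;
    chain.reserve(geps + 2); // no reallocation, so stored pointers stay valid
    chain.push_back({Kind::Argument, nullptr});
    chain.push_back({Kind::BitCast, &chain[0]});
    for (unsigned i = 0; i < geps; ++i)
        chain.push_back({Kind::GEP, &chain.back()});
    return chain;
}

void demo() {
    std::vector<Node> shallow = buildChain(1); // first load: GEP + bitcast
    std::vector<Node> deep = buildChain(6);    // seven instructions deep
    assert(reachesArgument(&shallow.back(), 6));
    assert(!reachesArgument(&deep.back(), 6)); // the limit of 6 is too small
    assert(reachesArgument(&deep.back(), 0));  // unlimited lookup succeeds
}
```

In this model, a pointer whose chain is deeper than the limit is no longer resolved to its argument, which corresponds to the case where the NoAlias and ReadOnly properties can no longer be proven. The exact load at which this happens in real code depends on the instructions actually emitted.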