The results of this and the other 2 patches look promising. Some initial benchmarks show:
sortMemAccesses() is analogous to the formation of InterleaveGroups in the LoopVectorizer, which also scans a collection of loads (or stores) to determine whether they are adjacent in some order and can be combined into one vector load of a given width, and if so, in what order. This requires a single scan to compute the distances relative to the first access, as done here. But since we know we are looking for a permutation of a given width, we can more easily sort the accesses as they are entered into a map, while tracking the minimum and maximum indices. See insertMember() there.
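The map-based approach described above can be sketched roughly as follows. This is a standalone illustration, not the code from the patch or from the LoopVectorizer: the function name sortByOffset, its signature, and the representation of accesses as plain element offsets are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API). Offsets[I] is the distance, in
// elements, of access I from the first access, so Offsets[0] is normally 0.
// Returns the original positions of the accesses in ascending address order,
// or std::nullopt if the accesses are not a permutation of Width consecutive
// elements.
std::optional<std::vector<unsigned>>
sortByOffset(const std::vector<int64_t> &Offsets, unsigned Width) {
  assert(Width > 0 && "width must be positive");
  if (Offsets.size() != Width)
    return std::nullopt;

  std::map<int64_t, unsigned> Members; // offset -> original position
  int64_t Smallest = Offsets[0];
  int64_t Largest = Offsets[0];

  for (unsigned I = 0; I < Offsets.size(); ++I) {
    int64_t Off = Offsets[I];
    // Two accesses to the same slot cannot form a permutation.
    if (Members.count(Off))
      return std::nullopt;
    // Reject as soon as the covered span no longer fits in Width slots.
    int64_t NewSmallest = std::min(Smallest, Off);
    int64_t NewLargest = std::max(Largest, Off);
    if (NewLargest - NewSmallest >= static_cast<int64_t>(Width))
      return std::nullopt;
    Smallest = NewSmallest;
    Largest = NewLargest;
    Members[Off] = I;
  }

  // Width distinct offsets within a span of Width are necessarily consecutive.
  // std::map iterates in ascending key order, which gives the sorted order.
  std::vector<unsigned> Order;
  for (const auto &KV : Members)
    Order.push_back(KV.second);
  return Order;
}
```

Because the span check runs on every insertion, a non-adjacent access is rejected without a separate sorting pass.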
Remove TODO from test case
Add check for declaration to avoid post-link transformation of pipe functions.
This patch segfaults in the following three tests:
Refactor AMDGPULibFunc to handle unmangled lib functions.
This patch adds support for -polly-mse to the new PassManager infrastructure.
Trim the patch and avoid handling stack references for non-value MMOs to prevent LNT test failures.
Ping. @NoQ would you please have a look? Thanks!
Results with the patch.
Updated the patch as per Simon's comments.
Added the FP instruction itineraries, which include SSE4A and SHA instructions.
Updated the patch to use LVILatticeValue for tracking function parameter values (I moved it to a separate header file for now; I'd move it out in a separate patch if we decide to use it). Also updated tryToReplaceWithConstant to use range information to remove ICmp instructions.
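To illustrate how range information can remove a comparison, here is a minimal standalone sketch. It does not use LVILatticeValue or any LLVM class; ValueRange and foldSignedLessThan are hypothetical names introduced only for this example, and only a signed less-than against a constant is handled.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical inclusive range [Lo, Hi] of values a parameter may take.
struct ValueRange {
  int64_t Lo;
  int64_t Hi;
};

// Returns the known result of the signed comparison `X < C` for every X in R,
// or std::nullopt if the range alone does not decide the comparison.
std::optional<bool> foldSignedLessThan(ValueRange R, int64_t C) {
  assert(R.Lo <= R.Hi && "range must be non-empty");
  if (R.Hi < C)
    return true;  // every value in the range is below C
  if (R.Lo >= C)
    return false; // no value in the range is below C
  return std::nullopt; // the range straddles C, so the compare must stay
}
```

When the result is known, the comparison can be replaced by a constant; otherwise the instruction is left untouched.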