If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop. This allows non-aliasing information to be used by subsequent
One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops. To vectorize we
version this loop to fully disambiguate may-aliasing accesses. If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop. The overall
performance improves by 18% on our ARM64.
The scoped noalias annotation is added in LoopVersioning. The patch
then enables this for loop distribution. A follow-on patch will enable
it for the vectorizer. Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.
Essentially my approach was to have a separate alias domain for each
versioning of the loop. For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning. By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.
As written, I also have a separate domain for each loop. This is not
necessary and we could save some metadata here by using the same domain
across the different loops. I don't think it's a big deal either way.
Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers. I have plenty of comments
in the tests.
Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions. This is also
why the maps have to become part of the object state.
Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline. Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.