This is an archive of the discontinued LLVM Phabricator instance.

Add ability to use DependenceAnalysis from LoopAccessAnalysis
Needs ReviewPublic

Authored by hfinkel on May 3 2016, 12:53 PM.

Details

Summary

LoopAccessAnalysis performs dependence analysis for our loop vectorization and other loop transformation passes. Its builtin dependence analysis is fairly simple, and ends up missing a number of common cases. As a result, we end up inserting runtime checks where they're not really needed (or perhaps not vectorizing at all).

LLVM has a more-sophisticated DependenceAnalysis implementation which does catch more of these cases. This patch adds a mode, disabled by default, under which we can use DependenceAnalysis from LoopAccessAnalysis. I've included an example test case in which the use of DependenceAnalysis allows us to elide unnecessary runtime checks.

We should discuss whether we should have, as a general refactoring goal, having LoopAccessAnalysis use DependenceAnalysis for all of its dependence-analysis needs. And if not, why not.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 56045.May 3 2016, 12:53 PM
hfinkel retitled this revision from to Add ability to use DependenceAnalysis from LoopAccessAnalysis.
hfinkel updated this object.
hfinkel added a subscriber: llvm-commits.
anemet edited edge metadata.May 3 2016, 2:06 PM

This is great! I will look at this but quickly, did you see any performance gains with this?

This is great! I will look at this but quickly, did you see any performance gains with this?

Thanks! I've not run the test suite, etc. yet. I was looking at the performance of a loop that follows the pattern in the test case (reading from one part of an array and writing to another part), and this patch definitely helps that case.

Overall, this looks like a very good goal.

The only problem that I can see would be that the Dependence Analysis won't be able to use the extra run-time information from loop versioning. In these cases, the current built-in logic might get a NoDep (there probably is a way to fix this).

etherzhhb added inline comments.
lib/Analysis/LoopAccessAnalysis.cpp
1245

In order to address @sbaranga's comment, perhaps we can rewrite Dist with SCEVPredicateRewriter using the existing Predicate from PSE to incorporate the existing runtime checks.

sbaranga added inline comments.May 5 2016, 8:12 AM
lib/Analysis/LoopAccessAnalysis.cpp
1227

DA is independent of the loop versioning. Therefore if we have NoDep here then the versioning performed by replaceSymbolicStrideSCEV/isStridedPtr wouldn't help. So by moving this above the replaceSymbolicStrideSCEV calls, we could reduce the number of run-time checks needed.

1245

Yes, something like that can be used to keep the existing features of the LAA dependece analysis, but it wouldn't help if we wanted to refactor and replace this logic entirely with DependenceAnalysis.

What are we getting by using the Dep->getDistance? It might be better to just keep the old distance calculation (which should be equivalent and avoid additional SCEV rewritting).

etherzhhb added inline comments.
lib/Analysis/LoopAccessAnalysis.cpp
1227

Yes, we can call DA early to get the distance SCEV early and avoid the later replaceSymbolicStrideSCEV.

If the result provided by DA is more precise, it is also useful in this stage. See below.

1245

If DependenceAnalysis (and later the analysis from Polly, see one of this year's GSoC mentored by @jdoerfert) provide more precise result, and we can rewirte the SCEV returned by DependenceAnalysis with the predicates introduced by LoopAccessAnalysis to provide even more precise results, we may have some gain here.

sbaranga added inline comments.May 9 2016, 9:14 AM
lib/Analysis/LoopAccessAnalysis.cpp
1245

That should be fine in the long run (and looks like it will have to be touched in the GSoC work), my only concern was that it might not make sense to do this it in the current change if it doesn't bring any gain (it would be over-engineering?).

Anyway, if we do want to use the distance returned by the DependenceAnalysis, we would have to do two things:

  • we would need to record this distance in the dependences produced by LLA, since other passes consume this (see isDependenceDistanceOfOne in LLE) and are currently assuming that the distance is computed by subtracting the SCEVs. There might be other examples that I'm not aware of.
  • add the boilerplate to PSE to version the SCEV directly (should be simple).

It would also be really nice if we could figure out a good (and common) interface for the dependence result (which would be part of the refactoring goal anyway).

hfinkel updated this revision to Diff 63030.Jul 6 2016, 9:44 PM
hfinkel edited edge metadata.

Rebased (no other changes yet)