This is an archive of the discontinued LLVM Phabricator instance.

[Dependence Analysis] Fix ExactSIV producing wrong analysis
ClosedPublic

Authored by artemrad on Apr 12 2021, 11:02 AM.

Details

Summary

Symptom: ExactSIV test produced incorrect analysis of dependencies see LIT tests
Bug: At the end of the algorithm when determining dependence direction original author forgot to divide intermediate results by gcd and round result toward zero

Although this bug can be fixed with significantly fewer changes I opted to write the code in such a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. This surprisingly results in shorter code, and fewer quotient and max/min calculations.

Changes Summary:

  • fixed findGCD to return valid x and y so that they match the function description where: ax - by = gcd(a,b)
  • Fixed ExactSIV test, to produce proper results
  • Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src, the extension also allows us to test whether Src depends on Dst, in one pass.
  • ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated.Since ExactRDIV test has very few changes from the core algorithm of ExactSIV I modified the test to have consistent format as ExactSIV.
  • Updated the LIT tests to be testing for correct values.

Diff Detail

Event Timeline

artemrad created this revision.Apr 12 2021, 11:02 AM
artemrad requested review of this revision.Apr 12 2021, 11:02 AM
artemrad updated this revision to Diff 337211.Apr 13 2021, 11:15 AM

Fixed clang-format issues

This change reverses almost all directions in the test cases. Was the implementation that terribly wrong?

Not having studies the used algorithms myself, I assumed the DA result to be correct and when using it as analysis assumed that flow [<] and anti [>] are anti-dependences (WAR).

llvm/lib/Analysis/DependenceAnalysis.cpp
1968–1970

This change reverses almost all directions in the test cases. Was the implementation that terribly wrong?

Not having studies the used algorithms myself, I assumed the DA result to be correct and when using it as analysis assumed that flow [<] and anti [>] are anti-dependences (WAR).

Hi Michael,

Was the implementation that terribly wrong?

No, but there is a pretty bad bug.

As for the assumption it is incorrect. In literature, when talking about two statements src and dst, [<] indicates src occurs before dst, and [>] indicates dst occurs before src. Which further implies the following statements:
flow [<] is RAW
anti [>] is RAW
flow [>] is WAR
anti [<] is WAR
This is corroborated by https://dl.acm.org/doi/pdf/10.1145/113446.113448 (see section 1.3)

Now this is also not just a case of LLVM differing from the academic community in its definitions. Let me demonstrate:

  1. In the first test of StrongSIV.ll (Note: I did not modify this test)
	;;  for (int i = 0; i < n; i++) {
	;;    A[i + 2] = i;
	;;    *B++ = A[i];

We get the following result from DA:

	Src:  store i32 %1, i32* %arrayidx, align 4 --> Dst:  %2 = load i32, i32* %arrayidx3, align 4
	  da analyze - consistent flow [2]!

Which shows that (+) positive distance in LLVM means src happens before dst

  1. From the following lines in DependenceAnalysis.cpp
	    if (Distance.sgt(0))
	      Result.DV[Level].Direction &= Dependence::DVEntry::LT;
	    else if (Distance.slt(0))
	      Result.DV[Level].Direction &= Dependence::DVEntry::GT;
	    else
	      Result.DV[Level].Direction &= Dependence::DVEntry::EQ;

We see that DA establishes the following relationship positive values -> [<], and negative values -> [>]. Combining this with (1), we see that LLVM's definition aligns with the literature's definition.

  1. Now let's take a look at one of the lit tests that were modified by my change, say exact3 from ExactSIV.ll
	;;  for (long unsigned i = 0; i <= 10; i++) {
	;;    A[6*i] = i;
	;;    *B++ = A[i + 60];
	
	Src:  store i32 %conv, i32* %arrayidx, align 4 --> Dst:  %0 = load i32, i32* %arrayidx1, align 4
	  da analyze - flow [>]!

Here we can clearly see that the load in dst (i=0) comes before the store in src(i=10), hence [>].

Hopefully this shows that there is a bug in Dependence Analysis.

Most of the changes in DA results look good to me, except a few I've noted. Perhaps those can be improved in the future, since this patch is making an improvement overall.

llvm/test/Analysis/DependenceAnalysis/Coupled.ll
253

this one was wrong before, and is still wrong but in a different way....same thing with couple14.

llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
19

The <= part is correct, but I'm not sure about the |< part. The dependence is loop carried for the most part (except when i = 9). I think this might be because we treat the existence of an EQ as signifying loop independent dependence, which should probably be reconsidered in a separate patch.

Than you for pointing this out. It would explain my confusion, as I would also would have expected the direction to follow the literature but the tests in ExactSIV.ll seemed to use the opposite direction.

I will need to review the UnrollAndJam legality check again since I may have (helped) writing it using my wrong assumption. Its tests do not fail with this change they might have been corrected by extensive testing.

artemrad updated this revision to Diff 340116.Apr 23 2021, 11:21 AM
artemrad marked 2 inline comments as done.

Fixed the style comments, and updated the academic reference for the algorithm to more closely match the implementation.

Meinersbur added inline comments.Apr 25 2021, 4:49 PM
llvm/test/Analysis/DependenceAnalysis/Coupled.ll
253

This is marked as done, but I don't see being it addressed.

llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
19

This is marked as done, but I don't see being it addressed.

artemrad marked 2 inline comments as not done.Apr 26 2021, 6:31 AM

Sorry @Meinersbur, I responded to the comments but forgot to submit them. I am a new to Phabricator. Here they are now.

llvm/test/Analysis/DependenceAnalysis/Coupled.ll
253

There is only one index in which we are accessing the same memory location i=3, so the [=] is correct.

As for the |<, please see my response to your other comment.

llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
19

You are correct, that is exactly the reason for |<. I have another patch down the line to fix this confusing notation.

Meinersbur accepted this revision.Apr 26 2021, 2:34 PM

I am approving this patch seeing that it is a significant improvement.

For exact0, you both seem to agree that it should be handled in a separate patch.

llvm/test/Analysis/DependenceAnalysis/Coupled.ll
554

Looks correct. Only intersection of accessed elements is when 3*i-18 = 18-i = i, which solves to i = 9.

llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
19

|< is printed when isLoopIndependent() returns true. If "loop independent" means the absence of loop-carried dependences, then |< is wrong; i=5..8 are loop-carried.

What is your improved notation?

This revision is now accepted and ready to land.Apr 26 2021, 2:34 PM