This is an archive of the discontinued LLVM Phabricator instance.

[DA] Delinearization of fixed-size multi-dimensional arrays
ClosedPublic

Authored by bmahjour on Jan 3 2020, 1:14 PM.

Details

Summary

Currently the dependence analysis in LLVM is unable to compute accurate dependence vectors for multi-dimensional fixed size arrays. For example:

#define N 1024
#define M 2048
void foo(int a[N][M]) {                                                                                                                                                                                                                                                                                                                                                                                                               
  for (long i = 0; i < N-1; ++i)                                                                                                                                                                                      
    for (long j = 2; j < M; ++j)                                                                                                                                                                                      
      a[i][j] = a[i+1][j-2];

gives the following output for the dependence between the load of a[i+1][j-2] and the store into a[i][j]:

da analyze - anti [< >]!

While the direction vectors are correct, no dependence distances are computed, as we expect:

da analyze - anti [1 -2]!

This is mainly because the delinearization algorithm in scalar evolution relies on parametric terms to be present in the access functions. In the case of fixed size arrays such parametric terms are not present, but we can use the indexes from GEP instructions to recover the subscripts for each dimension of the arrays.

It appears that https://reviews.llvm.org/D35430 removed the capability to look at GEP instructions early in 2018. The justification for that change appears to be related to the concern over subscripts overlapping into next array dimensions in languages like C/C++. Please note that https://reviews.llvm.org/D62610 added a debug option to address this concern.

Removing support for fixed-size array delinearization resulted in over-pissimization of DependenceAnalysis and reduction of test coverage for both DA and LoopInterchange.

I've also noticed that polly tries to recover fixed-size array subscripts for dependence testing as well. In this patch I add support for fixed-size array delinearization (using the same mechanism that polly uses) under the option introduced in D62610. Two follow up revisions succeed this patch:

  1. In D73995 I've refactored the polly code and moved it to scalar-evolution to make it reuseable by both polly and DA.
  2. In D73998 I've renamed the option to reflect the fact that it controls more than just the delinearization validity checks.

Quite a few DA and loop interchange test cases improve as a result of this change, and it makes writing tests for new and existing transformations easier.

Diff Detail

Event Timeline

bmahjour created this revision.Jan 3 2020, 1:14 PM

Could people on the review list please take a look at this patch? I would also appreciate suggestions for who should be a reviewer instead (or in addition) to the current list of reviewers, if there are any.

fhahn added a comment.Feb 3 2020, 9:45 AM

I think it would be good to split this up into the 3 distinct parts you mention in the description:

  1. re-introduce the checks
  2. rename option
  3. improve logic
llvm/lib/Analysis/DependenceAnalysis.cpp
120

What is the rational for renaming the option? Enabling the option is unsafe and the new name makes it sound a lot more harmless than the original one. Also I think we should retain the wording in the description stating why it is unsafe.

llvm/test/Analysis/DependenceAnalysis/Separability.ll
2 ↗(On Diff #236114)

Not sure if it's the best idea to enable this option for a bunch of tests, as we miss test coverage for the code path actually enabled by default. I think the runs with -da-assume-inrange-subscripts should be an addition, rather than replacing the existing one.

bmahjour marked 2 inline comments as done.Feb 3 2020, 11:34 AM
bmahjour added inline comments.
llvm/lib/Analysis/DependenceAnalysis.cpp
120

I renamed it because the option is now being used to control more than just the delinearization validity checks. Now it also controls whether fixed size array delinearization happens or not. I can add the words of caution from the old description back.

llvm/test/Analysis/DependenceAnalysis/Separability.ll
2 ↗(On Diff #236114)

Unfortunately that will cause dual maintenance. IMHO the loss of coverage suffered as a result of disabling delinearization for the old tests are more concerning than missing coverage on inaccurate results of the current default. Having said that I see your point in that we may want to make sure the current default results don't get any more inaccurate than they already are.

I really don't see a point in dual maintaining the loop interchange tests though, given that the transform is off by default and there is no point in testing that it fails due to data dependence when there is no real data dependence. Agreed?

bmahjour updated this revision to Diff 242428.EditedFeb 4 2020, 2:21 PM

@fhahn As per your request I've reduced this patch to just adding support for delinearization of fixed-size arrays. Also restored the default tests and added new ones for when the option is enabled. Two more patches are created to 1. do the refactoring (D73995) 2. rename the option (D73998).

bmahjour edited the summary of this revision. (Show Details)Feb 4 2020, 2:42 PM
Meinersbur added inline comments.Feb 17 2020, 12:51 PM
llvm/include/llvm/Analysis/DependenceAnalysis.h
932–949

Could you add some more details that it tries to derive the dimensions from GEP and tryDelinearizeParametricSize tries to do delinearization? I mean, that's the reason why these functions are separate.

llvm/lib/Analysis/DependenceAnalysis.cpp
3276–3327

Please add a comment why this is bailing out here.

We could also check for GEP's inrange modifier and require it unless DisableDelinearizationChecks is set.

3311–3314

I think this has been split into D73995.

3354–3355

[style] LLVM style does not use almost-always-auto

3435
llvm/test/Analysis/DependenceAnalysis/Separability.ll
2 ↗(On Diff #236114)

I second @fhahn's remark. IMHO regression tests for something unsafe does not happen (here: delinearization without proven safety) are even more important.
Regarding maintenance: I too wish that we had something more maintainable than FileCheck.

llvm/test/Transforms/LoopInterchange/currentLimitation.ll
1–3

I think we should continue to also check that interchange does not happen without -da-disable-delinearization-checks. Could you add a separate RUN line with another -check-prefix=? This also applies to other test cases.

fhahn added a comment.Feb 17 2020, 1:29 PM

From the description

Removing support for fixed-size array delinearization resulted in over-pissimization of DependenceAnalysis and reduction of test coverage for both DA and LoopInterchange.

I'm not sure it over-pessimized DA. The change was required for correctness.

llvm/test/Analysis/DependenceAnalysis/Separability.ll
2 ↗(On Diff #236114)

I think it might also be beneficial to just extract the interesting cases into a separate test file, rather than adding just a few additional lines to the existing tests.

bmahjour marked an inline comment as done.Feb 19 2020, 1:06 PM
bmahjour added inline comments.
llvm/lib/Analysis/DependenceAnalysis.cpp
3276–3327

We could also check for GEP's inrange modifier and require it unless DisableDelinearizationChecks is set.

We could, however I'd suggest we consider that as a separate patch. There are some peculiarities with GEP's inrange that I need to understand better. In particular, it's not clear to me why the syntax allows inrange to appear before multiple indexes, but the getInRangeIndex() API only allows a single index to be retrieved.

If I understand this conversation https://reviews.llvm.org/D22793?id=65626#inline-194586 correctly, then we can only try to delinearize GEPs when
SrcGEP->getInRangeIndex().getValue() == SrcGEP->getNumIndices(). Is that true?

Meinersbur added a subscriber: pcc.Feb 19 2020, 1:46 PM
Meinersbur added inline comments.
llvm/lib/Analysis/DependenceAnalysis.cpp
3276–3327

We could, however I'd suggest we consider that as a separate patch.

Agreed. Just something to keep in mind when renaming -da-disable-delinearization-checks that might not totally disable non-linear GEPs unless set.

the getInRangeIndex() API only allows a single index to be retrieved.

Yes, and it contradicts the LLVM reference for inrange: https://llvm.org/docs/LangRef.html#id231 . It also seems to be allowed for constant expressions currently.

<result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
... if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange.

For reference, here is the inline comment from @pcc:

No more than one. The rationale is that inrange on a given index will be at least as restrictive as inrange on any earlier index, so there's no point in allowing more than one. I'll document this in the langref.

This seems not to have been documented. However, there are other motivations than @pcc had in mind (vtables) and some may allow non-inrange indices after an inrange one. Furthermore, I think it is problematic if we'd implicitly interpret all subscripts after an inrange subscript as well.

bmahjour updated this revision to Diff 246585.Feb 25 2020, 3:10 PM
bmahjour marked 11 inline comments as done.

Addressed all review comments.

Meinersbur accepted this revision.Feb 25 2020, 3:38 PM

LGTM

llvm/include/llvm/Analysis/DependenceAnalysis.h
930

Is checkSubscript supposed to be public now?

This revision is now accepted and ready to land.Feb 25 2020, 3:38 PM
bmahjour marked 2 inline comments as done.Feb 26 2020, 6:16 AM
bmahjour added inline comments.
llvm/include/llvm/Analysis/DependenceAnalysis.h
930

No, it is still private. The private keyword above was redundant, which is why I removed it.

bmahjour updated this revision to Diff 246777.Feb 26 2020, 10:15 AM
bmahjour marked an inline comment as done.

fix issues found by clang-tidy and clang-format.

Meinersbur added inline comments.Feb 26 2020, 9:13 PM
llvm/include/llvm/Analysis/DependenceAnalysis.h
930

OK, I see.

This revision was automatically updated to reflect the committed changes.