Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Avoid migrating existing patches. Phabricator shutdown timeline

[LAA] Add Memory dependence remarks.

Authored by malharJ on Aug 19 2021, 6:11 AM.



Adds new optimization remarks when vectorization fails.

More specifically, new remarks are added for following 4 cases:

  • Backward dependency
  • Backward dependency that prevents Store-to-load forwarding
  • Forward dependency that prevents Store-to-load forwarding
  • Unknown dependency

It is important to note that only one of the sources
of failures (to vectorize) is reported by the remarks.
This source of failure may not be first in program order.

A regression test has been added to test the following cases:

a) Loop can be vectorized: No optimization remark is emitted
b) Loop can not be vectorized: In this case an optimization
remark will be emitted for one source of failure.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sdesmalen added inline comments.Nov 10 2021, 5:32 AM

Can there only be a single unsafe/unknown dependence? Or can there be more?


unnecessary change.


These changes here obfuscate the report that's generated by LAA when running loop(print-access-info). The information printed was:

Loop access info in function 'store_with_pointer_phi_incoming_phi':
 The compiler can't determine the cause of the issue.
          %v8 = load double, double* %arrayidx, align 8 ->
          store double %mul16, double* %ptr.2, align 8

The information that is added by the remark is:

Loop access info in function 'store_with_pointer_phi_incoming_phi':
    Report: unsafe dependent memory operations in loop. Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
Unknown data dependence. Memory location is the same as accessed at <UNKNOWN LOCATION>. 
 The compiler can't determine the cause of the issue.
          %v8 = load double, double* %arrayidx, align 8 ->  
          store double %mul16, double* %ptr.2, align 8

The extra information here isn't particularly useful, especially because it doesn't have any line information. Perhaps the code should disable the extra info if the location is unknown/unavailable.


nit: Please remove the newline.


Instead of using + to concatenate two strings, let this be handled by raw_ostream using << .


Please CHECK-NEXT for the full line that is printed.

This revision now requires changes to proceed.Nov 10 2021, 5:32 AM
malharJ updated this revision to Diff 387087.Nov 14 2021, 7:44 AM
malharJ marked 6 inline comments as done.
  • Rebased patch onto main
  • Updated code to avoid printing <unknown> when no debug info is present
  • minor formatting updates

thanks for pointing it out. Removed it.


There can be more.

But we are only emitting (as a remark) the first one found.


Ok, I agree but there's an issue here ..

LoopAccessInfo::recordAnalysis() cant be called from within the function ( AccessAnalysis::canCheckPtrAtRT() ).
as it's not a member of AccessAnalysis class.

Perhaps one way to do it would be for AccessAnalysis::canCheckPtrAtRT() to accept a parameter
by reference and then we can use the value after the call as input to recordAnalysis() here.
But I'm afraid this approach is not much less verbose than current approach pf having UncomputablePtr as a member variable.


Thanks for pointing this out. Done.


can we keep this newline ?

The current text (Note: it was not introduced by this patch) is too long already:

unsafe dependent memory operations in loop. Use "#pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop



But now the code will only print the remaining text: "Memory location is the same as ...etc"
when debug info is available.

so this is no longer an issue (since this test does not contain any debug info/metadata).

sdesmalen added inline comments.Nov 24 2021, 9:16 AM

UncomputablePtr doesn't really need a separate variable in AccessAnalysis, since it's only set by one function, and used directly by the function that calls it.

You can change canCheckPtrAtRT to take Value **UncomputablePtr = nullptr, and in canCheckPtrAtRT assign the value if the value is requested:

if (UncomputablePtr)
  *UncomputablePtr = Access.getPointer();

Where you call it, you can have:

Value *UncomputablePtr = nullptr;
bool CanDoRTIfNeeded =
    Accesses.canCheckPtrAtRT(*PtrRtChecking, PSE->getSE(), TheLoop,
                             SymbolicStrides, false, &UncomputablePtr);

if (!CanDoRTIfNeeded) {
  if (auto *I = dyn_cast_or_null<Instruction>(UncomputablePtr))
    recordAnalysis("UnknownArrayBounds", I)

Can you split this change out into a separate patch with a test that demonstrates the change?


It shows here that the dependences are already collected. You can iterate Dependences to find the dependence that isn't safe, so that you don't have to add UnsafeDependence and maintain that separately.


I guess this can be more briefly written as:

Loc = I->getDebugLoc();
if (auto *PtrI = dyn_cast_or_null<Instruction>(getPointerOperand(I)))
  Loc = PtrI->getDebugLoc();

(at which point it probably doesn't require a separate function anymore, especially since it has only one use).


This seems like a case that should be added to getPointerOperand ?


It now passes in I as the instruction, but the debug location of the remark seems unchanged in the test. What value is this adding?


Just leave the name as CantIdentifyArrayBounds ?

RKSimon resigned from this revision.Nov 29 2021, 2:50 AM

Sorry this is not an area I known enough about to review

fhahn added inline comments.Nov 29 2021, 3:20 AM

Does this need to be a shared_ptr? If you want to encode the fact that it may not be set, using Optional may be a better choice. Or you could initialize it to NoDep in case there is no unsafe dependence.

malharJ updated this revision to Diff 394825.EditedDec 16 2021, 4:42 AM
malharJ marked 4 inline comments as done.
  • Moved the changes for testing unknown array bounds to new patch: D115873
  • addressed review comments
malharJ retitled this revision from [LAA] Add Memory dependence and unknown bounds remarks. to [LAA] Add Memory dependence remarks..Dec 16 2021, 4:58 AM
malharJ edited the summary of this revision. (Show Details)
malharJ marked an inline comment as done.Dec 16 2021, 5:19 AM
malharJ added inline comments.

I've removed it now based on review comment by sdesmalen, so this is no longer an issue.


That's a good point, thanks !


I've removed this function and inlined the code since there is only one usage.


This was essentially an issue in the original test,
The debug info (!35) used by the loop was incorrect.
I've changed it to use !32 now.

I have corrected the issue in my new patch:

malharJ updated this revision to Diff 394853.EditedDec 16 2021, 6:42 AM
malharJ marked an inline comment as done.
  • rebased patch

Premerge îs failing due to (a possible bug introduced by) recent changes:

malharJ updated this revision to Diff 394998.Dec 16 2021, 2:24 PM
  • Updated unit test
fhahn added inline comments.Dec 28 2021, 9:53 AM

This change is unnecessary, keep the include in the .cpp


It seems like it would make things easier to read if the logic would be in a separate function, with proper documentation what it is supposed to do?


nit: variable names here start with upper cases


nit: dependence instead of dependency, to be in line with the terminology used elsewhere in the file?


Is this true? Unless I miss something, this emits a remark for the first unsafe dependence?


For remarks, it might be good to check the full yaml generated, as in llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll


nit: check here not needed?


nit: easier to read if this is the last block


Is this needed? Same for other tests.


is this needed?


can the debug lock be trimmed down a bit?

malharJ updated this revision to Diff 396853.Jan 1 2022, 4:54 AM
malharJ marked 8 inline comments as done.
  • minor formatting updates
  • moved remark emission code for unsafe mem dependences into a function
  • got rid of some more debuginfo in the LIT test.

you are correct, the comment is no longer correct. I've updated it now in the new function.


Can you please explain why this is not needed ?


thanks. removed it now. changed the type of %n from i32 to i64.



  • Changed functions to use same parameters (and same ordering). This helped reduce !DISubroutineType metadata.
  • Also reduced some !DILocation (that were not required) from test_forwardButPreventsForwarding_dep()
malharJ updated this revision to Diff 397552.Jan 5 2022, 6:09 AM
malharJ marked an inline comment as done.
  • Added YAML to the LIT test

Hi @malharJ thanks for splitting up and simplifying this patch, this is an improvement. I left a few more comments, and also found that the remark itself needs a bit of work to separate it from the original message. Maybe you can also update the tests to check the full context, to ensure this no longer happens?


Can this be an assert?


Can this be an assert?


nit: please move this code below the switch


This dyn_cast will cause a segfault if the value returned by getPointerOperand is a nullptr. This needs to be dyn_cast_or_null. It would be good to have a test for this case.

Also, there is no test for the MemSetInst either. Can you add one?


When I try this out with Clang, I see:

Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loopBackward loop carried data dependence

The issue here is the loopBackward (no period and space between remarks)

malharJ updated this revision to Diff 403808.Jan 27 2022, 2:38 PM
malharJ marked 3 inline comments as done.
  • Rebased patch.
  • Addressed review comments.

I guess not ...

Looking at the definition of MemoryDepChecker::getDependences(),
it can return a nullptr if RecordDependences is false.
and that happens when number of dependences exceeds MaxDependences.
Given that max-dependences is a command line option, it could have any value.


I dont think so ...

We will still need the check to see if we can find a dependency type that is unsafe for vectorization.

Just the presence of Deps is not sufficient to say above will be satisfied, because it can contain
Forward or Backwardvectorizable dependencies too (but we dont care about emitting any remark for them as they can be vectorized)


Ok, I've changed it to dyn_cast_or_null (I think you had already mentioned it an earlier review comment but I missed it).
Can we please avoid adding more tests, from what I understand it may not be very crucial to test getPointerOperand()
as we just avoid emitting debug location information if it returns null pointer.

Also, I tried writing a test case for memset case and I found that it doesnt even make sense to have a case here for memset.
Apparently LAA does not consider any instruction that is not a load/store
I see this code in LoopAccessInfo::analyzeLoop()

if (!St) {
  recordAnalysis("CantVectorizeInstruction", St)
      << "instruction cannot be vectorized";
  HasComplexMemInst = true;

So it looks like a different remark is emitted and an early exit is taken because memset is considered as a "complex" memory instruction.

considering the above, I have removed the memset part from the patch.


I'm not very clear on the issue here .. they are already printed on separate lines ?

Regardless I've updated the tests to print the entire message.
(The reason I was avoiding it earlier was because it was too long a message, and didn't appeal to me as part of the remark.
However, it was not added by my patch so I will not be changing it.)

malharJ updated this revision to Diff 404004.EditedJan 28 2022, 6:29 AM
  • minor update to commit message to trigger phabricator builds.

Build has failed but test failures are unrelated to this patch.

sdesmalen accepted this revision.Feb 1 2022, 6:08 AM

LGTM with the nits addressed.


nit: can be removed if you address my comment below.



if (!Deps || 
    llvm::find_if(Deps, [](const MemoryDepChecker::Dependence &D) { .. }) == Deps->end())

nit: this is more concisely written as:

if (Instruction *I = Dep.getSource(*this)) {
  DebugLoc SourceLoc = I->getDebugLoc();
  if (auto *DD = dyn_cast_or_null<Instruction>(getPointerOperand(I)))
    SourceLoc = DD->getDebugLoc();
  R << " Memory location is the same as accessed at "
    << ore::NV("Location", SourceLoc);
This revision is now accepted and ready to land.Feb 1 2022, 6:08 AM
This revision was landed with ongoing or failed builds.Feb 2 2022, 4:08 AM
This revision was automatically updated to reflect the committed changes.