This is an archive of the discontinued LLVM Phabricator instance.

[LAA] Support pointer phis in loop by analyzing each incoming pointer.
ClosedPublic

Authored by fhahn on Apr 26 2021, 5:34 AM.

Details

Summary

SCEV does not look through non-header PHIs inside the loop. Such phis
can be analyzed by adding separate accesses for each incoming pointer
value.

This results in 2 more loops vectorized in SPEC2000/186.crafty and
avoids regressions when sinking instructions before vectorizing.

Diff Detail

Event Timeline

fhahn created this revision.Apr 26 2021, 5:34 AM
fhahn requested review of this revision.Apr 26 2021, 5:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2021, 5:34 AM
Meinersbur accepted this revision.Apr 26 2021, 2:00 PM

Makes sense to me.

Pre-merge bot could not apply the patch, possibly because you diffed it against a commit that is not upstream.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1947
2001
This revision is now accepted and ready to land.Apr 26 2021, 2:00 PM
fhahn updated this revision to Diff 340813.Apr 27 2021, 6:03 AM
fhahn marked 2 inline comments as done.

Rebased and replaced auto with Use

Makes sense to me.

Pre-merge bot could not apply the patch, possibly because you diffed it against a commit that is not upstream.

Yes when I uploaded the patch I did not yet commit the initial change for the tests. That should be done now in a950f66de25f :)

This revision was landed with ongoing or failed builds.Apr 28 2021, 12:20 PM
This revision was automatically updated to reflect the committed changes.

I noticed that loop-distribute started crashing with this patch, see
https://bugs.llvm.org/show_bug.cgi?id=50288

Do you need to check anything about the CFG before you just grab the PHI operand values? For example, if there's a loop, you could have a PHI that refers to another PHI in the same basic block, and I suspect that would mess up the analysis.

I noticed that loop-distribute started crashing with this patch, see
https://bugs.llvm.org/show_bug.cgi?id=50288

Do you need to check anything about the CFG before you just grab the PHI operand values? For example, if there's a loop, you could have a PHI that refers to another PHI in the same basic block, and I suspect that would mess up the analysis.

Thanks for the report. I think the issue here is slightly different: MemoryDepChecker does not collect the memory instructions for the translated pointers (the incoming values of the phi), but LoopDistribute tries to look them up. I put up D102266 which should address this.

Do you need to check anything about the CFG before you just grab the PHI operand values? For example, if there's a loop, you could have a PHI that refers to another PHI in the same basic block, and I suspect that would mess up the analysis.

Thanks for the report. I think the issue here is slightly different: MemoryDepChecker does not collect the memory instructions for the translated pointers (the incoming values of the phi), but LoopDistribute tries to look them up. I put up D102266 which should address this.

I wasn't commenting specifically in response to the crash report; just a general concern from reading the patch. Are you confident the issue I'm describing isn't a problem?

fhahn added a comment.Sep 3 2021, 7:43 AM

Do you need to check anything about the CFG before you just grab the PHI operand values? For example, if there's a loop, you could have a PHI that refers to another PHI in the same basic block, and I suspect that would mess up the analysis.

...

I wasn't commenting specifically in response to the crash report; just a general concern from reading the patch. Are you confident the issue I'm describing isn't a problem?

Oh right, sorry about that and the delay! I added such a test (phi uses another phi as incoming value in the same block as store_with_pointer_phi_in_same_bb_use_other_phi) in D102266. If the incoming value is another phi, then when won't be able to compute the bounds, due to the phi. We could traverse them further to handle more cases, but it should not be an issue for correctness IIUC. Does that make sense?