This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Follow uses to create value MemoryAccesses
ClosedPublic

Authored by Meinersbur on Dec 21 2015, 5:24 PM.

Details

Summary

The previously implemented approach is to follow value definitions and create write accesses ("push defs") while searching for uses. This requires the same relatively validity- and requirement conditions to be replicated at multiple locations (PHI instructions, other instructions, uses by PHIs).

We replace this by iterating over the uses in a SCoP ("pull in requirements"), and add writes only when at least one read has been added. It turns out to be simpler code because each use is only iterated over once and writes are added for the first access that reads it. We need another iteration to identify escaping values (uses not in the SCoP), which also makes the difference between such accesses more obvious.

Diff Detail

Repository
rL LLVM

Event Timeline

Meinersbur updated this revision to Diff 43422.Dec 21 2015, 5:24 PM
Meinersbur retitled this revision from to [Polly] Follow uses to create value MemoryAccesses.
Meinersbur updated this object.
Meinersbur added reviewers: grosser, jdoerfert.
Meinersbur added a project: Restricted Project.
Meinersbur added subscribers: llvm-commits, pollydev.

+; TODO: Because of the phi with 3 incoming blocks, the loop for => outofscop is
+; not recognized as a loop, but as a non-affine subregion.

My guess would be this is because ScalarEvolution cannot create an
AddRec for %j. PHIs with multiple incoming blocks are generally not a
problem.

+
+define void @func(i32 %n, double* nonnull noalias %A) {
+entry:
+ br label %for
+
+for:
+ %phi = phi double [0.0, %entry], [0.0, %inc], [%phi, %outofscop]
+ %j = phi i32 [0, %entry], [%j.inc, %inc], [0, %outofscop]
+ %j.cmp = icmp slt i32 %j, %n
+ br i1 %j.cmp, label %body, label %outofscop
+
+body:
+ store double 4.2, double* %A
+ br label %inc
+
+inc:
+ %j.inc = add nuw nsw i32 %j, 1
+ br label %for
+
+outofscop:
+ fence seq_cst
+ br i1 true, label %return, label %for
+
+return:
+ ret void
+}

+; TODO: Because of the phi with 3 incoming blocks, the loop for => outofscop is
+; not recognized as a loop, but as a non-affine subregion.

My guess would be this is because ScalarEvolution cannot create an
AddRec for %j. PHIs with multiple incoming blocks are generally not a
problem.

The loop is correctly identified without the %outofscop block. From the region's standpoint, is also entering the region with the same incoming value. %outofscop adds another surrounding loop.

I think I will look at it again and commit this test case separately as it is not specific to this patch; It was helpful for finding a bug, though;

Meinersbur updated this revision to Diff 46183.Jan 27 2016, 3:03 PM

Rebase to r258998

grosser edited edge metadata.Jan 27 2016, 3:16 PM

This rebase seems broken. It contains three patches of which two are already committed.

Meinersbur updated this revision to Diff 46258.Jan 28 2016, 4:22 AM
Meinersbur edited edge metadata.

Somehow r258947 got mixed-in in the previous diff; rebase to r259041

Hi Michael,

this patch looks like a nice improvement. Also, I like that the patch is very targeted only changing the order a couple of memory writes. I only have some minor questions:

From the commit message:

Because it does not catch all uses (e.g. those by PHIs), some code adds more scalar reads write on the fly, with relatively conditions and possible creation of MemoryAccesses that are not required.

Is this still correct? As far as I see, this patch does not reduce the number of memory accesses modeled. Maybe this functionality was already committed with the previous patch?

Also, I do not yet fully understand why we add new accesses in "test/ScopInfo/invariant-loads-leave-read-only-statements.ll". Was this what the sentence above was about? Could you maybe point out again the reason why we need to add more load accesses.
My feeling was that this is an [NFC] patch when ignoring the order of the generated memory accesses.

lib/Analysis/ScopInfo.cpp
3872 ↗(On Diff #46258)

What about GlobalVariable constants. I assume we want to keep accesses to them, no?

Meinersbur updated this revision to Diff 46540.Feb 1 2016, 8:21 AM
Meinersbur marked an inline comment as done.

Filter ScopRIL for definitions, not uses

Meinersbur updated this object.Feb 1 2016, 8:23 AM

From the commit message:

Because it does not catch all uses (e.g. those by PHIs), some code adds more scalar reads write on the fly, with relatively conditions and possible creation of MemoryAccesses that are not required.

s/relatively conditions/relatively complicated conditions

Is this still correct? As far as I see, this patch does not reduce the number of memory accesses modeled. Maybe this functionality was already committed with the previous patch?

This describes the situation in current trunk.

Also, I do not yet fully understand why we add new accesses in "test/ScopInfo/invariant-loads-leave-read-only-statements.ll". Could you maybe point out again the reason why we need to add more load accesses.

This is something I forgot. Before the ScopRIL.count was tested on definitions (always LoadInsts), after with this it actually iterates over uses. I moved the ScopRIL.count to ensureValueLoad so it also tests whether an access is not required because is it will be hoisted.

This was also the reason behing the TODO in invariant_load_scalar_escape_alloca_sharing.ll, because the ScopRIL.count was checked only after buildPHIAccesses.

My feeling was that this is an [NFC] patch when ignoring the order of the generated memory accesses.

Hi,

thanks for the fast turnaround.

I understood pretty much what you mean, but am still not 100% about this sentence:

possible creation of MemoryAccesses that are not required

Are you saying we could possibly create unnecessary memory accesses in trunk. This commit message makes the impression we now do not generate such redundant accesses, but looking at the test cases the number of accesses does not change. Now, I wonder: does this commit potentially change the number of memory accesses generated? In case it does, do we lack test coverage to see this?

Otherwise, there is just one typo that you introduced with the last update.

lib/Analysis/ScopInfo.cpp
3922 ↗(On Diff #46540)

ConstantFP twice?

Meinersbur updated this revision to Diff 46904.Feb 4 2016, 6:02 AM

Refined constant filter condition (Filter out UndefValue, but also all kinds of structured literal constants). GlobalVariable seems to be the only 'changeable' constant. Not sure about GlobalAlias.

Meinersbur updated this object.Feb 4 2016, 6:03 AM
grosser accepted this revision.Feb 4 2016, 6:07 AM
grosser edited edge metadata.

LGTM

This revision is now accepted and ready to land.Feb 4 2016, 6:07 AM
This revision was automatically updated to reflect the committed changes.