Page MenuHomePhabricator

[Polly][FIX] Adjust execution context of hoisted loads wrt. error domains
ClosedPublic

Authored by jdoerfert on Apr 6 2016, 5:06 AM.

Details

Summary
      If we build the domains for error blocks and later remove them we lose
      the information that they are not executed. Thus, in the SCoP it looks
      like the control will always reach the statement S:
····
                for (i = 0 ... N)
                    if (*valid == 0)
                      doSth(&ptr);
              S:    A[i] = *ptr;
····
      Consequently, we would have assumed "ptr" to be always accessed and
      preloaded it unconditionally. However, only if "*valid != 0" we would
      execute the optimized version of the SCoP. Nevertheless, we would have
      hoisted and accessed "ptr"regardless of "*valid". This changes the
      semantic of the program as the value of "*valid" can cause a change of
      "ptr" and control if it is executed or not.
····
      To fix this problem we adjust the execution context of hoisted loads
      wrt. error domains. To this end we introduce an ErrorDomainCtxMap that
      maps each basic block to the error context under which it might be
      executed. Thus, to the context under which it is executed but an error
      block would have been executed to. To fill this map one traversal of
      the blocks in the SCoP suffices. During this traversal we do also
      "remove" error statements and those that are only reachable via error
      statements. This was previously done by the removeErrorBlockDomains
      function which is therefor not needed anymore.

Diff Detail

Repository
rL LLVM

Event Timeline

jdoerfert updated this revision to Diff 52778.Apr 6 2016, 5:06 AM
jdoerfert retitled this revision from to [Polly][FIX] Adjust execution context of hoisted loads wrt. error domains.
jdoerfert updated this object.
jdoerfert added reviewers: grosser, Meinersbur.
jdoerfert added a subscriber: Restricted Project.
Meinersbur edited edge metadata.Apr 6 2016, 8:37 AM

The summary does not mention this is supposed to fix PR26683 or what was wrong with hoisted loads before.
There's a typo: s/therefor/therefore

include/polly/ScopInfo.h
1320 ↗(On Diff #52778)

Could you explain somewhere what an 'error context' is?

1427–1441 ↗(On Diff #52778)

Maybe this and the implementation could be moved to where removeErrorBlockDomains was to make clear it's a replacement.

lib/Analysis/ScopInfo.cpp
2178–2185 ↗(On Diff #52778)

The comment still applies for the most part, doesn't it?

2212–2223 ↗(On Diff #52778)

I have not better immediate solution but would like to remark the following:

  • This pattern occurs multiple times (buildDomainsWithBranchConstraints, propagateDomainConstraints, propagateErrorConstraints) so it might make sense to wrap it.
  • The comment of ReversePostOrderTraversal mentions it is expensive to create
  • Why does it need to be recursive over nested regions, instead non-recursive over BB reverse post-order?
2242 ↗(On Diff #52778)

else

3070–3071 ↗(On Diff #52778)

Isn't it enough just use the union of all error domains (or the context) instead per-stmt?

That domain could contain the hoisted value itself, but which could be just outprojected.

test/ScopInfo/error-blocks-1.ll
10–20 ↗(On Diff #52778)

Because there is no load hoisting, what dos this patch change in this test case?

test/ScopInfo/error-blocks-2.ll
9 ↗(On Diff #52778)

Are the invariant accesses ordered? Is the previous invariant access for MemRef_valid guaranteed to be executed before this and to use valid_val ?

The summary does not mention this is supposed to fix PR26683 or what was wrong with hoisted loads before.

It does and the code shown there is a test case here. I can mention it.

There's a typo: s/therefor/therefore

Thx.

include/polly/ScopInfo.h
1320 ↗(On Diff #52778)

Sure.

1427–1441 ↗(On Diff #52778)

If you think that is better, sure.

lib/Analysis/ScopInfo.cpp
2178–2185 ↗(On Diff #52778)

I will recycle some parts.

2212–2223 ↗(On Diff #52778)

All good points which I partially addresses a while ago but should have mentioned somehwere more prominent when we introduced the new domain generation. Here is my take but we might want to put that somehwere [either as comment or code]:

  1. To wrap the pattern is something I would like to see but I do not have a good solution for (different recursive functions and signatures). I will gladly review and support that though.
  1. I saw the comment early on and my take was that we could change the implementation of that algorithm if it was actually "slow". So far I did not (look for or) observe any slowdown due to the RPO and the first way to counter it would be to share/reuse/cache the same RTraversal for the different regions. Though, due to the little number of regions this might not change much.
  1. We can either build a RPO traversal for the whole CFG or a Region which can contain SubRegions as nodes. There was an initial assumption that a SCoP does not span enough of the CFG to make it worthwile to build a traversal for the whole thing, thus we went with the latter. Since it operates on the RegionTree not on the CFG we should not pay more due to the recursive nature of the traversal. In fact we should pay only for what we need. That said, I do not know if it will make sense to go with a function RPO traversal at some point (if we have a function pass maybe) or not.
2242 ↗(On Diff #52778)

I did this on purpose but I can do else too.

3070–3071 ↗(On Diff #52778)

Isn't it enough just use the union of all error domains (or the context) instead per-stmt?

That would be an overapproximation that would basically make error statements and invariant loads "exclusive" again. Thus, the domain of a error-block could not depend on an invariant load, now it can.

That domain could contain the hoisted value itself, but which could be just outprojected.

I do not get this, sorry.

test/ScopInfo/error-blocks-1.ll
10–20 ↗(On Diff #52778)

This is a test for the removal of error statements that was poorly covered before.

test/ScopInfo/error-blocks-2.ll
9 ↗(On Diff #52778)

Are the invariant accesses ordered?

Not anymore. That did not work out well ... The code generation will "order" them on demand.

grosser edited edge metadata.Apr 7 2016, 3:43 AM

Hi Johannes,

thanks for this patch. It seems you guys already discussed this topic a lot and you seem to converge on the technical review. This is great. I will look at the updated version than once again.

Regarding this patch, it would be good to describe briefly in the commit message the original issue (at best with an example), the solution, and the reason alternatives have not been chosen. With this information one can understand this patch without having to dig through the previous discussion. Most likely you can just take some of the text that was in the bug report to discuss this.

One alternative that I learned about in the bug report was to revert an earlier patch (which svn id?). This would have fixed this bug here, but would have caused unnecessarily complicated domains. Right? Can we give a motivation why this would not be the case with the current approach?

s/priviously/previously/g

lib/Analysis/ScopInfo.cpp
3060 ↗(On Diff #52778)

Add __isl_give here as well.

test/ScopInfo/error-blocks-1.ll
10–20 ↗(On Diff #52778)

If this test case is not changed / affected by this commit, I would suggest to commit it ahead of time.

jdoerfert updated this revision to Diff 52900.Apr 7 2016, 4:53 AM
jdoerfert marked 8 inline comments as done.
jdoerfert edited edge metadata.

Changed according to comments

jdoerfert updated this object.Apr 7 2016, 4:54 AM
Meinersbur added inline comments.Apr 7 2016, 4:54 AM
lib/Analysis/ScopInfo.cpp
2242 ↗(On Diff #52778)

Please do; I was looking for a leak for the case IsErrorBlock == true when I noticed that it's the negated condition.

3070–3071 ↗(On Diff #52778)

That would be an overapproximation that would basically make error statements and invariant loads "exclusive"
again. Thus, the domain of a error-block could not depend on an invariant load, now it can.

OK, I see. Any ErrorDomainCtxMap is not only valid in the scop context, but even required for checking the context.

But in case of "union of all error domains", the validity check is a conjunction of all error domains. If one of them is is true, the other ones can be just undef and not influence the outcome.

At a second though, I think we'd have a problem with the evaluation order. One error domain needs to be checked first and might be only allowed to be dereferenced depending on another error domain.

Meinersbur accepted this revision.Apr 7 2016, 4:54 AM
Meinersbur edited edge metadata.
This revision is now accepted and ready to land.Apr 7 2016, 4:54 AM

Thanks Johannes for the improved commit message. This LGTM now.

Ah, and please reference the bug report id in your commit message.

This revision was automatically updated to reflect the committed changes.