This is an archive of the discontinued LLVM Phabricator instance.

Handle non-unique edges in edge-dominance
ClosedPublic

Authored by anemet on May 25 2017, 9:39 PM.

Details

Summary

This removes a quadratic behavior in assert-enabled builds.

GVN propagates the equivalence from a condition into the blocks guarded by the
condition. E.g. for 'if (a == 7) { ... }', 'a' will be replaced in the block
with 7. It does this by replacing all the uses of 'a' that are dominated by
the true edge.

For a switch with N cases and U uses of the value, this will mean N * U calls
to 'dominates'. Asserting isSingleEdge in 'dominates' make this N^2 * U
because this function checks for the uniqueness of the edge. I.e. traverses
each edge between the SwitchInst's block and the cases.

The change removes the assert and makes 'dominates' works correctly in the
presence of non-unique edges.

This brings build time down by an order of magnitude for an input that has
~10k cases in a switch statement.

Diff Detail

Repository
rL LLVM

Event Timeline

anemet created this revision.May 25 2017, 9:39 PM
sanjoy added a subscriber: sanjoy.May 25 2017, 9:46 PM
davide edited edge metadata.May 25 2017, 9:47 PM

Can you provide such a test?

Can you provide such a test?

Do you mean in the testsuite as part of the patch?

Can you provide such a test?

Do you mean in the testsuite as part of the patch?

Adding that to the testsuite, if it's not there, would be great.
If you can attach it here (or point a location where I can fetch, I'd like to run it under a profiler :)

Adding that to the testsuite, if it's not there, would be great.
If you can attach it here (or point a location where I can fetch, I'd like to run it under a profiler :)

I can't really share the code as is but should be able to generate something that demonstrates the issue.

I don't think we want to add it to the LLVM tests though. I.e. what would be the failure mode?

Adding that to the testsuite, if it's not there, would be great.
If you can attach it here (or point a location where I can fetch, I'd like to run it under a profiler :)

I can't really share the code as is but should be able to generate something that demonstrates the issue.

Sure.

I don't think we want to add it to the LLVM tests though. I.e. what would be the failure mode?

Maybe not the llvm-test suite, but test-suite [this: https://github.com/llvm-mirror/test-suite] (as it's starting to include testcases to show compile time regressions).

dberlin edited edge metadata.May 26 2017, 8:14 AM

As I said last year, i believe, we should just remove this assert.
It doesn't help anything. The callers literally can't handle it any better if they want real dominance answers.
There isn't anywhere else in llvm we assert because "a thing may become quadratic if you do dumb things", and the assert itself is quadratic.

As I said last year, i believe, we should just remove this assert.
It doesn't help anything. The callers literally can't handle it any better if they want real dominance answers.

I thought GVN was a good counter example which bails early in the presence of duplicated edges: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/GVN.cpp#L1733

There isn't anywhere else in llvm we assert because "a thing may become quadratic if you do dumb things", and the assert itself is quadratic.

Is it the problem that dominates is quadratic or that it returns the wrong answer in the presence of duplicated edges?

As I said last year, i believe, we should just remove this assert.
It doesn't help anything. The callers literally can't handle it any better if they want real dominance answers.

I thought GVN was a good counter example which bails early in the presence of duplicated edges: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/GVN.cpp#L1733

This is not actually any more efficient than dominance would have answered it.
Both have to do precisely the same thing:
Look at the predecessors and see if any are the same :)
That has the same time bound no matter how you do it.
At least how LLVM has implemented "edge dominance", an edge does not dominate a block unless (basically) start dominates end and start, end is unique.
(i'm ignoring critical edges here for a moment).

The uniqueness test takes the same time no matter whether the caller does it or dominates does it.

There isn't anywhere else in llvm we assert because "a thing may become quadratic if you do dumb things", and the assert itself is quadratic.

Is it the problem that dominates is quadratic or that it returns the wrong answer in the presence of duplicated edges?

I believe it is now broken, but it was not before.
Before, it was only quadratic over multiple calls
IE i repeatedly query the same set.
Most callers only make one call.

It has always been quadratic in the critical edge case.

It is possible to make both edge dominance and critical edge dominance constant time by a variety of methods if we wanted.
A trivial one for edge dominance:

  1. Have the dom tree maintain a set of non-singular edges that it builds at construction time.

Return false if it's in the set.

another not so trivial:

  1. convert the multiple edges into the equivalent block form using virtual basic blocks.

Given LLVM defines edge dominance in a way that means non-unique edges never dominate their end, this is a waste of time.

For critical edges, there are also a number of ways by adding virtual links or blocks or ... to the dom tree.

Also note:
GVN deliberately does *not* compute dominance answers for the other single edge case.
(see propagateequality and isOnlyReachableViaThisEdge).

FWIW: Using edge dominance for all of this is also, IMHO, not a great idea, but i'm ignoring that.

@davide, I have an ll file for you. On my box, with an assert-enabled opt it gives:

$ time ./bin/opt -gvn many-cases.ll -o /dev/null

real 1m48.190s
user 1m47.337s
sys 0m0.447s

with the patch:

$ time ./bin/opt -gvn many-cases.ll -o /dev/null

real 0m0.556s
user 0m0.353s
sys 0m0.042s

anemet updated this revision to Diff 100466.May 26 2017, 1:31 PM

Removed the asserts. As Danny put it:

"Given LLVM defines edge dominance in a way that means non-unique edges never dominate their end, this is a waste of time."

In other words, there is no need for an assert here since it's not the case
that the answer would be wrong or would take more time to compute than for
unique edges. It's simply that the answer would always be non-dominance by
how domininance of edges is defined.

Also added a comment the explain the situation with non-unique edges.

anemet edited the summary of this revision. (Show Details)May 26 2017, 1:33 PM

Note: i haven't thought out whether the pred test it does in most places will actually give the right answer.
At one point, it tested whether the edge was unique and returned false, this got turned into the current assert.
You may have to add that back.

If that is too slow, we could add the non-singular edge set and invalidate/recompute it. It should suffice to invalidate it anywhere we invalidate the dfs numbers.
Then, like dominates calls updateDFSNumber if they are invalid and it hits 32 queries, these function could do something similar and call updateNonSingleEdges if they are queried too much.

It's worth noting that the (edge, phi use) case is theoretically wrong after your xhange, right now, but it may not matter.

The above code will now claim that dominates(edge, phi use) is "true" for *any* use from the same block, when there are multiple incoming edges to the phi.
that *would* definitely be wrong if we allowed something like
bb1:
switch x {
case 1 : goto bb2;
case 2 : goto bb2;
}
bb2:
phi([1, bb1], [2, bb1]).

Because right now it will claim the second edge dominates the first use, etc.
However, we only allow multiple same-block edges to a phi if the values are the same:
bb1:
switch x {
case 1 : goto bb2;
case 2 : goto bb2;
}
bb2:
phi([x, bb1], [x, bb1]).

So i'm not sure returning true will break anything.
It may :)

Removed the asserts. As Danny put it:

"Given LLVM defines edge dominance in a way that means non-unique edges never dominate their end, this is a waste of time."

In other words, there is no need for an assert here since it's not the case
that the answer would be wrong or would take more time to compute than for
unique edges. It's simply that the answer would always be non-dominance by
how domininance of edges is defined.

I'm not sure I agree with this. To be clear, say we have:

bb0:
  br i1 undef, label %bb1, label %bb1

bb1;
  ...

are you suggesting that dominates([bb0->bb1], bb1) will be false anyway, and there is no specific need to check isSingleEdge() at all? That does not seem to be the case -- I think DominatorTree::dominates([bb0->bb1], bb1) will return true. I think you need to change the loop in dominates over preds(End) to return false if the if (BB == Start) condition is taken more than once.

The external property this affects is cases like

bb0:
  br i1 %cond, label %bb1, label ...

bb1;
  ...

Today if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true, but with your change that will no longer hold.

Removed the asserts. As Danny put it:

"Given LLVM defines edge dominance in a way that means non-unique edges never dominate their end, this is a waste of time."

In other words, there is no need for an assert here since it's not the case
that the answer would be wrong or would take more time to compute than for
unique edges. It's simply that the answer would always be non-dominance by
how domininance of edges is defined.

I'm not sure I agree with this. To be clear, say we have:

bb0:
  br i1 undef, label %bb1, label %bb1

bb1;
  ...

are you suggesting that dominates([bb0->bb1], bb1) will be false anyway, and there is no specific need to check isSingleEdge() at all?

yes, the current callers will all do the equivalent of returning false.

See isReachableOnlyByThisEdge, and the GVN case pointed out earlier.

> That does not seem to be the case -- I think DominatorTree::dominates([bb0->bb1], bb1) will return true.

This is the part where i said the patch is likely broken.
The thing we did before the assert was to return false if !singleEdge.

Hence my comment that "i'm not sure the pred loop will check what we need to have the same behavior we used to"

Today if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true, but with your change that will no longer hold.

Today, any caller that asks if edge bb0->bb1 dominates some use of cond, it will assert :)

It will not return true ;)
Previous to the assert, we returned false.
This is why it says:
"// Assert that we have a single edge. We could handle them by simply
returning false. "

Now, you are right that there are situation we *could* return true, but we wouldn't :)

Today if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true, but with your change that will no longer hold.

Today, any caller that asks if edge bb0->bb1 dominates some use of cond, it will assert :)

I see what you mean -- since we're only changing behavior in the case we'd have failed an assert before (i.e. before this change), the behavior change is correct by definition.

Now, you are right that there are situation we *could* return true, but we wouldn't :)

We wouldn't return true today (i.e. without this patch), but we would return true once this patch is applied. I was trying to argue that it makes more sense to return false for non-unique edges, since it preserves the "if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true" reasoning. The reasoning holds today on all of the cases where the antecedent, "if the bb0 -> bb1 edge dominates some use of %cond", is valid (i.e. does not assert). With this change, we will make the antecedent valid in some cases where that implication won't hold, which is what I'm suggesting we avoid.

I certainly agree that if we're not returning false for the non-unique edge case then that will cause bugs later on.

Can I add unit-tests for edge-domination somehow? I am trying to test this with allowing non-unique edges in GVN but that won't fly as regression test.

I certainly agree that if we're not returning false for the non-unique edge case then that will cause bugs later on.

Can I add unit-tests for edge-domination somehow? I am trying to test this with allowing non-unique edges in GVN but that won't fly as regression test.

I'd just go with a regular C++ test case in unittests/

dberlin added a comment.EditedMay 26 2017, 3:17 PM

Today if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true, but with your change that will no longer hold.

Today, any caller that asks if edge bb0->bb1 dominates some use of cond, it will assert :)

I see what you mean -- since we're only changing behavior in the case we'd have failed an assert before (i.e. before this change), the behavior change is correct by definition.

Now, you are right that there are situation we *could* return true, but we wouldn't :)

We wouldn't return true today (i.e. without this patch), but we would return true once this patch is applied. I was trying to argue that it makes more sense to return false for non-unique edges, since it preserves the "if the bb0 -> bb1 edge dominates some use of %cond then said use can be replaced with i1 true" reasoning. The reasoning holds today on all of the cases where the antecedent, "if the bb0 -> bb1 edge dominates some use of %cond", is valid (i.e. does not assert). With this change, we will make the antecedent valid in some cases where that implication won't hold, which is what I'm suggesting we avoid.

Oh yeah, i think we are all in violent agreement about the latter. We need to turn it back into a check that returns false, one way or the other.

I certainly agree that if we're not returning false for the non-unique edge case then that will cause bugs later on.

Can I add unit-tests for edge-domination somehow? I am trying to test this with allowing non-unique edges in GVN but that won't fly as regression test.

I'd just go with a regular C++ test case in unittests/

OK, will do. Thanks, guys!

anemet updated this revision to Diff 100807.May 30 2017, 5:36 PM

This version handles edge-dominance in the presence of non-unique edges.

anemet retitled this revision from Remove a quadratic behavior in assert-enabled builds to Handle non-unique edges in edge-dominance.May 30 2017, 5:37 PM
anemet edited the summary of this revision. (Show Details)
anemet added a reviewer: sanjoy.
sanjoy accepted this revision.Jun 4 2017, 9:32 PM

lgtm

unittests/IR/DominatorTreeTest.cpp
302 ↗(On Diff #100807)

Any reason why these can't be EXPECT_TRUE and EXPECT_FALSE?

This revision is now accepted and ready to land.Jun 4 2017, 9:32 PM
anemet marked an inline comment as done.Jun 5 2017, 9:21 AM
This revision was automatically updated to reflect the committed changes.