This is an archive of the discontinued LLVM Phabricator instance.

Add control dependence computation that computes control dependence equivalence classes.
AbandonedPublic

Authored by dberlin on Mar 23 2015, 3:45 PM.

Details

Reviewers
hfinkel
Summary

Unlike other implementations commonly found, this does not require either a post-dominator
tree or post-dominance frontiers.
It computes the relationship in linear time and space, and in fact, requires less time
that computing post-dominators!

If we wanted to, we could make this incrementally updated in O(sqrt(N) log n) time.

(This is currently used by newgvn to solve a problem GVN has of trying to figure out where it can propagate edge equivalences to)

Diff Detail

Event Timeline

dberlin updated this revision to Diff 22526.
dberlin retitled this revision from to Add control dependence computation that computes control dependence equivalence classes. Unlike other implementations commonly found, this does not require either a post-dominator tree or post-dominance frontiers. It computes the relationship in....
dberlin updated this object.
dberlin edited the test plan for this revision. (Show Details)
dberlin added reviewers: chandlerc, hfinkel.
dberlin added a subscriber: Unknown Object (MLST).
  • Sigh, here are the tests

Updating D8568: Add control dependence computation that computes control dependence equivalence classes.

Unlike other implementations commonly found, this does not require either a post-dominator
tree or post-dominance frontiers.
It computes the relationship in...

dberlin retitled this revision from Add control dependence computation that computes control dependence equivalence classes. Unlike other implementations commonly found, this does not require either a post-dominator tree or post-dominance frontiers. It computes the relationship in... to Add control dependence computation that computes control dependence equivalence classes. .Mar 23 2015, 3:47 PM
dberlin updated this object.
anemet added a subscriber: anemet.Mar 24 2015, 10:12 AM
anemet added inline comments.
test/Analysis/ControlDependence/testdiamond.ll
4–14

Hi Danny,

Quick question, is this control equivalence or control dependence? Without looking at the details, it looks more like control equiv to me. Can this analysis also provide control dependence?

Also probably a silly question, but aren't 'different' and 'same' control-equiv here? They both seem to be control-depended on start, no?

dberlin retitled this revision from Add control dependence computation that computes control dependence equivalence classes. to Add control dependence computation that computes control dependence equivalence classes..
dberlin updated this object.

Update documentation, handle capping backedges

Actually, I thought of a case that won't work with that formulation:

Given
int a, b;
int main(void)
{

do {
    do {
       b = a +1;
   }
        while (b);
} while (a);
return 0;

}

b = a+ 1 is control dependent on both the while test edges.
Both both loops are in the same CDEQ set (for the same reason. In fact,
simplifycfg will just make one single huge loop out of this. )

So when you have things that are control dependent on multiple nodes in
nested loops, you won't be able to determine both of the edges it's
dependent on, since the equivalence class is essentially the set that is
result of an equality query. So you can't recover CD directly from it.
You can recover info about CONDS from it, you just don' t know what the
answer is.

That is, CDEQ(w) is exactly the set of of things {for all blocks v in CFG |
CONDS(v) == CONDS(w)}

So you know that things in the same set must have the same CONDS, you just
don't know what that CONDS is.

However, you do know the answer to what CONDS is limited to the edges
between the CDEQ blocks, and the boundary edges of the CDEQ blocks (IE the
edges to a different equivalence class).

I would bet you could compute this in linear time from CDEQ.
From CONDS, you can get CD.

Though I admit, i'm curious what you would want the CD sets for.
Most things i've seen really want control-region info (so they can treat
all things equivalent as one large basic block) or things like SESE regions
(which can be computed in linear time from these sets) or the program
structure tree or whatever.
These are all computable directly from these sets.

Actually, I thought of a case that won't work with that formulation:

Given
int a, b;
int main(void)
{

do {
    do {
       b = a +1;
   }
        while (b);
} while (a);
return 0;

}

b = a+ 1 is control dependent on both the while test edges.
Both both loops are in the same CDEQ set (for the same reason. In fact,
simplifycfg will just make one single huge loop out of this. )

So when you have things that are control dependent on multiple nodes in
nested loops, you won't be able to determine both of the edges it's
dependent on, since the equivalence class is essentially the set that is
result of an equality query. So you can't recover CD directly from it.
You can recover info about CONDS from it, you just don' t know what the
answer is.

That is, CDEQ(w) is exactly the set of of things {for all blocks v in CFG |
CONDS(v) == CONDS(w)}

So you know that things in the same set must have the same CONDS, you just
don't know what that CONDS is.

However, you do know the answer to what CONDS is limited to the edges
between the CDEQ blocks, and the boundary edges of the CDEQ blocks (IE the
edges to a different equivalence class).

I would bet you could compute this in linear time from CDEQ.
From CONDS, you can get CD.

Though I admit, i'm curious what you would want the CD sets for.

It would be for Loop Distribution where a partition is made up of the transitive closure of the dependent (data and control) instructions of the original seeding instructions.

Thanks,
Adam

Most things i've seen really want control-region info (so they can treat
all things equivalent as one large basic block) or things like SESE regions
(which can be computed in linear time from these sets) or the program
structure tree or whatever.
These are all computable directly from these sets.

Ah. I wouldn't try to do this kind of thing outside of polly, which
uses SCoP's (which is essentially a variant of the progarm structure
tree).
At the point you are doing loop distribution, you'd be better off
eating the cost of post-dominators or something.

The reason, as i understand it, that post-dominators is disfavored, is
that nobody preserves/updates it so it stays slow :)
If we fixed that problem, you'd get control dependence without issue.

Note that control dependence equivalence is much easier to maintain if
we wanted to, separately, (it can be maintained in sqrt(n) log n per
edge insertion/deletion) time

Gerolf added a subscriber: Gerolf.Mar 30 2015, 3:54 PM

Hi Daniel,
Is avoiding pdom cost your major motivator for choosing the cycle equivalence/PST algorithm?
If the pdom issues were solved would you pick APT over PST?

No preference implied. I’m only curious about what motivated your choice

Also, is nGVN the only client you envision currently?

Thanks
Gerolf

chandlerc requested changes to this revision.Apr 14 2015, 7:40 AM
chandlerc edited edge metadata.

I can't comment on the algorithmic side, but am happy to review the basic code.

First comment would be that if this is really only providing equivalence classes, it should be named appropriately. Probably ControlDependenceEquivalence or some such. Do that last though so the in-line comments aren't lost.

include/llvm/Analysis/ControlDependence.h
10–13

You need three '/'s for the doxygen stuff.

60–65

Is 'unsigned' enough? (I can believe it is, I just want to surface the question.)

lib/Analysis/ControlDependence.cpp
22–25

Why are these jammed in here with headers and usings?

30

Weird whitspace...

123–135

This looks like it could be cleaned up with early-continue patterns.

201–202

Missing vertical whitespace.

261–262

Keep braces symmetric.

312–322

Variable naming: Hi1Iter

377–378

Again, missing whitespace.

391–403

This code seems completely duplicated with the above?

411

TODO what?

434–435

Missing whitespace.

This revision now requires changes to proceed.Apr 14 2015, 7:40 AM
dberlin added inline comments.Apr 20 2015, 12:03 PM
include/llvm/Analysis/ControlDependence.h
60–65

It should be.
We will never have more classes than basic blocks, and we already assume O(BasicBlocks) fits in unsigned elsewhere in the compiler.

lib/Analysis/ControlDependence.cpp
22–25

Fixed.

30

Fixed.

123–135

I'll take a looksee and see what can be done

201–202

fixed

261–262

fixed

312–322

Fixed these

377–378

Fixed

391–403

Yes, moved to a helper function

411

Fixed
(incomplete deletion, it used to be "make this O(1)")

434–435

Fixed

I'll also rename it to ControlDependenceEquivalence.

dberlin edited edge metadata.

Update for review comments (whitespace fixes, refactoring of helpers and main DFS walk).

chandlerc added inline comments.May 27 2015, 2:29 AM
include/llvm/Analysis/ControlDependenceEquivalence.h
1–14 ↗(On Diff #24056)

This header block needs lots of fixes...

32 ↗(On Diff #24056)

Need a line break here. You can omit the '\brief' now apparently, but I don't care either way.

96–101 ↗(On Diff #24056)

I suggest using include/llvm/ADT/iterator.h and iterator_facade_base to stub out much of the boiler plate here. Also, I would use 'BaseT' rather than 'super'.

109 ↗(On Diff #24056)

No need for the inline keyword.

248 ↗(On Diff #24056)

BlockData is *huge*. It doesn't belong in a densemap.

I think you want a DenseMap for the class numbers, and maybe some of the other *very* hot data, and a separate DenseMap<..., std::unique_ptr<BlockData>> for the rest.

lib/Analysis/ControlDependenceEquivalence.cpp
11 ↗(On Diff #24056)

Missing line break.

327 ↗(On Diff #24056)

'hi0iter'?

354–356 ↗(On Diff #24056)

Why is this interesting?

dberlin abandoned this revision.Jul 6 2016, 9:25 PM