This is an archive of the discontinued LLVM Phabricator instance.

[Dominators] Add CFGBuilder testing utility
ClosedPublic

Authored by kuhar on Jun 28 2017, 6:30 PM.

Details

Summary

This patch introduces a new testing utility for building and modifying CFG -- CFGBuilder. The primary use case for the utility is testing the upcoming incremental dominator tree update API.

The current design provides a simple mechanism of constructing arbitrary graphs and then applying series of updates to them. CFGBuilder takes care of creating empty functions, connecting and disconnecting basic blocks. Under the hood it uses SwitchInst and UnreachableInst.

It will be also possible to create a thin wrapper over CFGBuilder for parsing string input and to hook it up to other textual tools (e.g. opt used with FileCheck).

Diff Detail

Repository
rL LLVM

Event Timeline

kuhar created this revision.Jun 28 2017, 6:30 PM
davide added a subscriber: davide.Jun 28 2017, 7:35 PM
dblaikie edited edge metadata.Jun 29 2017, 3:05 PM

Ah, hmm - I was assuming this was an API for building the CFG data structures. But I see it's an API for building IR that represents particular CFGs (so you can then test the CFG analysis that builds the data structures, etc, I assume?).

How much more terse is this than the hand-written IR? (I assume that's the motivation/benefit?) How much does that help - perhaps you could show (not in anything that needs to be committed) an example of the textual IR for some of the tests, to demonstrate how much more terse it is? But I guess it's obvious enough perhaps from the tabular update array format that it is quite compact.

unittests/IR/CFGBuilder.cpp
30 ↗(On Diff #104572)

"= default", and maybe a comment about why this is out of line (I'm assuming it's out of line/not implicit so that the header doesn't need to include the Module header, etc and can use forward declarations instead)

44 ↗(On Diff #104572)

I think you can rephrase

From->getParent()->getParent()->getContext()

to

From->getContext()

(also BB->getParent()->getParent() is equivalent to BB->getModule() I think?)

unittests/IR/CFGBuilder.h
59–61 ↗(On Diff #104572)

Generally operator overloads that can be non-members should be (ensures equal conversion handling for the operands/arguments)

60 ↗(On Diff #104572)

probably use std::tie rather than std::make_pair, so that copies aren't needed.

66 ↗(On Diff #104572)

It feels like the type name may be better to have a more meaningful name - rather than or in addition to the variable name (Operation? "Operation Op" ?)

67 ↗(On Diff #104572)

Variable and type names being the same can be tricky - but I realize sometimes it's the tidiest option. *shrug* :)

70–71 ↗(On Diff #104572)

Not sure if the aliases aid in readability (especially using the word 'list' in the names when they are vectors may given a mistaken impression about the performance of certain operations) - consider using the type names directly.

kuhar updated this revision to Diff 104771.Jun 29 2017, 4:19 PM
kuhar marked 6 inline comments as done.

Address David's comments.

dblaikie accepted this revision.Jun 29 2017, 4:26 PM
dblaikie added inline comments.
unittests/IR/CFGBuilder.cpp
33–35 ↗(On Diff #104771)

Oh, you could still have this inline in the header - just important that it's not a member function. It doesn't need to be a friend though, since this is a struct with only public members.

But if you do declare it as a friend (though it's not necessary/might be slightly confusing to a reader "why is this a friend if there's nothing private to access?") you can define it inline at the friend declaration - which keeps it written right where it was when it was a member function, close to the type it's defined for, etc, which is nice.

This revision is now accepted and ready to land.Jun 29 2017, 4:26 PM
kuhar added a comment.Jun 29 2017, 4:27 PM

Thanks for the review, David!

Ah, hmm - I was assuming this was an API for building the CFG data structures. But I see it's an API for building IR that represents particular CFGs (so you can then test the CFG analysis that builds the data structures, etc, I assume?).

I tried to clarify that in the class description -- does it help? Perhaps we should think of a better name, any ideas?

How much more terse is this than the hand-written IR? (I assume that's the motivation/benefit?) How much does that help - perhaps you could show (not in anything that needs to be committed) an example of the textual IR for some of the tests, to demonstrate how much more terse it is? But I guess it's obvious enough perhaps from the tabular update array format that it is quite compact.

Well, the thing is that writing the initial IR by hand is not that bad, but then you have to manually iterate over every basic block and assign it to a pointer (or a map of sort). The worst part is actually updating the IR within you test -- then you have to remember to update each terminator, handle edge cases when your successor is default switch case, change terminator to unreachable or return when you remove all the outgoing arcs, etc.

As an example, you can look at the ToT DominatorTreeTest.cpp. The IR modified there is really small, yet the update code still manages to pretty lengthy and thus not particularly readable. If you are really interested, I can generate some real-world examples that directly compare the two described approaches.

Thanks,
~Kuba

kuhar added inline comments.Jun 29 2017, 4:31 PM
unittests/IR/CFGBuilder.cpp
33–35 ↗(On Diff #104771)

The thing is that I keep Arcs in a std::set, which uses the less-than operator and I think that it needs to be declared before using it.
And if I go with declaring it inline, then I'll also have to pull in the entire <tuple>. I'm not sure if something includes it transitively -- if that's the case, I'm more than happy to keep it inline.

Here's a real-world example of possible use of the CFGBuilder:
https://reviews.llvm.org/D35341
https://reviews.llvm.org/D35342

CFGBuilder makes it easy not only write pretty concise test cases for the new incremental DomTree API, but also to generate different permutations of update sequences using standard algorithms -- which is not really possible if we decided to do the updates manually (without a library to automatically generate IR based on CFG description).

This revision was automatically updated to reflect the committed changes.