This is an archive of the discontinued LLVM Phabricator instance.

Separate LoopTraversal and BreakFalseDeps out of ExecutionDomainFix into their own files
ClosedPublic

Authored by myatsina on Nov 21 2017, 10:21 PM.

Details

Summary

Separate LoopTraversal and BreakFalseDeps into their own files.
Moving a couple of comments around to make the new code more readable (i.e. LoopTraversal algorithm is now descirbed in the LoopTraversal header file and not in the traverse() method implementation)

This change is NFC.

This is the forth of 5 patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869

Previous patches:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332

Next patches:
https://reviews.llvm.org/D40334

Diff Detail

Repository
rL LLVM

Event Timeline

myatsina created this revision.Nov 21 2017, 10:21 PM
javed.absar added inline comments.Nov 22 2017, 2:05 AM
lib/CodeGen/BreakFalseDeps.cpp
60 ↗(On Diff #123866)

You could consider changing this to range-loop by using predecessors()

myatsina edited the summary of this revision. (Show Details)Nov 22 2017, 2:29 AM
myatsina set the repository for this revision to rL LLVM.
myatsina added a subscriber: llvm-commits.
myatsina marked an inline comment as done.Nov 23 2017, 8:10 AM
myatsina added inline comments.
lib/CodeGen/BreakFalseDeps.cpp
60 ↗(On Diff #123866)

I've addressed your comment in the first patch fix (https://reviews.llvm.org/D40330).
I think this change is more appropriate there.

MatzeB edited edge metadata.Nov 27 2017, 2:05 PM

Some comments on the software engineering side, I didn't really look into the dependency breaking algorithms yet.

include/llvm/CodeGen/BreakFalseDeps.h
36 ↗(On Diff #123866)

This is probably a bad use of inheritance. Saying "BreakFalseDeps is a LoopTraversal" feels odd. I believe this could be rewritten so that LoopTraversal is subclassed separately inside BreakFalseDeps.cpp without making it a supertype of BreakFalseDeps itself.

82 ↗(On Diff #123866)

It's usually easiest to just document the functions with /// doxygen comments instead of forming groups of methods.

If you really want to form groups, the correct doxygen syntax would be:

/// \name Dependency Breaking
/// \{
...
/// \}
include/llvm/CodeGen/LoopTraversal.h
12–39 ↗(On Diff #123866)
  • The comment (except for the first \file sentence) would better fit in front of class LoopTraversal.
  • It's good to see an example trying to explain the goals of the algorithm!
  • You can simplify the explanation by removing descriptions of how things are not done (i.e. the "it's not using the two-pass or naive worklist approaches" parts). Instead you should describe what actually is done: A reverse postorder traversal with an intertwined worklist algorithm for backedges.
51–67 ↗(On Diff #123866)

This looks to me like these datastructures could be private.

73–81 ↗(On Diff #123866)

This API seems too complicated to me. Given that this is just a loop traversal API I would expect that traverse() and runOnBasicBlock() is enough.

See also my comment inside runOnBasicBlock(); isBlockDone() can be private so it is not part of the API I believe.

lib/CodeGen/LoopTraversal.cpp
18–19 ↗(On Diff #123866)

It is usually more convenient for users of the API if the doxygen comments are in the header; similar for following methods.

21–23 ↗(On Diff #123866)

As far as I can see all three of enterBasicBlock(), processBasicBlock(), leaveBasicBlock() are always called together in this order. So it could just as well be a single function. In fact I believe just adding virtual to runOnBasicBlock() and removing the three other methods would suffice.

36 ↗(On Diff #123866)

Using MachineBasicBlock & instead of auto & is friendlier to people reading the code.

45–48 ↗(On Diff #123866)

You could simplify this to:

for (MachineBasicBlock *MBB : RPOT) {
   // ...
}

Same with the loop after this one.

63–71 ↗(On Diff #123866)

Don't use {} for 1 line ifs.

78–80 ↗(On Diff #123866)

Can you explain how this happens? Both loops appear to be using a RPO traversal, how could the first one miss some that this 2nd pass catches?

MatzeB added inline comments.Nov 27 2017, 2:12 PM
include/llvm/CodeGen/LoopTraversal.h
66 ↗(On Diff #123866)

You can use a less expensive SmallVector<> or IndexedMap<> instead of a DenseMap by using the fact that machine basic blocks have a dense numbering. See for example MachineTraceMetrics::BlockInfo or LiveRangeCalc::Map for examples on how this is can be done.

myatsina marked an inline comment as done.Nov 29 2017, 3:33 PM

Hi,

Thanks for the comments!
My changes are mostly refactoring of the old ExecutionDepsDix pass in a way that is (almost*) NFC.
This pass combined the logic of both breaking false dependencies and execution domain adjustments in one single pass that traversed the basic blocks in a very special way (optimizing traversal of loops).
I think the first review (https://reviews.llvm.org/D40330) in this bunch would make everything clearer for you.

In order to fix the bugzilla 33869 I had to make a small adjustment for the breaking false dependencies part, but because all this logic was inter-twined I had to first separate it.
The new logic I've added is pretty much summarized here https://reviews.llvm.org/D40334.

*almost NFC - The logic I've changed in all this refactoring compared to the original ExecutionDepsFix is that I've enabled the "break false dependecy" part of the logic to run on all register classes, and not only the register class that ExecutionDepsFix received in its c'tor.

I will definitely adopt some of you refactoring suggestions and upload a patch soon.

Thanks,
Marina

include/llvm/CodeGen/BreakFalseDeps.h
36 ↗(On Diff #123866)

Both BreakFalseDeps and ExecutionDomainFix traverse the basic blocks in a special way.
The LoopTraversal provides the traversal algorithm and methods that allow inheriting passes to handle instructions, handle BBs, etc.
Do you have a better suggestion on how to do it without duplicating the traversal logic in both passes?

82 ↗(On Diff #123866)

Some of these methods are documented in the cpp file.
This is how they were originally documented when they were part of the ExecutionDepsFix pass.
Do you think we take this refactoring a a chance to move all method documentation to the header file?

include/llvm/CodeGen/LoopTraversal.h
12–39 ↗(On Diff #123866)

I think that mentioning the naive approach and what we are trying to optimize is actually making this traversal algorithm (and the motivation for it) much clearer.
I can refactor the comment so that we have "The algorithm does ..." and at the end add a few sentences like "A naive way of implementing this would have been ..., but the downside of this is ..., this is why the optimized way was chosen", or something of that sort.
Do you think this way will be better?

73–81 ↗(On Diff #123866)

I was trying to avoid duplicating runOnBasicBlock() logic for the inheriting classes.
It is not a lot of duplication, it basically just activates the other 3 methodes (enterBB, processBB, leaveBB) and that's it, so we can simplify the API as you suggested and have both inheriting class implement runOnBB the same way.

lib/CodeGen/LoopTraversal.cpp
18–19 ↗(On Diff #123866)

Sure, I'll move them.
I've was trying to keep it close to what was in the original ExecutionDepsFix header and cpp..

21–23 ↗(On Diff #123866)

I can definitely do that, just keep in mind that this way you will see some duplication because both inheriting classes will implement it the same way (calling their own private enterBB, processBB and leaveBB).

78–80 ↗(On Diff #123866)

Keno Fischer (@loladiro ) is the author of this code in ExecutionDepsFix.
Keno, can you explain?

MatzeB added inline comments.Dec 1 2017, 11:55 AM
include/llvm/CodeGen/BreakFalseDeps.h
1–18 ↗(On Diff #123866)

Is there a need to make the whole class public in a header? It feels to me like a FunctionPass *createFalseDependencyPass(); in include/llvm/CodeGen/Passes.h should be enough here so the class definition and all the details can remain in BreakFalseDeps.cpp.

36 ↗(On Diff #123866)

Factoring out the common logic makes a lot of sense and is a good thing.

I was mainly wondering about using subclassing as a tool to get there.

In this case it could be as simple as a single function as the public interface:

void dominanceLoopTraversal(MachineFunction &MF,
                            std::function<void(MachineBasicBlock *, bool)> runOnBasicBlock);

For the implementation you could still use a class (in the .cpp file) if that feels easier:

class LoopTraversalImpl {
  std::function... runOnBasicBlock;
  LoopTraversalImpl(std::function... runOnBasicBlock) : runOnBasicBlock(runOnBasicBlock) {}
  // ...
  void traverse(MachineFunction &MF) { /* as existing */ }
};

void dominanceLoopTraversal(MachineFunction &MF,
                            std::function<void(MachineBasicBlock *, bool)> runOnBasicBlock) {
  LoopTraversalImpl Traversal(runOnBasicBlock);
  Traversal.traverse(MF);
}
82 ↗(On Diff #123866)

Having the documentation for private methods in the cpp file is fine. But then I'd be consistent and not have something like // Instruction processing. here which looks like documentation about the function.

On the other hand I suppose the // LoopTraversal overrides. comment is about the 3 methods following the comment. It is at least not very typical to document 3 methods with a single comment, while easy in this case in general it probably won't be obvious what groups of methods a comment is meant for. That's why I was talking about the doxygen groups. (Though in this specific case I'd tend to just not have a comment...)

include/llvm/CodeGen/LoopTraversal.h
1 ↗(On Diff #123866)

Could we move it to lib/CodeGen/LoopTraversal.h and keep the API private to CodeGen for now?

12–39 ↗(On Diff #123866)

Yes that sounds good.

52 ↗(On Diff #123866)

This should be /// Whether ... in doxygen syntax.

73–81 ↗(On Diff #123866)

The downside of having 3 functions IMO is that splitting into 3 function is in no way motivated by the fact that this is a loop traversal algorithm. It's just a small convenience that happens to match the two existing users. The downside from splitting is that a reader of the code has to wonder when each of them gets called and at least I would intuititvely assume there to be complicated rules, while if there is only a single function it is quite obvious what will happen.

myatsina updated this revision to Diff 126843.Dec 13 2017, 2:35 PM

I've uploaded a new review here and in https://reviews.llvm.org/D40330.

Changed LoopTraversal to not be a pass, but rather a class which returns the BB order and other passes can then traverse and process the BBs.
Separated some logic to a ReachingDefAnalysis pass which is used by both BreakFalsDeps and ExecutionDomainFix.

Between review D40330 and this review of deparating the code to different files I did some refactoring which is NFC (changing data types, moving comments etc). If you want I can upload the refactoring for review as well.

atdt added a subscriber: atdt.Dec 15 2017, 1:08 PM

Matthias, did you get a chance to look at these changes (and at the changes in https://reviews.llvm.org/D40330) ? Are you OK with them?

Thanks,
Marina

MatzeB accepted this revision.Jan 8 2018, 11:31 AM

LGTM.

This revision is now accepted and ready to land.Jan 8 2018, 11:31 AM
This revision was automatically updated to reflect the committed changes.