This is an archive of the discontinued LLVM Phabricator instance.

Add option to enable/disable LiveDebugValues pass.
AbandonedPublic

Authored by tvvikram on Jan 6 2016, 9:34 PM.

Details

Summary

It is 'on' by default.

Diff Detail

Event Timeline

tvvikram updated this revision to Diff 44190.Jan 6 2016, 9:34 PM
tvvikram retitled this revision from to Add option to enable/disable LiveDebugValues pass..
tvvikram updated this object.
tvvikram added reviewers: aprantl, samsonov, kcc.
kcc edited edge metadata.Jan 6 2016, 9:39 PM

please make it false by default since the phase is not really working (compilation times out)

Can we actually add an equivalent clang debug tuning option called "-fvar-tracking" similar to the gcc one here as well?

-eric

In D15945#321093, @kcc wrote:

please make it false by default since the phase is not really working (compilation times out)

I kept it 'on' as it works in most of the cases. I will try to submit a patch by EoD that fixes the issue.

tvvikram updated this revision to Diff 44196.Jan 7 2016, 1:49 AM
tvvikram edited edge metadata.

I noticed with a.ll, the test case from https://llvm.org/bugs/show_bug.cgi?id=26055, that there are too many DEBUG_VALUE instructions getting inserted. Statistics showed >6L instructions getting inserted:
616584 live-debug-values - Number of DBG_VALUE instructions inserted

One reason is the presence of too many basic blocks within the function. With so many blocks, the convergence takes really long time. I thought of ignoring functions with more basic blocks. But instead, I have added a simple heuristic to stop convergence after certain number of iterations. Currently the max iterations is arbitrary (chosen to be 10).

Can we actually add an equivalent clang debug tuning option called "-fvar-tracking" similar to the gcc one here as well?

-eric

+1!

samsonov edited edge metadata.Jan 7 2016, 9:38 AM

Note that test case in https://llvm.org/bugs/show_bug.cgi?id=26055 is produced from auto-generated code, so its BB structure should be very regular, and you can come up with rather precise estimate of how many passes you will need to converge. Does current bound (10) make compile time reasonable for that case?

-fvar-tracking: yes, please! It would be nice to be able to toggle this pass from frontend.

I have a number of comments regarding the code as well, but I will rather see quick fixes (or revert) committed ASAP: this is hurting us pretty badly, as we can't build several projects with optimization and debug info enabled.

Note that test case in https://llvm.org/bugs/show_bug.cgi?id=26055 is produced from auto-generated code, so its BB structure should be very regular, and you can come up with rather precise estimate of how many passes you will need to converge. Does current bound (10) make compile time reasonable for that case?

Yes. The compile time is now almost equal (~0.8 sec on my machine).

Okay, i'm confused.

(Note: I helped write var-tracking for GCC).

Let's ignore the inefficient data structure usage here (ie appending lists instead of using constant time splices, using lists instead of hash maps, etc).

The reason your convergence is slow, is, AFAICT, because you are using a random basic block order for processing the dataflow problem.

You should be walking blocks in reverse postorder, which is the optimal order for this problem, and will significantly speed up convergence.

I'm also going to take a pass at writing comments on the algorithmic issues here.

dberlin added inline comments.Jan 8 2016, 4:20 PM
lib/CodeGen/LiveDebugValues.cpp
299

Can you please explain, precisely, what this accomplishes?
Besides badly needing a comment explaining what is going on,
it is really complicated, and if i'm reading it right, really slow (N^2 or worse) for the datastructures you have here.

323

This looks like another N^2 part.
You either should be keeping these things in a hash map or at the very least, and ordered set so that you can do this fast.

390

Please use a SmallPtrSet<BasicBlock *, 16> Visited to to track whether you visited a block or not, don't go searching through the worklist.

dberlin added inline comments.Jan 8 2016, 4:35 PM
lib/CodeGen/LiveDebugValues.cpp
83

There is no reason for this to be a global variable.
Please just make your transfer functions return bools and | them together.

85

Ditto.

374

You really need to be iterating in reverse postorder.

One trivial way to do that is to use the ordering and two worklists.

std::priority_queue <unsigned int> worklist;
std::priority_queue <unsigned int> pending;

DenseMap<MachineBasicBlock *, unsigned int> BBToOrder;
DenseMap<unsigned int, MachineBasicBlock *> OrderToBB

ReversePostOrderTraversal<MachineBasicBlock *> RPOT(entryblock);
unsigned int RPONumber = 0;
for (auto I = RPOT.begin(), auto E = RPOT.end(); I != E; ++I) {
  BBToOrder[*I] = RPONumber;
  OrderToBB[RPONumber] = *I;
  worklist.push_back(RPONumber);
  ++RPONumber;
}


while (changed) {
<go through worklist, use OrderToBB[I] to get BB from worklist> 
put necessary successors on pending using BBToOrder.
std::swap(worklist, pending)
Clear pending.
}
tvvikram abandoned this revision.Jan 9 2016, 11:22 PM

Thanks @dberlin for the suggestions and working on it (http://reviews.llvm.org/D16039)