Page MenuHomePhabricator

[analyzer][Liveness][NFC] Enqueue the CFGBlocks post-order
Needs ReviewPublic

Authored by Szelethus on Sep 11 2020, 9:12 AM.

Diff Detail

Event Timeline

Szelethus created this revision.Sep 11 2020, 9:12 AM
Szelethus requested review of this revision.Sep 11 2020, 9:12 AM
xazax.hun added inline comments.Sep 11 2020, 11:10 AM
clang/lib/Analysis/LiveVariables.cpp
522

With BackwardDataflowWorklist, each enqueueBlock will insert the block into a llvm::PriorityQueue. So regardless of the insertion order, dequeue will return the nodes in the reverse post order.

Inserting elements in the right order into the heap might be beneficial is we need to to less work to "heapify". But on the other hand, we did more work to insert them in the right order, in the first place. All in all, I am not sure whether the comment is still valid and whether this patch would provide any benefit over the original code.

martong added inline comments.Sep 14 2020, 6:43 AM
clang/lib/Analysis/LiveVariables.cpp
522

Yes, what Gabor says makes sense. On the other hand I don't see any overhead - I might be wrong though - in the post order visitation. And it makes the code more consistent IMHO.

Well, it would be important to know why the original author put the

// FIXME: we should enqueue using post order.

there. The blamed commit 77ff930fff15c3fc76101b38199dad355be0866b is not saying much.

clang/lib/Analysis/LiveVariables.cpp
522

Please compare the execution time with and without this patch. I think it is difficult do decide in theory which one costs more: the heapification during insertion or the reverse ordering before the insertion.

I don't insist on this patch, though I will end up removing the FIXME even if I leave the actual code unchanged, as it seems to be outdated.

clang/lib/Analysis/LiveVariables.cpp
522

I think the performance hit, given that there is any, must be negligible.

On the project c4, a tiny C compiler written in 4 large functions, the current liveness analysis runtimes on my machine in debug mode look like this:

Liveness analysis on next: 7.397604e-03s
Liveness analysis on expr: 1.203549e-02s
Liveness analysis on stmt: 5.470070e-04s
Liveness analysis on main: 1.344798e-02s
Liveness analysis on main: 1.415095e-02s
Liveness analysis on stmt: 5.550660e-04s
Liveness analysis on expr: 1.197334e-02s
Liveness analysis on next: 7.181059e-03s

After this patch, they look like this:

Liveness analysis on next: 7.313751e-03s
Liveness analysis on expr: 1.211920e-02s
Liveness analysis on stmt: 5.582670e-04s
Liveness analysis on main: 1.372210e-02s
Liveness analysis on main: 1.437104e-02s
Liveness analysis on stmt: 5.685340e-04s
Liveness analysis on expr: 1.269498e-02s
Liveness analysis on next: 7.094738e-03s

Mind that this measured the entire analysis, not just enqueuing. I think its fair to say that even on relatively large functions, the difference is within the margin of error, and is most definitely incomparable to its only user's runtime, the static analyzer itself.

Well, it would be important to know why the original author put the [TODO] there.

Yep, I think its just an outdated comment.

What about analyzing a Sema translation unit? That should be beefy enough to have a longer runtime.