Page MenuHomePhabricator

DomTree: Extract (mostly) read-only logic into type-erased base classes
Needs ReviewPublic

Authored by nhaehnle on Jul 2 2020, 2:09 PM.

Details

Summary

Avoid having to instantiate and compile a subset of the dominator tree logic
separately for each node type. More importantly, this allows generic
algorithms to be built on top of dominator trees without writing them as
templates -- such algorithms can now use opaque CfgBlockRef and
CfgInterface instead.

A type-erased implementation of dominator trees could be written in
terms of CfgInterface as well, but doing so would change the current
trade-off: it would slightly reduce code size at the cost of a slight
runtime overhead.

This patch does not change the trade-off, as it only does type-erasure
where basic blocks can be treated in a fully opaque way, i.e. it only
moves methods that don't require iteration over CFG successors and
predecessors.

Change-Id: Ib860dc04cf8bb093d8ed00be7def40d662213672

Diff Detail

Event Timeline

nhaehnle created this revision.Jul 2 2020, 2:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2020, 2:09 PM
arsenm added inline comments.Jul 3 2020, 8:07 AM
llvm/include/llvm/Support/GenericDomTree.h
79–80

Iterating over "generic" seems like a strange naming choice? The generic_children range is a bit better, but this should probably match

nhaehnle updated this revision to Diff 275812.Jul 6 2020, 1:03 PM
nhaehnle marked an inline comment as done.
  • rename generic_{begin,end,children} back without the generic_ prefix and refer explictly to base class methods in NewGVN, which wants to mutate the order of dominator tree node children directly
nhaehnle added inline comments.Jul 6 2020, 1:06 PM
llvm/include/llvm/Support/GenericDomTree.h
79–80

Okay, so the reason why I originally did this was that NewGVN wants to change the order of children (for what seems like dubious reasons, but I didn't want to touch *that* too), and providing non-const iteration from DomTreeNodeBase<T> is problematic. A cleaner way is to have NewGVN explicitly refer to the base class begin/end methods.

arsenm added a reviewer: fhahn.Jul 6 2020, 1:22 PM
kuhar added a comment.Jul 6 2020, 7:43 PM

Overall, this seems like a good idea to me. The amount of templated code started growing out of hand some time ago, to the point where it's really hard to make logically changes, especially in the generic updater code.

This part of the code is *very* performance sensitive and definitely needs benchmarking before moving forward. Have you tried doing some performance evaluation on this change? I suggest compiling a few mid to large size programs (e.g., sqlite, webassembly, opt, clang, rippled) and compiling them into whole-program bitcode, and then running opt -O3 on this bitcode. This is pretty easy with gllvm; I can dig up my old instruction if that would help.

Nit: please fix the linter warnings.

This part of the code is *very* performance sensitive and definitely needs benchmarking before moving forward. Have you tried doing some performance evaluation on this change? I suggest compiling a few mid to large size programs (e.g., sqlite, webassembly, opt, clang, rippled) and compiling them into whole-program bitcode, and then running opt -O3 on this bitcode. This is pretty easy with gllvm; I can dig up my old instruction if that would help.

I've done this now: used gllvm to extract bitcode of Z3, then run perf stat opt -O3 z3.bc -o z3.out.bc. 3 runs on both my branch and the underlying master:

Underlying master:
 1.441.691.613.522      cycles:u                  #    3,948 GHz                      (83,34%)
 1.511.470.542.186      instructions:u            #    1,05  insn per cycle
 1.445.040.063.358      cycles:u                  #    3,832 GHz                      (83,33%)
 1.511.151.342.715      instructions:u            #    1,05  insn per cycle
 1.445.488.489.379      cycles:u                  #    3,851 GHz                      (83,34%)
 1.510.528.339.517      instructions:u            #    1,04  insn per cycle

My branch:
 1.447.208.247.196      cycles:u                  #    3,920 GHz                      (83,33%)
 1.506.361.797.982      instructions:u            #    1,04  insn per cycle
 1.451.415.961.317      cycles:u                  #    3,913 GHz                      (83,33%)
 1.505.605.385.713      instructions:u            #    1,04  insn per cycle
 1.445.005.341.369      cycles:u                  #    3,908 GHz                      (83,33%)
 1.506.364.033.713      instructions:u            #    1,04  insn per cycle

So the results are a bit unintuitive: lower number of instructions overall on my branch, slightly higher average number of cycles.

kuhar accepted this revision.Jul 8 2020, 12:00 PM

Thanks for checking this. I dug up some old whole-program bitcode and uploaded it here in case it helps with future code reviews: https://drive.google.com/drive/folders/1VJpym19cW-8BVgdtl2MsD3zB4CoEQ93O?usp=sharing

I did a quick experiment I run the trunk and your patch on a few binaries from each project and got the number below. I only set the performance governor to performance but didn't do anything fancy to fix the clocks, disable SMT, or any core pinning. My machine has two Xeon 6154 Skylake CPUs.

BranchAvg. [s]StdRun 1 [s]Run 2 [s]Run 3 [s]Run 4 [s]Run 5 [s]Delta [%]
sqlite3.bctrunk15.9220.101115.8215.8215.9815.9416.05
nicolai15.9580.052215.9716.0315.9715.9315.890.23%
llvm-as.bctrunk207.8682.7855209.82210.73206.72203.69208.38
nicolai207.011.7279209.37205.98207.75207.12204.83-0.41%
wasm-as.bctrunk43.5960.288543.0843.7343.7243.7243.73
nicolai43.6960.168943.8243.4943.9143.6643.60.23%
wasm-opt.bctrunk46.9560.138347.0647.1446.9146.8246.85
nicolai46.9980.455846.4646.5947.2947.1347.520.09%

Seems performance-neutral to me.

This revision is now accepted and ready to land.Jul 8 2020, 12:00 PM
nhaehnle updated this revision to Diff 280482.Jul 24 2020, 9:13 AM
v6:
- style change: iDom -> idom; it's arguable whether this is really
  invalid, since it is actually standard camelCase, but clang-tidy
  complains about it so... *shrug*
- rename {to,from}Generic -> {wrap,unwrap}Ref
This revision was landed with ongoing or failed builds.Oct 20 2020, 10:53 AM
This revision was automatically updated to reflect the committed changes.
nhaehnle reopened this revision.Dec 9 2020, 2:46 AM
This revision is now accepted and ready to land.Dec 9 2020, 2:46 AM
nhaehnle updated this revision to Diff 310472.Dec 9 2020, 2:46 AM

v9:

  • rebase on factored-out Handle infrastructure
kuhar resigned from this revision.May 16 2022, 10:59 AM
This revision now requires review to proceed.May 16 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2022, 10:59 AM
RKSimon resigned from this revision.May 18 2022, 3:45 AM