Check that all passes, which report they preserve CFG, are really preserving CFG.
A new standard instrumentation is introduced. It can be switched on/off by the flag verify-cfg-preserved, which is on by default for debug builds.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm not sure if introducing base for standard instrumentation classes (Instrument/Instruments) is a good change or not.
What problem does it solve?
Anyway that part seems to be rather orthogonal to the checker itself.
I would like to have it separated - either introduce Instrument/Instruments thing before the checker and then use it,
OR implement the checker the same way as other standard instrumentations are done (just as separate members) and only then refactor it all into Instrument/Instruments.
Ok.
I do not like that the current design makes me expose of a new standard instrumentation class declaration for all users of StandardInstrumentations. I believe it can be done internally inside StandardInstrumentations.cpp. That is why I proposed the Instrument base class.
I'm still not sure that this patch is needed. So, please feel free to comment. I used this patch once to see if there are any PreservedCFG violating passes except InstCombine. I did not find any. This does not prove there is none, but some level of confidence it gives.
llvm/lib/Passes/StandardInstrumentations.cpp | ||
---|---|---|
298 | I realized that it is unsafe to dereference any of the saved BasicBlock* at the validation check. |
Removed the class Instrument and moved the class declaration of PreservedCFGCheckerInstrumentation from cpp to h file.
Relaxed CFG check for successors order (to allow branch successors swap).
Made CFG diff printing safe for possible block removal.
Fixed push-pop CFG to/from stack and a parallel stack with pass names for assertions.
Rebased.
I think this deserves some extra documentation that explains what it means to preserve CFG in plain English. For example, should we also care about preserving the number of return blocks?
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
87 | CFG is a multigraph. A basic block can have duplicate successors, i.e., there may be many outgoing edges to the same basic block. | |
87 | Is tracking BasicBlock* enough? What if blocks get deleted and the tracked pointers become invalid? Does it make sense to use WeakVH instead? |
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
87 | Ok. I had in mind checking PostDomTree invariant that must be insensitive to parallel edges. | |
87 | To not complicate things with WeakVH I decided to not dereference the blocks collected before the checked pass unless I see the same pointer in the CFG after the pass. See the printCFGdiff(). |
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
87 | AFAIK in C++, when pointee's lifetime ends, the pointer becomes invalid and you cannot use it in any meaningful way, even when you don't dereference it. It'd be incorrect to print it, compare with another pointers, etc. This would also make debugging very confusing. |
Defined a separate class named CFG.
CFG defines the notion PreserveCFG for the multigraph of BasicBlocks with unordered successors.
CFG can safely track and print CFG changes. To guard CFG from accessing deleted blocks PoisoningVH is used and is extended with one getter.
The cfg diff printing function is rewritten.
Rebased.
Here is an example of an output of a unexpected CFG change detected after SimplifyCFG mistakenly reported PreservedCFG.
$ opt -passes=simplifycfg -S -verify-cfg-preserved --debug-only=stdinstrumentations
before SimplifyCFG
define i32 @test5(i32 %A) { switch i32 %A, label %return [ i32 2, label %return i32 10, label %2 i32 9, label %1 i32 8, label %2 ] ret i32 1 ret i32 2 return: ret i32 0 }
after SimplifyCFG
define i32 @test5(i32 %A) { switch i32 %A, label %return [ i32 8, label %2 i32 10, label %2 i32 9, label %1 ] 1: ; preds = %return, %2, %0 %merge = phi i32 [ 1, %0 ], [ 2, %2 ], [ 0, %return ] ret i32 %merge 2: ; preds = %0, %0 br label %1 return: ; preds = %0 br label %1 }
Mocked error
Error: SimplifyCFGPass reported it preserved CFG, but changes detected: In function @test5 Different number of non-leaf basic blocks: before=1, after=3 Non-leaf block return<0x23931c0> is added (1 successors) Different successors of block entry<0x2393130> (unordered): - before (3): return<0x23931c0>(2), unnamed_1<0x2393990>, unnamed_2<0x23938d0>(2), - after (3): return<0x23931c0>, unnamed_1<0x2393990>, unnamed_2<0x23938d0>(2), Non-leaf block unnamed_2<0x23938d0> is added (1 successors) LLVM ERROR: Preserved CFG changed by SimplifyCFGPass
This looks very useful. Does it currently report any errors when enabled?
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
96 | nit: NeedsGruard sounds like an external property that you need to satisfy for some reason. In this case, it seems to me that it's just an extra level of validation that we may want, right? | |
llvm/lib/Passes/StandardInstrumentations.cpp | ||
41 | How much overhead does this have? It it's not negligible, we may want to enable it when expensive checks are enabled. | |
394 | nit: !empty() or size() > 0 | |
396 | same as above |
None.
llvm/lib/Passes/StandardInstrumentations.cpp | ||
---|---|---|
41 | By default I proposed to enable it only for debug version. Do we track compile time for them? |
llvm/lib/Passes/StandardInstrumentations.cpp | ||
---|---|---|
41 | In general, I think we do care about it too. Could you run opt on some bigger .bc file to make sure there's little noticeable difference? I posted some .bc files in https://reviews.llvm.org/D83089. |
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
80–81 | nit: could you move these just next to the forward-declaration for Module? |
I will collect performance data and fix for release builds.
llvm/include/llvm/IR/ValueHandle.h | ||
---|---|---|
506 ↗ | (On Diff #289100) | bug: Poisoned is defined only for debug builds. I will define a custom CallbackVH extension. |
Implemented custom BBGuard class instead of PoisoningVH which does not track poisoning in release builds.
Removed DEBUG_TYPE and LLVM_DEBUG uses. Unexpected gaph diffs are printed unconditionally.
Addressed comments.
I will provide performance data soon.
I see no impact on CPU time with the following timed runs (both with and without assertions):
$ time opt -O3 FILE.bc --disable-output -verify-cfg-preserved=VCP
LLVM_ENABLE_ASSERTIONS=ON:
FILE | verify-cfg-preserved | Run 1 [s] | Run 2 [s] | Run 3 [s] | Run 4 [s] | Run 5 [s] | Avg | Std/Avg % | DeltaAvg% |
---|---|---|---|---|---|---|---|---|---|
llvm-as.bc | false | 549.010 | 544.270 | 550.010 | 547.620 | 547.360 | 547.654 | 0.40 | |
true | 548.640 | 547.670 | 547.690 | 548.900 | 548.680 | 548.316 | 0.11 | 0.12 | |
sqlite3.bc | false | 58.120 | 58.180 | 58.210 | 58.130 | 58.190 | 58.166 | 0.07 | |
true | 58.230 | 58.120 | 58.130 | 58.240 | 58.210 | 58.186 | 0.10 | 0.03 | |
wasm-as.bc | false | 168.880 | 169.490 | 168.610 | 169.770 | 168.150 | 168.980 | 0.39 | |
true | 168.660 | 169.730 | 169.360 | 169.620 | 168.500 | 169.174 | 0.33 | 0.11 | |
wasm-opt.bc | false | 181.860 | 180.550 | 181.340 | 180.790 | 181.910 | 181.290 | 0.34 | |
true | 181.240 | 181.620 | 181.180 | 181.780 | 180.640 | 181.292 | 0.24 | 0.00 | |
LLVM_ENABLE_ASSERTIONS=OFF:
FILE | verify-cfg-preserved | Run 1 [s] | Run 2 [s] | Run 3 [s] | Run 4 [s] | Run 5 [s] | Avg | Std/Avg % | DeltaAvg% |
---|---|---|---|---|---|---|---|---|---|
llvm-as.bc | false | 408.650 | 408.730 | 410.100 | 406.650 | 410.110 | 408.848 | 0.35 | |
true | 408.980 | 408.260 | 408.410 | 409.180 | 408.600 | 408.686 | 0.09 | -0.04 | |
sqlite3.bc | false | 39.720 | 39.770 | 39.790 | 39.720 | 39.780 | 39.756 | 0.08 | |
true | 39.690 | 39.730 | 39.740 | 39.760 | 39.750 | 39.734 | 0.07 | -0.06 | |
wasm-as.bc | false | 117.270 | 117.460 | 117.450 | 117.410 | 117.250 | 117.368 | 0.09 | |
true | 117.640 | 117.100 | 117.400 | 117.210 | 117.310 | 117.332 | 0.18 | -0.03 | |
wasm-opt.bc | false | 125.900 | 126.030 | 125.910 | 126.020 | 126.180 | 126.008 | 0.09 | |
true | 125.940 | 125.930 | 126.100 | 125.980 | 125.920 | 125.974 | 0.06 | -0.03 | |
Thanks for the table, this looks good to me now, modulo the two remaining nits. I'd be more comfortable if somebody familiar with the new pass manager reviewed this too.
Fedor, could you please review the New Pass Manager part of this patch as I addressed your comments?
NewPM use seems to be fine.
A few relatively minor comments that I would like to see addressed.
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
89 | Are we ok with adding a generic 'CFG' class directly into llvm namespace by StandardInstrumentations.h? If you can see this class being generally useful then it deserves its own header/location. | |
120–121 | It seems that GraphStack and PassStack are always pushed/popped synchronously. | |
llvm/lib/Passes/StandardInstrumentations.cpp | ||
486 | This seems to be a more new-PM-idiomatic way of checking: PassPA.allAnalysesInSetPreserved<CFGAnalyses>() |
llvm/include/llvm/Passes/StandardInstrumentations.h | ||
---|---|---|
89 | That is why I tried introducing the class Instrument that would encapsulate all declarations and implementations of PrintIRInstrumentation, PrintPassInstrumentation, TimePassesHandler, OptNoneInstrumentation, PreservedCFGCheckerInstrumentation in one cpp file :) |
nit: could you move these just next to the forward-declaration for Module?