This is an archive of the discontinued LLVM Phabricator instance.

[NewPM] Introduce PreserveCFG check
ClosedPublic

Authored by yrouban on Jun 10 2020, 5:40 AM.

Details

Summary

Check that all passes, which report they preserve CFG, are really preserving CFG.
A new standard instrumentation is introduced. It can be switched on/off by the flag verify-cfg-preserved, which is on by default for debug builds.

Diff Detail

Event Timeline

yrouban created this revision.Jun 10 2020, 5:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2020, 5:40 AM
Herald added a subscriber: hiraditya. · View Herald Transcript

I'm not sure if introducing base for standard instrumentation classes (Instrument/Instruments) is a good change or not.
What problem does it solve?
Anyway that part seems to be rather orthogonal to the checker itself.
I would like to have it separated - either introduce Instrument/Instruments thing before the checker and then use it,
OR implement the checker the same way as other standard instrumentations are done (just as separate members) and only then refactor it all into Instrument/Instruments.

yrouban marked an inline comment as done.Jun 11 2020, 2:02 AM

I'm not sure if introducing base for standard instrumentation classes (Instrument/Instruments) is a good change or not.
What problem does it solve?
Anyway that part seems to be rather orthogonal to the checker itself.
I would like to have it separated - either introduce Instrument/Instruments thing before the checker and then use it,
OR implement the checker the same way as other standard instrumentations are done (just as separate members) and only then refactor it all into Instrument/Instruments.

Ok.
I do not like that the current design makes me expose of a new standard instrumentation class declaration for all users of StandardInstrumentations. I believe it can be done internally inside StandardInstrumentations.cpp. That is why I proposed the Instrument base class.

I'm still not sure that this patch is needed. So, please feel free to comment. I used this patch once to see if there are any PreservedCFG violating passes except InstCombine. I did not find any. This does not prove there is none, but some level of confidence it gives.

llvm/lib/Passes/StandardInstrumentations.cpp
257

I realized that it is unsafe to dereference any of the saved BasicBlock* at the validation check.
But derefrence is only needed to print the difference when we are goin to crash anyway.
Another problem is that we might think that there are no difference if CFG has changed but BasicBlock* happened to be the same. It seems this patch needs to track values.

yrouban updated this revision to Diff 281477.Jul 28 2020, 11:00 PM
yrouban edited the summary of this revision. (Show Details)

Removed the class Instrument and moved the class declaration of PreservedCFGCheckerInstrumentation from cpp to h file.
Relaxed CFG check for successors order (to allow branch successors swap).
Made CFG diff printing safe for possible block removal.
Fixed push-pop CFG to/from stack and a parallel stack with pass names for assertions.
Rebased.

yrouban updated this revision to Diff 281884.Jul 30 2020, 5:05 AM

Rebased on top of D84772. So D84826 is not needed.
Fixed safety of printCFGdiff() to handle removed blocks.

kuhar requested changes to this revision.Aug 4 2020, 11:40 AM

I think this deserves some extra documentation that explains what it means to preserve CFG in plain English. For example, should we also care about preserving the number of return blocks?

llvm/include/llvm/Passes/StandardInstrumentations.h
71

CFG is a multigraph. A basic block can have duplicate successors, i.e., there may be many outgoing edges to the same basic block.
Shouldn't this be a map<BB, set<pair<BB, count>> or map<BB, multiset<BB, count>>?

71

Is tracking BasicBlock* enough? What if blocks get deleted and the tracked pointers become invalid? Does it make sense to use WeakVH instead?

This revision now requires changes to proceed.Aug 4 2020, 11:40 AM
yrouban added inline comments.Aug 4 2020, 10:37 PM
llvm/include/llvm/Passes/StandardInstrumentations.h
71

Ok. I had in mind checking PostDomTree invariant that must be insensitive to parallel edges.
I will try to make it stricter as you suggest.

71

To not complicate things with WeakVH I decided to not dereference the blocks collected before the checked pass unless I see the same pointer in the CFG after the pass. See the printCFGdiff().
WeakVH would allow to check accurately, but could not be used as a set/map key.

kuhar added inline comments.Aug 5 2020, 7:59 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
71

AFAIK in C++, when pointee's lifetime ends, the pointer becomes invalid and you cannot use it in any meaningful way, even when you don't dereference it. It'd be incorrect to print it, compare with another pointers, etc. This would also make debugging very confusing.

yrouban updated this revision to Diff 289100.Sep 1 2020, 1:27 AM

Defined a separate class named CFG.
CFG defines the notion PreserveCFG for the multigraph of BasicBlocks with unordered successors.
CFG can safely track and print CFG changes. To guard CFG from accessing deleted blocks PoisoningVH is used and is extended with one getter.
The cfg diff printing function is rewritten.
Rebased.

Here is an example of an output of a unexpected CFG change detected after SimplifyCFG mistakenly reported PreservedCFG.

$ opt -passes=simplifycfg -S -verify-cfg-preserved --debug-only=stdinstrumentations

before SimplifyCFG
define i32 @test5(i32 %A) {
  switch i32 %A, label %return [
    i32 2, label %return
    i32 10, label %2
    i32 9, label %1
    i32 8, label %2
  ]

  ret i32 1

  ret i32 2

return:
  ret i32 0
}
after SimplifyCFG
define i32 @test5(i32 %A) {
  switch i32 %A, label %return [
    i32 8, label %2
    i32 10, label %2
    i32 9, label %1
  ]

1:                                                ; preds = %return, %2, %0
  %merge = phi i32 [ 1, %0 ], [ 2, %2 ], [ 0, %return ]
  ret i32 %merge

2:                                                ; preds = %0, %0
  br label %1

return:                                           ; preds = %0
  br label %1
}
Mocked error
Error: SimplifyCFGPass reported it preserved CFG, but changes detected:
In function @test5
Different number of non-leaf basic blocks: before=1, after=3
Non-leaf block return<0x23931c0> is added (1 successors)
Different successors of block entry<0x2393130> (unordered):
- before (3): return<0x23931c0>(2), unnamed_1<0x2393990>, unnamed_2<0x23938d0>(2),
- after (3): return<0x23931c0>, unnamed_1<0x2393990>, unnamed_2<0x23938d0>(2),
Non-leaf block unnamed_2<0x23938d0> is added (1 successors)
LLVM ERROR: Preserved CFG changed by SimplifyCFGPass
kuhar added a comment.Sep 1 2020, 9:02 AM

This looks very useful. Does it currently report any errors when enabled?

llvm/include/llvm/Passes/StandardInstrumentations.h
80

nit: NeedsGruard sounds like an external property that you need to satisfy for some reason. In this case, it seems to me that it's just an extra level of validation that we may want, right?
If that's the case, I'd consider renaming this to something like: UseGuargs, GuardBasicBlocks, TrackBBLifetime, etc.

llvm/lib/Passes/StandardInstrumentations.cpp
33

How much overhead does this have? It it's not negligible, we may want to enable it when expensive checks are enabled.

389

nit: !empty() or size() > 0

391

same as above

This looks very useful. Does it currently report any errors when enabled?

None.

llvm/lib/Passes/StandardInstrumentations.cpp
33

By default I proposed to enable it only for debug version. Do we track compile time for them?
I suggest that we leave this as is with additional comment to enable it for the expansive checks only if found a performance impact.
Basically for every function pass it runs twice over function blocks and edges and allocates linear memory. Should not be very big.

kuhar added inline comments.Sep 1 2020, 10:01 AM
llvm/lib/Passes/StandardInstrumentations.cpp
33

In general, I think we do care about it too. Could you run opt on some bigger .bc file to make sure there's little noticeable difference? I posted some .bc files in https://reviews.llvm.org/D83089.

kuhar added inline comments.Sep 1 2020, 11:19 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
64–65

nit: could you move these just next to the forward-declaration for Module?

yrouban planned changes to this revision.Sep 3 2020, 5:12 AM
yrouban marked 2 inline comments as done.

I will collect performance data and fix for release builds.

llvm/include/llvm/IR/ValueHandle.h
506 ↗(On Diff #289100)

bug: Poisoned is defined only for debug builds. I will define a custom CallbackVH extension.

yrouban updated this revision to Diff 289856.Sep 3 2020, 9:36 PM

Implemented custom BBGuard class instead of PoisoningVH which does not track poisoning in release builds.
Removed DEBUG_TYPE and LLVM_DEBUG uses. Unexpected gaph diffs are printed unconditionally.
Addressed comments.
I will provide performance data soon.

yrouban updated this revision to Diff 289859.Sep 3 2020, 9:40 PM

a minor fix in BBGuard.

kuhar added inline comments.Sep 3 2020, 9:43 PM
llvm/include/llvm/Passes/StandardInstrumentations.h
74

nit: s/struct/class?

76

nit: is this function used?

yrouban updated this revision to Diff 289865.Sep 3 2020, 9:54 PM
yrouban marked 5 inline comments as done.

I see no impact on CPU time with the following timed runs (both with and without assertions):
$ time opt -O3 FILE.bc --disable-output -verify-cfg-preserved=VCP

LLVM_ENABLE_ASSERTIONS=ON:

FILEverify-cfg-preservedRun 1 [s]Run 2 [s]Run 3 [s]Run 4 [s]Run 5 [s]AvgStd/Avg %DeltaAvg%
llvm-as.bcfalse549.010544.270550.010547.620547.360547.6540.40
true548.640547.670547.690548.900548.680548.3160.110.12
sqlite3.bcfalse58.12058.18058.21058.13058.19058.1660.07
true58.23058.12058.13058.24058.21058.1860.100.03
wasm-as.bcfalse168.880169.490168.610169.770168.150168.9800.39
true168.660169.730169.360169.620168.500169.1740.330.11
wasm-opt.bcfalse181.860180.550181.340180.790181.910181.2900.34
true181.240181.620181.180181.780180.640181.2920.240.00

LLVM_ENABLE_ASSERTIONS=OFF:

FILEverify-cfg-preservedRun 1 [s]Run 2 [s]Run 3 [s]Run 4 [s]Run 5 [s]AvgStd/Avg %DeltaAvg%
llvm-as.bcfalse408.650408.730410.100406.650410.110408.8480.35
true408.980408.260408.410409.180408.600408.6860.09-0.04
sqlite3.bcfalse39.72039.77039.79039.72039.78039.7560.08
true39.69039.73039.74039.76039.75039.7340.07-0.06
wasm-as.bcfalse117.270117.460117.450117.410117.250117.3680.09
true117.640117.100117.400117.210117.310117.3320.18-0.03
wasm-opt.bcfalse125.900126.030125.910126.020126.180126.0080.09
true125.940125.930126.100125.980125.920125.9740.06-0.03
kuhar accepted this revision.Sep 8 2020, 7:05 AM

Thanks for the table, this looks good to me now, modulo the two remaining nits. I'd be more comfortable if somebody familiar with the new pass manager reviewed this too.

This revision is now accepted and ready to land.Sep 8 2020, 7:05 AM

Fedor, could you please review the New Pass Manager part of this patch as I addressed your comments?

NewPM use seems to be fine.
A few relatively minor comments that I would like to see addressed.

llvm/include/llvm/Passes/StandardInstrumentations.h
73

Are we ok with adding a generic 'CFG' class directly into llvm namespace by StandardInstrumentations.h?
If it is only for CFGChecker introduced here then either put it into Checker itself or into a nested namespace named appropriately
(e.g. cfg_checker_detail).

If you can see this class being generally useful then it deserves its own header/location.

104–105

It seems that GraphStack and PassStack are always pushed/popped synchronously.
Why dont you keep a single vector of (StringRef, Optional<CFG>) pair to make this couple more obvious?

llvm/lib/Passes/StandardInstrumentations.cpp
481

This seems to be a more new-PM-idiomatic way of checking:

PassPA.allAnalysesInSetPreserved<CFGAnalyses>()
yrouban marked 3 inline comments as done.Sep 10 2020, 1:27 AM
yrouban added inline comments.Sep 10 2020, 1:32 AM
llvm/include/llvm/Passes/StandardInstrumentations.h
73

That is why I tried introducing the class Instrument that would encapsulate all declarations and implementations of PrintIRInstrumentation, PrintPassInstrumentation, TimePassesHandler, OptNoneInstrumentation, PreservedCFGCheckerInstrumentation in one cpp file :)

yrouban updated this revision to Diff 290903.Sep 10 2020, 1:32 AM

Addressed Fedor's comments.

This revision was automatically updated to reflect the committed changes.