HomePhabricator

[Verifier] Parallelize verification and dom checking. NFC.

Authored by lattner on Jun 13 2021, 9:42 PM.

Description

[Verifier] Parallelize verification and dom checking. NFC.

This changes the outer verification loop to not recurse into
IsolatedFromAbove operations - instead return them up to a place
where a parallel for loop can process them all in parallel. This
also changes Dominance checking to happen on IsolatedFromAbove
chunks of the region tree, which makes it easy to fold operation
and dominance verification into a single simple parallel regime.

This speeds up firtool in CIRCT from ~40s to 31s on a large
testcase in -verify-each mode (the default). The .fir parser and
module passes in particular benefit from this - FModule passes
(roughly analogous to function passes) were already running the
verifier in parallel as part of the pass manager. This allows
the whole-module passes to verify their enclosed functions /
FModules in parallel.

-verify-each mode is still faster (26.3s on the same testcase),
but we do expect the verifier to take *some* time.

Differential Revision: https://reviews.llvm.org/D104207

Event Timeline

bondhugula added inline comments.
/mlir/lib/IR/Verifier.cpp
102–109

Why is atomic needed here? You aren't reading from passFailed. Instead:
SmallVector<bool, 512> passFailed(opsW....size(), false) and passFailed[opIdx] = true` if I've read this code (locally) right.

bondhugula added inline comments.Jun 16 2021, 1:37 PM
/mlir/lib/IR/Verifier.cpp
102–109

(fixed format)

Why is atomic needed here? You aren't reading from passFailed. Instead:
SmallVector<bool, 512> passFailed(opsW....size(), false); and passFailed[opIdx] = true if I've read this code (locally) right.

lattner added inline comments.Jun 16 2021, 2:21 PM
/mlir/lib/IR/Verifier.cpp
102–109

You could do that, but passes generally don't fail. A smallvector would require memory allocation and a bunch of overhead in the common case. An atomic is both simpler and faster.

Let me flip this around: what is wrong with an atomic?

102–109

Also, this does read from passFailed, right after the parallelForEach

bondhugula added inline comments.Jun 18 2021, 3:44 AM
/mlir/lib/IR/Verifier.cpp
102–109

I see - given the common case, atomic is clearly better.