After this change, MLIR triggers instrumentation callbacks on the OpToOpAdaptorPass itself, with the module as GetOperation(). A new GetLongName() method is added to Pass to allow formatting an informative name (like OpToOpAdaptorPass[[FunctionalToExecutorConversionPass]]) that is used for dumping. Note that OpToOpAdaptorPass lives in mlir::detail and cannot be referenced in user subclasses of the IR printing config.
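As a sketch of the naming scheme, the long name just wraps the managed pass's name in the adaptor's name. The helper below is hypothetical (in the patch this formatting lives in GetLongName() on Pass); it only illustrates the output format:

```cpp
#include <string>

// Hypothetical helper illustrating the GetLongName() formatting scheme:
// the adaptor's name wraps the managed pass's name in double brackets,
// e.g. OpToOpAdaptorPass[[FunctionalToExecutorConversionPass]].
std::string FormatLongName(const std::string &adaptor_name,
                           const std::string &managed_pass_name) {
  return adaptor_name + "[[" + managed_pass_name + "]]";
}
```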
Before this change, MLIR dumping of OpToOpAdaptor passes was performed only on individual container ops (such as FuncOp). This becomes verbose when we dump DTensor MLIR rewrites of relatively large graphs from real TensorFlow models.
The user can distinguish an OpToOpAdaptor full-pass callback from a per-FuncOp callback by checking whether the op is a module op. This behavior is not as straightforward as I'd like it to be (and I'd like to get some advice from the MLIR maintainers).
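A minimal sketch of that check, with names purely illustrative: in a real PassInstrumentation callback the test would be isa&lt;ModuleOp&gt; on the current operation, but here the op is modeled by its registered name string:

```cpp
#include <string>

// Illustrative only: in real instrumentation code this would be
// isa<ModuleOp>(op) on the operation passed to the callback. A module op
// means the callback fired for the whole OpToOpAdaptorPass invocation;
// anything else (e.g. func.func) is a per-container callback.
bool IsFullAdaptorCallback(const std::string &op_name) {
  return op_name == "builtin.module";
}
```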
We have been using this patch locally to debug DTensor: the change reduced the number of dumps by a few thousand, which translated to significant time savings when the dumps went to a slower remote file system.
Instead of this, what about changing things this way: