This is a story about man against machine (me vs. dyld)...
This is a story about two worlds separated by seemingly arbitrary and capricious differences (Linux and Mac OS)...
This is the greatest love story ever told (MLIR and extensibility)...
Okay in all seriousness this took an inordinate amount of time so I'm going to belabor a little what it took to figure it out. The TL;DR (if you don't enjoy dramatic retellings of bug squashing) is that we can use add_llvm_library with MODULE to make it work on Linux, Mac, and possibly Windows (with PLUGIN_TOOL).
Opening scene (the problem)
The previous patch (D147053) seemingly sometimes worked and sometimes didn't. In particular, on Mac it worked but you had to fudge things - adding -undefined dynamic_lookup to StandalonePlugin got some of the tests to pass but failed others, then removing the statically linked MLIR libs in MLIRStandalone got more to pass but that's hacky^2. In totum it seemed like a really weird asymmetry that implied linking on Mac worked dramatically differently from linking on Linux. It would've been plausible if they were only subtley different but dropping statically linked libs from MLIRStandalone just made it too suspicious.
Hero's journey (the debugging process)
So the suspicion was that at runtime dyld on Mac was doing something different from ld on Linux. On Linux you can use LD_DEBUG_OUTPUT=ldlog LD_DEBUG=all LD_BIND_NOT=1 to see all of the bindings as they happen (i.e., where the code implementing the symbol is fetched from):
binding file libStandalonePlugin.so  to cmake-build-release/bin/mlir-opt : normal symbol `_ZN4mlir19PassInstrumentation17runBeforePipelineESt8optionalINS_13OperationNameEERKNS0_18PipelineParentInfoE' binding file libStandalonePlugin.so  to /home/mlevental/dev_projects/llvm/cmake-build-release/bin/mlir-opt : normal symbol `_ZN4mlir19PassInstrumentation16runAfterPipelineESt8optionalINS_13OperationNameEERKNS0_18PipelineParentInfoE' binding file libStandalonePlugin.so  to ./tools/mlir/test/Examples/standalone/lib/libStandalonePlugin.so : normal symbol `_ZNK4mlir10standalone17StandaloneDialect9parseTypeERNS_16DialectAsmParserE' binding file libStandalonePlugin.so  to ./tools/mlir/test/Examples/standalone/lib/libStandalonePlugin.so : normal symbol `_ZNK4mlir10standalone17StandaloneDialect9printTypeENS_4TypeERNS_17DialectAsmPrinterE'
Inspecting this showed that indeed (as expected) almost all symbols are bound to mlir-opt rather than libStandalonePlugin.so on Linux. On Mac you can do a similar thing with DYLD_PRINT_SEARCHING=1 DYLD_PRINT_BINDINGS=1 DYLD_PRINT_LIBRARIES=1 but it doesn't show all the bindings. Ultimately it took setting breakpoints and printfing to reveal that what was happening was mlir::registerPass was being bound to libStandalonePlugin.dylib instead of mlir-opt and thus two separate static ... passRegistrys were being instantiated. But why???
At some point I got the bright (sarcasm - should've been the first) idea to look at how LLVM plugins work, in particular what linker flags are used; inspecting my build.ninja I found the auspicious bit:
############################################# # Link the shared module unittests/Analysis/InlineOrderPlugin.dylib build unittests/Analysis/InlineOrderPlugin.dylib: ... LINK_FLAGS = -L/opt/homebrew/opt/llvm/lib -Wl,-flat_namespace -Wl,-undefined -Wl,suppress ...
What is -Wl,-flat_namespace you might be wondering?
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.
[dyld] makes all extern symbols interposable when linking with -flat_namespace.
from none other than our very own D119294. These flags are applied on APPLE when linking MODULEs. Et voila - using LLVM's add_llvm_library with MODULE basically worked without a hitch (see inline comments for small hitch).
Maybe we want our own macro instead of add_llvm_library? Or an extension of add_mlir_library? I gave it a shot but you need to thread a few more things than just MODULE.
At some point I had a solution that left symbols to be resolved at runtime in MLIRStandalone for both Linux and Mac (i.e., no static linking at all). On Mac, as already mentioned, this requires -undefined dynamic_lookup> but on Linux I also had to pass -Wl,--unresolved-symbols=ignore-in-object-files and that was very strange because everywhere you read that on Linux, deferring symbol to resolution to runtime is the default behavior. Well it turns out we don't like defaults and force -z,defs. Probably wise but definitely confusing.
Btw, if you want to try just the test that intersects with this patch you can filter lit like this:
-DLLVM_LIT_ARGS="-v --show-unsupported --filter=Standalone.*"