This is an archive of the discontinued LLVM Phabricator instance.

Split headers from implementations in MLIR C API Bazel build.
ClosedPublic

Authored by phawkins on Nov 10 2021, 6:15 AM.

Details

Summary

This allows clients to build, e.g., the Python bindings against the C API headers, without including the C API implementations. This is useful when distributing software as multiple shared libraries.

Diff Detail

Event Timeline

phawkins created this revision.Nov 10 2021, 6:15 AM
phawkins requested review of this revision.Nov 10 2021, 6:15 AM
phawkins updated this revision to Diff 386145.Nov 10 2021, 6:49 AM

Add some missing alwayslink annotations.

GMNGeoffrey added inline comments.Nov 10 2021, 9:28 AM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
320

Could this be a glob with an exclude for each of the things in a separate library? I think that would be easier to maintain

522

Why isn't this one also split?

591–594

Factor this into a variable also?

715

This is causing build failures. There isn't any pybind11 repository configured in the workspace here, so all transitive dependencies on pybind11 need to be tagged "nobuildkite" and "manual". Even if we had pybind11 configured in this WORKSPACE, we've marked everything that has a transitive external dependency as "manual" so a wildcard build can always build everything with no external dependencies. Carrying forward the tags is a bit annoying, but I don't think there's a way in Bazel to have transitive tags or something like that.

720

😢 alwayslink is a scourge. Is it really the only way?

mehdi_amini added inline comments.Nov 10 2021, 9:41 AM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

Same concern here: why is this needed?

phawkins updated this revision to Diff 386209.Nov 10 2021, 9:48 AM
phawkins marked 4 inline comments as done.

Incorporate review comments.

phawkins added inline comments.Nov 10 2021, 9:49 AM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
320

It could be, but note that's not something this PR changes: here we just move the list around. If we want to do that, can we do it in another PR?

522

It's not split because I'm personally not using it. But done.

720

I removed it from this target, where I don't think it's needed. In general it *is* needed on all the libraries that implement the exported C API, because otherwise Bazel will strip the C API out.

GMNGeoffrey added inline comments.
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
320

Separate patch SGTM. I only mentioned it because you were already moving this list around.

720

Why wasn't it needed before? We're using this internally without issue IIUC. We've worked hard in many places to scour this from the codebase, so I don't love seeing it return

@stellaraccident can you weigh in on these build rules?

phawkins added inline comments.Nov 10 2021, 2:23 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

It wasn't needed before, because internally we build monolithic binaries.

Here I'm doing something different: I'm building one shared library that vends the C API (capi.so) and another shared library (a Python extension) that consumes it. Without alwayslink, the C API .so file will not contain the C API: it ends up completely empty because there are no *users* of the API present and everything ends up stripped out. I did not put this here idly.

I note this is exactly how IREE deploys things, although IREE does it with CMake not Bazel.

phawkins added inline comments.Nov 10 2021, 2:42 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

To add to that, from the Bazel docs, the definition of alwayslink is:
"If 1, any binary that depends (directly or indirectly) on this C++ precompiled library will link in all the object files archived in the static library, even if some contain no symbols referenced by the binary. This is useful if your code isn't explicitly called by code in the binary, e.g., if your code registers to receive some callback provided by some service."

That's precisely what we want here, for use in a rule like this:

cc_binary(
    name = "mlir_capi.so",
    deps = [
        "@llvm-project//mlir:MLIRBindingsPythonCAPI",
    ],
    linkshared = 1,
    linkopts = ["-Wl,-soname=mlir_capi.so", "-Wl,-rpath='$$ORIGIN'"],
)

The only alternative I can think of would be to have two copies of the implementation rules, one with alwayslink and one without, or a macro that does that.

GMNGeoffrey added inline comments.Nov 10 2021, 3:23 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

We had a higher-bandwidth discussion in a side channel and agreed that it was best not to expose alwayslink to unsuspecting users. We'll use a macro to create separate targets for the normal library, the headers-only version, and the version with alwayslink. This limits the weirdness to users doing things that are not very well supported in Bazel (but hopefully will be with cc_shared_library)

mehdi_amini added inline comments.Nov 10 2021, 5:12 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

That's precisely what we want here, for use in a rule like this:

I'm not sure I understand this.
Your quote is about "any binary that depends ...", it isn't clear to me that this applies to shared libraries.

For example if I test with this simple BUILD file:

cc_binary(
    name = "test.so",
    deps = [
        ":test_lib",
    ],
    linkshared = 1,
    linkopts = ["-Wl,-soname=mlir_capi.so", "-Wl,-rpath='$$ORIGIN'"],
)

cc_library(
  name = "test_lib",
  srcs = [
    "test.cc"
  ]
)

The test.cc file will be linked into the .so ; the linker invocation will be -shared -o bazel-out/k8-fastbuild/bin/libtest_lib.so -Wl,-S -fuse-ld=gold -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -pass-exit-codes bazel-out/k8-fastbuild/bin/_objs/test_lib/test.pic.o -lstdc++ -lm
And nm on the final .so shows the symbol from test.cc present.

So I still don't know what is the situation you're having that is different here?

phawkins updated this revision to Diff 386369.Nov 10 2021, 5:24 PM

Use macro to generate multiple targets, rather than unconditionally using alwayslink.

Please take a look.

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

This does not match the behavior I am seeing. I ran this:

mkdir myproject
cd myproject
touch WORKSPACE
cat > BUILD.bazel <<EOF
cc_binary(
    name = "test.so",
    deps = [
        ":test_lib",
    ],
    linkshared = 1,
    linkopts = ["-Wl,-soname=mlir_capi.so", "-Wl,-rpath='$$ORIGIN'"],
)

cc_library(
  name = "test_lib",
  srcs = [
    "test.cc"
  ],
  alwayslink=True,
)
EOF

cat > test.cc <<EOF
int veryobvioussymbol(int x) { return x * 2; }
EOF

bazel-4.1.0-linux-x86_64  build :test.so
nm bazel-bin/test.so | grep veryobvious

I see a non-empty grep output if and only if alwayslink=True is set on the cc_library.

mehdi_amini added inline comments.Nov 10 2021, 5:25 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

Actually I looked that the linker invocation of the libtest_lib.so instead of libtest.so... So scratch that Bazel is not pulling in the static archive in linking the shared library here...

phawkins updated this revision to Diff 386372.Nov 10 2021, 5:30 PM

Fixed typo in comment.

GMNGeoffrey accepted this revision.Nov 10 2021, 5:55 PM
GMNGeoffrey added inline comments.
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
587

Not necessary for this patch since the other target already exists, but we should be able to pull these targets together and use the macro here as well, right?

666

I'm confused why this target exists. Are these ones meaningfully a different pattern than the others such that they can't use the same macro?

684

You don't want alwayslink here?

720

Ok so Mehdi you agree that alwayslink is necessary?

utils/bazel/llvm-project-overlay/mlir/build_defs.bzl
34

Nit: I think **kwargs would be standard? Searching inside google **kwargs is a few orders of magnitude more common in bzl files (and an order of magnitude more common in py files)

This revision is now accepted and ready to land.Nov 10 2021, 5:55 PM
mehdi_amini added inline comments.Nov 10 2021, 6:12 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

Yeah I agree (I guess until cc_shared_library?)

Seems like this changed in Bazel at some point: https://docs.bazel.build/versions/main/command-line-reference.html#flag--legacy_whole_archive

I really don't know why they thought it'd be a good design :(

mehdi_amini added inline comments.Nov 10 2021, 6:19 PM
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
720

Actually this is the best discussion place: https://github.com/bazelbuild/bazel/issues/7362#issuecomment-548031232

Seems like users complained the same way I would have, but they didn't really get an answer.

Someone wrote an interesting wrapper here: https://github.com/RobotLocomotion/drake/pull/12262/files ; it seems like it allows to define a shared library and the rule will itself update the dependencies of the shared library to mark them "alwayslink" without having to change them in the build file.

phawkins updated this revision to Diff 386488.Nov 11 2021, 5:50 AM

Use kwargs instead of kw

Please take a look.

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
587

You could, but I'm not convinced it's a good idea in this case. Here we actually do *not* want alwayslink (this is just a normal library: the only important part is which variant of the C API we link to), so the macro is actually not helpful.

666

This exists to bundle together the C API deps needed by the Python bindings. A key point is that the deps are different in all three targets (header-only, deps, deps-for-shared-object), and we never need alwayslink. So the macro doesn't apply.

684

It's not needed. It's sufficient to have it on the cc_library targets that define the symbols we want to make into the final linker command.

utils/bazel/llvm-project-overlay/mlir/build_defs.bzl
34

I think both are pretty common, and my practice is always to prefer to save a few characters. But I don't feel strongly about it; done.

GMNGeoffrey accepted this revision.Nov 11 2021, 8:31 AM

Thanks for addressing the comments here! Sorry for all the back and forth :-) I assume this is ready to land?